Skip to content

[clang] [ARM] Explicitly enable NEON for Windows/Darwin targets #122095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 16, 2025

Conversation

mstorsjo
Copy link
Member

@mstorsjo mstorsjo commented Jan 8, 2025

Upstream LLVM implicitly enables NEON for any ARMv7 target.

Many platform ABIs with an ARMv7 baseline also include NEON in that, this is the case on e.g. Windows and iOS. On Linux, however, things are not quite as clearly defined. Some distributions target an ARMv7 baseline without NEON available (this is the case for e.g. Debian/Ubuntu for the "armhf" architecture).

To achieve this, Debian/Ubuntu patch LLVM downstream to make ARMv7 only implicitly enable VPFv3-D16, not NEON - see [1].

That patch has the (unintended) effect that NEON no longer is available by default for Windows/ARMv7 and iOS/ARMv7.

In practice, when compiling C for Windows/ARMv7, NEON actually still is available, but not when compiling assembly files. This is due to ARM::getARMCPUForArch (in
llvm/lib/TargetParser/ARMTargetParser.cpp) returning "cortex-a9" for Windows. This difference, between C and assembly, is due to how getARMTargetCPU is called in getARMTargetFeatures (in clang/lib/Driver/ToolChains/Arch/ARM.cpp) to get defaults, only when ForAS is not set - see [2].

There is an existing case in getARMTargetFeatures, for Android, which explicitly enables NEON when targeting ARM >= v7. As Windows and iOS have NEON as part of their ABI baseline just like Android does these days (see [3] for where this default was added for Android), add the implicit default in a similar way.

However, first do the general lookup of getARMTargetCPU (unless ForAS); this makes sure that we get the same default CPU as before ("cortex-a9" for Windows and "swift" for the "armv7s" architecture on Darwin).

[1] https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/19/debian/patches/clang-arm-default-vfp3-on-armv7a.patch?ref_type=heads
[2] b8baa2a
[3] d0fbef9

Upstream LLVM implicitly enables NEON for any ARMv7 target.

Many platform ABIs with an ARMv7 baseline also include NEON in that,
this is the case on e.g. Windows and iOS. On Linux, however, things
are not quite as clearly defined. Some distributions target an ARMv7
baseline without NEON available (this is the case for e.g.
Debian/Ubuntu for the "armhf" architecture).

To achieve this, Debian/Ubuntu patch LLVM downstream to make
ARMv7 only implicitly enable VPFv3-D16, not NEON - see [1].

That patch has the (unintended) effect that NEON no longer is
available by default for Windows/ARMv7 and iOS/ARMv7.

In practice, when compiling C for Windows/ARMv7, NEON actually
still is available, but not when compiling assembly files.
This is due to ARM::getARMCPUForArch (in
llvm/lib/TargetParser/ARMTargetParser.cpp) returning "cortex-a9"
for Windows. This difference, between C and assembly, is due to
how getARMTargetCPU is called in getARMTargetFeatures (in
clang/lib/Driver/ToolChains/Arch/ARM.cpp) to get defaults, only
when ForAS is not set - see [2].

There is an existing case in getARMTargetFeatures, for Android,
which explicitly enables NEON when targeting ARM >= v7. As
Windows and iOS have NEON as part of their ABI baseline just like
Android does these days (see [3] for where this default was
added for Android), add the implicit default in a similar way.

However, first do the general lookup of getARMTargetCPU (unless
ForAS); this makes sure that we get the same default CPU
as before ("cortex-a9" for Windows and "swift" for the "armv7s"
architecture on Darwin).

[1] https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/19/debian/patches/clang-arm-default-vfp3-on-armv7a.patch?ref_type=heads
[2] llvm@b8baa2a
[3] llvm@d0fbef9
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jan 8, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 8, 2025

@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Martin Storsjö (mstorsjo)

Changes

Upstream LLVM implicitly enables NEON for any ARMv7 target.

Many platform ABIs with an ARMv7 baseline also include NEON in that, this is the case on e.g. Windows and iOS. On Linux, however, things are not quite as clearly defined. Some distributions target an ARMv7 baseline without NEON available (this is the case for e.g. Debian/Ubuntu for the "armhf" architecture).

To achieve this, Debian/Ubuntu patch LLVM downstream to make ARMv7 only implicitly enable VPFv3-D16, not NEON - see [1].

That patch has the (unintended) effect that NEON no longer is available by default for Windows/ARMv7 and iOS/ARMv7.

In practice, when compiling C for Windows/ARMv7, NEON actually still is available, but not when compiling assembly files. This is due to ARM::getARMCPUForArch (in
llvm/lib/TargetParser/ARMTargetParser.cpp) returning "cortex-a9" for Windows. This difference, between C and assembly, is due to how getARMTargetCPU is called in getARMTargetFeatures (in clang/lib/Driver/ToolChains/Arch/ARM.cpp) to get defaults, only when ForAS is not set - see [2].

There is an existing case in getARMTargetFeatures, for Android, which explicitly enables NEON when targeting ARM >= v7. As Windows and iOS have NEON as part of their ABI baseline just like Android does these days (see [3] for where this default was added for Android), add the implicit default in a similar way.

However, first do the general lookup of getARMTargetCPU (unless ForAS); this makes sure that we get the same default CPU as before ("cortex-a9" for Windows and "swift" for the "armv7s" architecture on Darwin).

[1] https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/19/debian/patches/clang-arm-default-vfp3-on-armv7a.patch?ref_type=heads
[2] b8baa2a
[3] d0fbef9


Full diff: https://github.com/llvm/llvm-project/pull/122095.diff

3 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Arch/ARM.cpp (+8)
  • (modified) clang/test/Driver/arm-mfpu.c (+4-2)
  • (modified) clang/test/Preprocessor/arm-target-features.c (+27)
diff --git a/clang/lib/Driver/ToolChains/Arch/ARM.cpp b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
index b8181ce6dc012a..9e6fe4f60bb65d 100644
--- a/clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -659,13 +659,21 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
         CPUArgFPUKind != llvm::ARM::FK_INVALID ? CPUArgFPUKind : ArchArgFPUKind;
     (void)llvm::ARM::getFPUFeatures(FPUKind, Features);
   } else {
+    bool Generic = true;
     if (!ForAS) {
       std::string CPU = arm::getARMTargetCPU(CPUName, ArchName, Triple);
+      if (CPU != "generic")
+        Generic = false;
       llvm::ARM::ArchKind ArchKind =
           arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
       FPUKind = llvm::ARM::getDefaultFPU(CPU, ArchKind);
       (void)llvm::ARM::getFPUFeatures(FPUKind, Features);
     }
+    if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) &&
+        getARMSubArchVersionNumber(Triple) >= 7) {
+      FPUKind = llvm::ARM::parseFPU("neon");
+      (void)llvm::ARM::getFPUFeatures(FPUKind, Features);
+    }
   }
 
   // Now we've finished accumulating features from arch, cpu and fpu,
diff --git a/clang/test/Driver/arm-mfpu.c b/clang/test/Driver/arm-mfpu.c
index 1b174be388124d..669e7975675fa5 100644
--- a/clang/test/Driver/arm-mfpu.c
+++ b/clang/test/Driver/arm-mfpu.c
@@ -356,8 +356,10 @@
 // CHECK-HF-DAG: "-target-cpu" "arm1176jzf-s"
 
 // RUN: %clang -target armv7-apple-darwin -x assembler %s -### -c 2>&1 \
-// RUN:   | FileCheck --check-prefix=ASM %s
-// ASM-NOT: -target-feature
+// RUN:   | FileCheck --check-prefix=ASM-NEON %s
+// RUN: %clang -target armv7-windows -x assembler %s -### -c 2>&1 \
+// RUN:   | FileCheck --check-prefix=ASM-NEON %s
+// ASM-NEON: "-target-feature" "+neon"
 
 // RUN: %clang -target armv8-linux-gnueabi -mfloat-abi=soft -mfpu=none %s -### -c 2>&1 \
 // RUN:   | FileCheck --check-prefix=CHECK-SOFT-ABI-FP %s
diff --git a/clang/test/Preprocessor/arm-target-features.c b/clang/test/Preprocessor/arm-target-features.c
index 2999ee0d9e4d80..95ca7d0cbc3c2a 100644
--- a/clang/test/Preprocessor/arm-target-features.c
+++ b/clang/test/Preprocessor/arm-target-features.c
@@ -132,6 +132,30 @@
 // CHECK-V7VE-DEFAULT-ABI-SOFT: #define __ARM_ARCH_EXT_IDIV__ 1
 // CHECK-V7VE-DEFAULT-ABI-SOFT: #define __ARM_FP 0xc
 
+// RUN: %clang -target x86_64-apple-macosx10.10 -arch armv7 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-DARWIN-V7 %s
+// CHECK-DARWIN-V7: #define __ARMEL__ 1
+// CHECK-DARWIN-V7: #define __ARM_ARCH 7
+// CHECK-DARWIN-V7: #define __ARM_ARCH_7A__ 1
+// CHECK-DARWIN-V7-NOT: __ARM_FEATURE_CRC32
+// CHECK-DARWIN-V7-NOT: __ARM_FEATURE_NUMERIC_MAXMIN
+// CHECK-DARWIN-V7-NOT: __ARM_FEATURE_DIRECTED_ROUNDING
+// CHECK-DARWIN-V7: #define __ARM_FP 0xc
+// CHECK-DARWIN-V7: #define __ARM_NEON 1
+// CHECK-DARWIN-V7: #define __ARM_NEON_FP 0x4
+// CHECK-DARWIN-V7: #define __ARM_NEON__ 1
+
+// RUN: %clang -target armv7-windows -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-WINDOWS-V7 %s
+// CHECK-WINDOWS-V7: #define __ARMEL__ 1
+// CHECK-WINDOWS-V7: #define __ARM_ARCH 7
+// CHECK-WINDOWS-V7: #define __ARM_ARCH_7A__ 1
+// CHECK-WINDOWS-V7-NOT: __ARM_FEATURE_CRC32
+// CHECK-WINDOWS-V7-NOT: __ARM_FEATURE_NUMERIC_MAXMIN
+// CHECK-WINDOWS-V7-NOT: __ARM_FEATURE_DIRECTED_ROUNDING
+// CHECK-WINDOWS-V7: #define __ARM_FP 0xe
+// CHECK-WINDOWS-V7: #define __ARM_NEON 1
+// CHECK-WINDOWS-V7: #define __ARM_NEON_FP 0x6
+// CHECK-WINDOWS-V7: #define __ARM_NEON__ 1
+
 // RUN: %clang -target x86_64-apple-macosx10.10 -arch armv7s -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V7S %s
 // CHECK-V7S: #define __ARMEL__ 1
 // CHECK-V7S: #define __ARM_ARCH 7
@@ -140,6 +164,9 @@
 // CHECK-V7S-NOT: __ARM_FEATURE_NUMERIC_MAXMIN
 // CHECK-V7S-NOT: __ARM_FEATURE_DIRECTED_ROUNDING
 // CHECK-V7S: #define __ARM_FP 0xe
+// CHECK-V7S: #define __ARM_NEON 1
+// CHECK-V7S: #define __ARM_NEON_FP 0x6
+// CHECK-V7S: #define __ARM_NEON__ 1
 
 // RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=soft -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s
 // RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=softfp -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s

@mstorsjo
Copy link
Member Author

mstorsjo commented Jan 8, 2025

Admittedly, I'm not sure if this patch really is relevant for upstream or not, as it doesn't really affect observable behaviour much at all. The main observable difference on the clang driver level is that when compiling assembly, -target-feature arguments are now passed on the -cc1as command line while there previously weren't passed any (as the implicit default was correct); this is the same as the added tests in d0fbef9. However, if https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/19/debian/patches/clang-arm-default-vfp3-on-armv7a.patch is applied, like in downstream Debian/Ubuntu distributions, this patch restores functionality for Windows/Darwin targets.

If it isn't seemed relevant/suitable for upstream, I'd request @sylvestre to pick it up and integrate into that referenced patch for future Debian/Ubuntu releases. (But I'm not sure if there are other Linux Clang downstream distributions that apply similar patches?)

@DavidSpickett
Copy link
Collaborator

In practice, when compiling C for Windows/ARMv7, NEON actually still is available, but not when compiling assembly files.

So the "bug" if we're going to call it that, is that C code can produce Neon but assembly files compiled in the same way cannot use Neon instructions?

Or has it always worked for C and assembly, and this only changes how the Neon feature is communicated?

@DavidSpickett
Copy link
Collaborator

Or is it that this change allows other downstreams to patch the Linux issue in a way that does not break Windows and iOS in the process?

If I'm using clang on Debian, that's where I would find this C can use Neon but assembly cannot problem.

@mstorsjo
Copy link
Member Author

mstorsjo commented Jan 8, 2025

In practice, when compiling C for Windows/ARMv7, NEON actually still is available, but not when compiling assembly files.

So the "bug" if we're going to call it that, is that C code can produce Neon but assembly files compiled in the same way cannot use Neon instructions?

Or has it always worked for C and assembly, and this only changes how the Neon feature is communicated?

With upstream Clang, both C and assembly files can use Neon, for both Windows/armv7 and iOS/armv7 targets. With the Debian/Ubuntu patched Clang, Neon only works in C, for Windows/armv7, but not in assembly and not for iOS/armv7 in either form.

@mstorsjo
Copy link
Member Author

mstorsjo commented Jan 8, 2025

Or is it that this change allows other downstreams to patch the Linux issue in a way that does not break Windows and iOS in the process?

Exactly - this makes it more explicit upstream, to make it easier to tweak for downstreams for Linux targets, without breaking the other OS targets.

If I'm using clang on Debian, that's where I would find this C can use Neon but assembly cannot problem.

Yes, exactly.

On Ubuntu 24.04:

$ cat neon.s
vadd.i8 d0, d0, d0
$ cat neon.c
void add(void) {
  __asm__ __volatile__("vadd.i8 d0, d0, d0");
}
$ which clang
/usr/bin/clang
$ clang --version
Ubuntu clang version 18.1.3 (1ubuntu1)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang -target armv7-windows -c neon.c # succeeds
$ clang -target armv7-windows -c neon.s
neon.s:1:1: error: invalid instruction, any one of the following would fix this:
vadd.i8 d0, d0, d0
^
neon.s:1:1: note: instruction requires: NEON
vadd.i8 d0, d0, d0
^
neon.s:1:5: note: invalid operand for instruction
vadd.i8 d0, d0, d0
    ^
$ clang -target armv7-apple-darwin -c neon.c
neon.c:2:23: error: invalid instruction, any one of the following would fix this:
    2 |         __asm__ __volatile__("vadd.i8 d0, d0, d0");
      |                              ^
<inline asm>:1:2: note: instantiated into assembly here
    1 |         vadd.i8 d0, d0, d0
      |         ^
neon.c:2:23: note: instruction requires: NEON
    2 |         __asm__ __volatile__("vadd.i8 d0, d0, d0");
      |                              ^
<inline asm>:1:2: note: instantiated into assembly here
    1 |         vadd.i8 d0, d0, d0
      |         ^
neon.c:2:23: note: invalid operand for instruction
    2 |         __asm__ __volatile__("vadd.i8 d0, d0, d0");
      |                              ^
<inline asm>:1:6: note: instantiated into assembly here
    1 |         vadd.i8 d0, d0, d0
      |             ^
1 error generated.
$ clang -target armv7-apple-darwin -c neon.s
neon.s:1:1: error: invalid instruction, any one of the following would fix this:
vadd.i8 d0, d0, d0
^
neon.s:1:1: note: instruction requires: NEON
vadd.i8 d0, d0, d0
^
neon.s:1:5: note: invalid operand for instruction
vadd.i8 d0, d0, d0
    ^

This obviously is a downstream issue - but I'm trying to make it more explicit upstream, to make it easier for the downstreams that do patch it.

@DavidSpickett
Copy link
Collaborator

Setting default features for a platform based on what we happened to choose (likely with a very thin justification) to be the default CPU sounds fragile to me, so I think this patch is a good step forward. Even if there weren't this downstream patch interacting with it, I think it would be a good change to make.

This obviously is a downstream issue - but I'm trying to make it more explicit upstream, to make it easier for the downstreams that do patch it.

I wouldn't have expected the Debian patch to have these consequences either, so I agree with doing this.

@mstorsjo
Copy link
Member Author

mstorsjo commented Jan 8, 2025

Setting default features for a platform based on what we happened to choose (likely with a very thin justification) to be the default CPU sounds fragile to me, so I think this patch is a good step forward. Even if there weren't this downstream patch interacting with it, I think it would be a good change to make.

Indeed. And it follows in the same style as d0fbef9, which I in retrospect feel would have had very little tangible effect. CC @DanAlbert for that one, to potentially fill in more details about the reasoning around it.

@DanAlbert
Copy link
Member

It's over 4 years old so unfortunately I'm probably not much help beyond what's already written on the old review/commit. At the time we were still using GAS though, so perhaps that made it more meaningful? If vfpv3 was already the default FPU for armv7 at the time then it's probable that I just misunderstood.

if (!ForAS) {
std::string CPU = arm::getARMTargetCPU(CPUName, ArchName, Triple);
if (CPU != "generic")
Generic = false;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, I do feel that this logic where we explicitly look for "generic" is a bit kludgey. It would be slightly neater to just stick this into the existing Android case above (at lines 650-655) - but then the default-neon case overrides arch/os default CPUs from getLLVMArchKindForARM, which shows up as different output for %clang -target x86_64-apple-macosx10.10 -arch armv7s in arm-target-features.c below (where the default cpu for armv7s includes fp16 support).

I'm open to opinions on what's the least ugly compromise here. :-)

@mstorsjo
Copy link
Member Author

Ping - anyone want to give a proper review of this one? Both for the concept, but I'm also open to suggestions for making the actual implementation slightly less ugly.

llvm::ARM::ArchKind ArchKind =
arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
FPUKind = llvm::ARM::getDefaultFPU(CPU, ArchKind);
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
}
if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the logic for the "generic" check is that clang -target windows will be using cpu generic, fine. If you were specifying the CPU of a real target device, then presumably that would have Neon and Clang would know that anyway.

Only if you're doing something strange with a CPU that's never had iOS or Windows run on it, only then does this code not apply. But given that it's a strange use case, it's best to leave it alone and just take the user's intent literally.

Correct?

Also I note that Android does not do this generic CPU check. I don't think making it do that should be part of this PR, and so, while I think this code is spaghetti, I wouldn't try to merge them here. Probably wouldn't break anything, but probably isn't zero chance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost; clang -target armv7-apple-darwin will return "generic" above, and we'll need to add in NEON explicitly to it. However armv7-windows will get a default CPU of cortex-a9 which has NEON (and more). This case isn't necessarily something we need to preserve much about, but a case of armv7s-apple-darwin does get hit by one of the testcases, and armv7s does also get a specific CPU (swift) which also has got NEON and FP16 - and the distinction from that is picked up by an existing testcase.

So only for the cases where arm::getARMTargetCPU() doesn't return a specific CPU for this arch/OS target combo, we hit the "generic" case, where we want to add in reasonable defaults.

Yes, the fact that Android has its own check further above, means that it doesn't hit the arm::getARMTargetCPU() case, and doesn't pick up defaults from there (where it can add more core specific things like FP16 etc) - and indeed, I'm trying to avoid changing anything wrt that, even if it probably could work with this new codepath as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mstorsjo, this breaks Darwin when compiling assembly with armv7s which is neon-vfp4. vfp4 instructions (e.g. vfma) are no longer recognized because you effectively downgraded neon-vfp4 to neon. When ForAS you assume a "generic" CPU unconditionally rather than "swift".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @anemet, sorry about this! While I did test some combinatons with armv7s (as exposed by other testcases), I clearly missed this case.

I tried poking around for different ways of fixing this, but they all more or less amount to the fix as is currently being done in #130623, to remove the ForAS check here. (I also felt that that condition was problematic here, but I was afraid that changing that would be a bigger policy change.)

What do you (and @davemgreen, @DavidSpickett an @Stylie777) think here, what's the best course of action for the current 20.x release branch? Should we split out the change to remove the ForAS condition from the rest of the changes in #130623, making that a smaller separate step which can be backported?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see folks being very happy about backporting anything that would add a potentially breaking change, even if it does fix another issue in the process.

If you were to split out the ForAS change, would that be targeted enough to fix @anemet 's issue, or would that backport be just as impactful as #134366 ?

(we can more likely justify a "worse" fix on 20.x given that main will get a comprehensive fix)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+    bool Generic = "generic";

Did you try this as bool Generic = CPU == "generic"? I am not 100% what the behaviour as it is in the diff would work, I feel that would set Generic to always be true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, that was a typo as I recreated the diff I used for testing (later I did find the one I actually did test with); yes I did use bool Generic = CPU == "generic"; for my testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, thought that might be the case 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the minimal fix in #134366 isn't really backportable, I'm not sure if I see many other reasonable ways to fix this breakage, so then we probably should revert this change for the 20.x branch. If we can land #130623 before 21.x (or at least the split out #134366) then both this and the armv7s/vfpv4 issue should be fixed for 21.x. And this PR in itself only attempted to fix things for downstream-patched Clang, so it's not strictly essential anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That PR is exactly the split out form of the targeted fix for @anemet's issue - but it does change whether FPU features are available in the assembler overall, so it is a bit of a potentially breaking change.

Right, my mistake.

I agree with the idea of reverting this on 20.x.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that:

  • This PR does not conflict with what the Debian patch does.
  • It in fact will fix a bug in the Debian build.
  • It makes existing implicit behaviour explicit.
  • This does not change the behaviour of an unmodified Clang from a user's perspective, therefore we do not need to release note this.

On that basis, LGTM.

@mstorsjo mstorsjo merged commit 8fa0f0e into llvm:main Jan 16, 2025
12 checks passed
@mstorsjo mstorsjo deleted the armv7-implicit-neon branch January 16, 2025 20:49
mstorsjo added a commit to mstorsjo/llvm-project that referenced this pull request Apr 4, 2025
…ts (llvm#122095)"

This reverts commit 8fa0f0e.

This change broke assembling for e.g. "armv7s-apple-darwin" triples,
which should enable VFPv4 by default (and did that before this
change), but after this change, only NEON/VFPv3 were available.

This is being fixed properly in latest git main as part of
llvm#130623 (possibly as a
split out change), but any proper fix here seems to have too
much potential surprises for an existing release branch.
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Apr 11, 2025
…ts (llvm#122095)"

This reverts commit 8fa0f0e.

This change broke assembling for e.g. "armv7s-apple-darwin" triples,
which should enable VFPv4 by default (and did that before this
change), but after this change, only NEON/VFPv3 were available.

This is being fixed properly in latest git main as part of
llvm#130623 (possibly as a
split out change), but any proper fix here seems to have too
much potential surprises for an existing release branch.
curmudg-eon added a commit to curmudg-eon/llvm-project that referenced this pull request Apr 18, 2025
…ts (llvm#122095)"

This reverts commit 8fa0f0e.

We're reverting this change as it breaks compatibility with Darwin.
More info: llvm#122095 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants