Skip to content

[ARM][Clang] Make +nosimd functional for AArch32 Targets #130623

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Apr 15, 2025

Conversation

Stylie777
Copy link
Contributor

@Stylie777 Stylie777 commented Mar 10, 2025

+simd and +nosimd are used to enable or disable NEON Instructions
when compiling for AArch32 Targets. However, up until now, using these
has not been possible. To enable this, these options are mapped to the
relevant LLVM backend option (+neon and -neon) so it can be both
enabled and disabled successfully by the user.

Tests have been added to ensure this behaviour is maintained in the future,
along with updates to existing tests as behaviour has now changed relating
to the use of +simd and +nosimd.

As simd has been mapped within the ARMTargetParser.def, support for this
extension is also added for the --print-support-extensions command when
the target is AArch32. This will print the simd option, along with the
description that relates to the Neon feature. This previously was not
possible as simd did not have a related Feature or Negative Feature.

To make this functional as intended, MVE and MVE.FP now rely on their own
Enum identifier, rather than AEK_SIMD. While SIMD does refer to both
Neon and Helium technologies, in terms of command line options, SIMD relates
to Neon. Helium relates to MVE and MVE.FP. The Enum now reflects this too.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Mar 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 10, 2025

@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Jack Styles (Stylie777)

Changes

+simd and +nosimd are used to emable or disable NEON Instructions
when compiling for AArch32 Targets. However, up until now, using these
has not been possible. To enable this, these options are mapped to the
relevant LLVM backend option (+neon and -neon) so it can be both
enabled and disabled successfully by the user.

Tests have been added to ensure this behaviour is maintained in the future,
along with updates to existing tests as behaviour has now changed relating
to the use of +simd and +nosimd.

As simd has been mapped within the ARMTargetParser.def, support for this
extension is also added for the --print-support-extensions command when
the target is AArch32. This will print the simd option, along with the
description that relates to the Neon feature. This previously was not
possible as simd did not have a related Feature or Negative Feature.

To make this functional as intended, MVE and MVE.FP now rely on their own
Enum identifier, rather than AEK_SIMD. While SIMD does refer to both
Neon and Helium technologies, in terms of command line options, SIMD relates
to Neon. Helium relates to MVE and MVE.FP. The Enum now reflects this too.


Full diff: https://github.com/llvm/llvm-project/pull/130623.diff

9 Files Affected:

  • (modified) clang/test/Driver/print-supported-extensions-arm.c (+1)
  • (modified) clang/test/Preprocessor/arm-target-features.c (+8)
  • (modified) llvm/docs/ReleaseNotes.md (+3)
  • (modified) llvm/include/llvm/TargetParser/ARMTargetParser.def (+8-8)
  • (modified) llvm/include/llvm/TargetParser/ARMTargetParser.h (+1)
  • (modified) llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp (+1-1)
  • (modified) llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp (+1-1)
  • (modified) llvm/lib/TargetParser/ARMTargetParser.cpp (+4)
  • (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+10-11)
diff --git a/clang/test/Driver/print-supported-extensions-arm.c b/clang/test/Driver/print-supported-extensions-arm.c
index 0dc2e9fc69738..407adc5e9c384 100644
--- a/clang/test/Driver/print-supported-extensions-arm.c
+++ b/clang/test/Driver/print-supported-extensions-arm.c
@@ -12,6 +12,7 @@
 // CHECK-NEXT:     dsp                 Supports DSP instructions in ARM and/or Thumb2
 // CHECK-NEXT:     mve                 Support M-Class Vector Extension with integer ops
 // CHECK-NEXT:     mve.fp              Support M-Class Vector Extension with integer and floating ops
+// CHECK-NEXT:     simd                Enable NEON instructions
 // CHECK-NEXT:     fp16                Enable half-precision floating point
 // CHECK-NEXT:     ras                 Enable Reliability, Availability and Serviceability extensions
 // CHECK-NEXT:     fp16fml             Enable full half-precision floating point fml instructions
diff --git a/clang/test/Preprocessor/arm-target-features.c b/clang/test/Preprocessor/arm-target-features.c
index 94dcfc2424bb1..b36dd1765b6de 100644
--- a/clang/test/Preprocessor/arm-target-features.c
+++ b/clang/test/Preprocessor/arm-target-features.c
@@ -1027,3 +1027,11 @@
 // CHECK-R52-NEXT: #define __ARM_VFPV4__ 1
 // CHECK-R52-NOT: #define __ARM_NEON 1
 // CHECK-R52-NOT: #define __ARM_NEON__
+
+// Check that on AArch32 appropriate targets, +nosimd correctly disables NEON instructions.
+// RUN:  %clang -target arm-none-eabi -march=armv8-a+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
+// RUN:  %clang -target arm-none-eabi -mcpu=cortex-r52+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
+// RUN:  %clang -target arm-none-eabi -mcpu=cortex-a57+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
+// CHECK-NOSIMD-NOT: #define __ARM_NEON 1
+// CHECK-NOSIMD-NOT: #define __ARM_NEON_FP 0x6
+// CHECK-NOSIMD-NOT: #define __ARM_NEON__ 1
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index d7a80ae93aa34..43ed754183be7 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -85,6 +85,9 @@ Changes to the AMDGPU Backend
 
 Changes to the ARM Backend
 --------------------------
+* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with
+AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now
+printed when the `--print-supported-extensions` option is used..
 
 Changes to the AVR Backend
 --------------------------
diff --git a/llvm/include/llvm/TargetParser/ARMTargetParser.def b/llvm/include/llvm/TargetParser/ARMTargetParser.def
index 6b96c3e83c8c4..e8837cb89ed29 100644
--- a/llvm/include/llvm/TargetParser/ARMTargetParser.def
+++ b/llvm/include/llvm/TargetParser/ARMTargetParser.def
@@ -224,12 +224,12 @@ ARM_ARCH_EXT_NAME("dotprod", ARM::AEK_DOTPROD, "+dotprod", "-dotprod")
 ARM_ARCH_EXT_NAME("dsp", ARM::AEK_DSP, "+dsp", "-dsp")
 ARM_ARCH_EXT_NAME("fp", ARM::AEK_FP, {}, {})
 ARM_ARCH_EXT_NAME("fp.dp", ARM::AEK_FP_DP, {}, {})
-ARM_ARCH_EXT_NAME("mve", (ARM::AEK_DSP | ARM::AEK_SIMD), "+mve", "-mve")
-ARM_ARCH_EXT_NAME("mve.fp", (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP),
+ARM_ARCH_EXT_NAME("mve", (ARM::AEK_DSP | ARM::AEK_MVE), "+mve", "-mve")
+ARM_ARCH_EXT_NAME("mve.fp", (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP),
                   "+mve.fp", "-mve.fp")
 ARM_ARCH_EXT_NAME("idiv", (ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB), {}, {})
 ARM_ARCH_EXT_NAME("mp", ARM::AEK_MP, {}, {})
-ARM_ARCH_EXT_NAME("simd", ARM::AEK_SIMD, {}, {})
+ARM_ARCH_EXT_NAME("simd", ARM::AEK_SIMD, "+neon", "-neon")
 ARM_ARCH_EXT_NAME("sec", ARM::AEK_SEC, {}, {})
 ARM_ARCH_EXT_NAME("virt", ARM::AEK_VIRT, {}, {})
 ARM_ARCH_EXT_NAME("fp16", ARM::AEK_FP16, "+fullfp16", "-fullfp16")
@@ -334,8 +334,8 @@ ARM_CPU_NAME("cortex-r7", ARMV7R, FK_VFPV3_D16_FP16, false,
              (ARM::AEK_MP | ARM::AEK_HWDIVARM))
 ARM_CPU_NAME("cortex-r8", ARMV7R, FK_VFPV3_D16_FP16, false,
              (ARM::AEK_MP | ARM::AEK_HWDIVARM))
-ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_NONE)
-ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_NONE)
+ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
+ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
 ARM_CPU_NAME("sc300", ARMV7M, FK_NONE, false, ARM::AEK_NONE)
 ARM_CPU_NAME("cortex-m3", ARMV7M, FK_NONE, true, ARM::AEK_NONE)
 ARM_CPU_NAME("cortex-m4", ARMV7EM, FK_FPV4_SP_D16, true, ARM::AEK_NONE)
@@ -345,12 +345,12 @@ ARM_CPU_NAME("cortex-m33", ARMV8MMainline, FK_FPV5_SP_D16, false, ARM::AEK_DSP)
 ARM_CPU_NAME("star-mc1", ARMV8MMainline, FK_FPV5_SP_D16, false, ARM::AEK_DSP)
 ARM_CPU_NAME("cortex-m35p", ARMV8MMainline, FK_FPV5_SP_D16, false, ARM::AEK_DSP)
 ARM_CPU_NAME("cortex-m55", ARMV8_1MMainline, FK_FP_ARMV8_FULLFP16_D16, false,
-             (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP | ARM::AEK_FP16))
+             (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP | ARM::AEK_FP16))
 ARM_CPU_NAME("cortex-m85", ARMV8_1MMainline, FK_FP_ARMV8_FULLFP16_D16, false,
-             (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP | ARM::AEK_FP16 |
+             (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP | ARM::AEK_FP16 |
               ARM::AEK_RAS | ARM::AEK_PACBTI))
 ARM_CPU_NAME("cortex-m52", ARMV8_1MMainline, FK_FP_ARMV8_FULLFP16_D16, false,
-             (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP | ARM::AEK_FP16 |
+             (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP | ARM::AEK_FP16 |
               ARM::AEK_RAS | ARM::AEK_PACBTI))
 ARM_CPU_NAME("cortex-a32", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false, ARM::AEK_CRC)
 ARM_CPU_NAME("cortex-a35", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false, ARM::AEK_CRC)
diff --git a/llvm/include/llvm/TargetParser/ARMTargetParser.h b/llvm/include/llvm/TargetParser/ARMTargetParser.h
index 5dbcfd3d2d693..b2403f42f1b79 100644
--- a/llvm/include/llvm/TargetParser/ARMTargetParser.h
+++ b/llvm/include/llvm/TargetParser/ARMTargetParser.h
@@ -61,6 +61,7 @@ enum ArchExtKind : uint64_t {
   AEK_CDECP6 = 1 << 28,
   AEK_CDECP7 = 1 << 29,
   AEK_PACBTI = 1 << 30,
+  AEK_MVE = 1ULL << 31,
   // Unsupported extensions.
   AEK_OS = 1ULL << 59,
   AEK_IWMMXT = 1ULL << 60,
diff --git a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
index d6a586e1f247b..79b54e6b04330 100644
--- a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
+++ b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
@@ -12971,7 +12971,7 @@ bool ARMAsmParser::enableArchExtFeature(StringRef Name, SMLoc &ExtLoc) {
       {ARM::AEK_CRYPTO,
        {Feature_HasV8Bit},
        {ARM::FeatureCrypto, ARM::FeatureNEON, ARM::FeatureFPARMv8}},
-      {(ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP),
+      {(ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP),
        {Feature_HasV8_1MMainlineBit},
        {ARM::HasMVEFloatOps}},
       {ARM::AEK_FP,
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
index b0fa03a35ec04..632dbebf58f04 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
@@ -248,7 +248,7 @@ void ARMTargetStreamer::emitTargetAttributes(const MCSubtargetInfo &STI) {
         emitFPU(STI.hasFeature(ARM::FeatureFP64) ? ARM::FK_FPV5_D16
                                                  : ARM::FK_FPV5_SP_D16);
         if (STI.hasFeature(ARM::HasMVEFloatOps))
-          emitArchExtension(ARM::AEK_SIMD | ARM::AEK_DSP | ARM::AEK_FP);
+          emitArchExtension(ARM::AEK_MVE | ARM::AEK_DSP | ARM::AEK_FP);
       }
     } else if (STI.hasFeature(ARM::FeatureVFP4_D16_SP))
       emitFPU(STI.hasFeature(ARM::FeatureD32)
diff --git a/llvm/lib/TargetParser/ARMTargetParser.cpp b/llvm/lib/TargetParser/ARMTargetParser.cpp
index 8f9753775c204..a7a895d872668 100644
--- a/llvm/lib/TargetParser/ARMTargetParser.cpp
+++ b/llvm/lib/TargetParser/ARMTargetParser.cpp
@@ -657,6 +657,10 @@ void ARM::PrintSupportedExtensions(StringMap<StringRef> DescMap) {
     // Extensions without a feature cannot be used with -march.
     if (!Ext.Feature.empty()) {
       std::string Description = DescMap[Ext.Name].str();
+      // With SIMD, this links to the NEON feature, so the description should be
+      // taken from here, as SIMD does not exist in TableGen.
+      if (Ext.Name == "simd")
+        Description = DescMap["neon"].str();
       outs() << "    "
              << format(Description.empty() ? "%s\n" : "%-20s%s\n",
                        Ext.Name.str().c_str(), Description.c_str());
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 5d771a1a153f7..2a97d1c80ce8d 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -318,14 +318,14 @@ INSTANTIATE_TEST_SUITE_P(
                                        ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
                                    "7-R"),
         ARMCPUTestParams<uint64_t>("cortex-r52", "armv8-r", "neon-fp-armv8",
-                                   ARM::AEK_NONE | ARM::AEK_CRC | ARM::AEK_MP |
-                                       ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
-                                       ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
+                                   ARM::AEK_CRC | ARM::AEK_MP | ARM::AEK_VIRT |
+                                       ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
+                                       ARM::AEK_DSP | ARM::AEK_SIMD,
                                    "8-R"),
         ARMCPUTestParams<uint64_t>("cortex-r52plus", "armv8-r", "neon-fp-armv8",
-                                   ARM::AEK_NONE | ARM::AEK_CRC | ARM::AEK_MP |
-                                       ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
-                                       ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
+                                   ARM::AEK_CRC | ARM::AEK_MP | ARM::AEK_VIRT |
+                                       ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
+                                       ARM::AEK_DSP | ARM::AEK_SIMD,
                                    "8-R"),
         ARMCPUTestParams<uint64_t>("sc300", "armv7-m", "none",
                                    ARM::AEK_NONE | ARM::AEK_HWDIVTHUMB, "7-M"),
@@ -507,17 +507,17 @@ INSTANTIATE_TEST_SUITE_P(
                                    "8-M.Mainline"),
         ARMCPUTestParams<uint64_t>(
             "cortex-m55", "armv8.1-m.main", "fp-armv8-fullfp16-d16",
-            ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP |
+            ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP |
                 ARM::AEK_RAS | ARM::AEK_LOB | ARM::AEK_FP16,
             "8.1-M.Mainline"),
         ARMCPUTestParams<uint64_t>(
             "cortex-m85", "armv8.1-m.main", "fp-armv8-fullfp16-d16",
-            ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP |
+            ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP |
                 ARM::AEK_RAS | ARM::AEK_LOB | ARM::AEK_FP16 | ARM::AEK_PACBTI,
             "8.1-M.Mainline"),
         ARMCPUTestParams<uint64_t>(
             "cortex-m52", "armv8.1-m.main", "fp-armv8-fullfp16-d16",
-            ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP |
+            ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP |
                 ARM::AEK_RAS | ARM::AEK_LOB | ARM::AEK_FP16 | ARM::AEK_PACBTI,
             "8.1-M.Mainline"),
         ARMCPUTestParams<uint64_t>("iwmmxt", "iwmmxt", "none", ARM::AEK_NONE,
@@ -801,7 +801,7 @@ TEST(TargetParserTest, ARMArchExtFeature) {
                               {"fp", "nofp", nullptr, nullptr},
                               {"idiv", "noidiv", nullptr, nullptr},
                               {"mp", "nomp", nullptr, nullptr},
-                              {"simd", "nosimd", nullptr, nullptr},
+                              {"simd", "nosimd", "+neon", "-neon"},
                               {"sec", "nosec", nullptr, nullptr},
                               {"virt", "novirt", nullptr, nullptr},
                               {"fp16", "nofp16", "+fullfp16", "-fullfp16"},
@@ -1046,7 +1046,6 @@ TEST(TargetParserTest, ARMPrintSupportedExtensions) {
   EXPECT_EQ(std::string::npos, captured.find("invalid"));
   // Should not include anything that lacks a feature name. Checking a few here
   // but not all as if one is hidden correctly the rest should be.
-  EXPECT_EQ(std::string::npos, captured.find("simd"));
   EXPECT_EQ(std::string::npos, captured.find("maverick"));
   EXPECT_EQ(std::string::npos, captured.find("xscale"));
 }

@jthackray
Copy link
Contributor

typo in commit message: s/emable/enable/

@@ -85,6 +85,9 @@ Changes to the AMDGPU Backend

Changes to the ARM Backend
--------------------------
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with
AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now
AArch32 targets, however this will now disable NEON instructions being generated. The `simd` option is also now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -85,6 +85,9 @@ Changes to the AMDGPU Backend

Changes to the ARM Backend
--------------------------
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with
AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now
printed when the `--print-supported-extensions` option is used..
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra trailing period character.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@DavidSpickett
Copy link
Collaborator

used to emable or disable

enable

@Stylie777
Copy link
Contributor Author

The description is fixed. Thanks @DavidSpickett and @jthackray (I will publish a commit to your improvements once I have a green CI so I know all is good).

@Stylie777 Stylie777 force-pushed the nosimd_aarch32_fix branch from 4e50bd3 to 8d28d13 Compare March 11, 2025 09:39
@Stylie777 Stylie777 requested a review from jthackray March 11, 2025 14:44
Copy link
Contributor

@jthackray jthackray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes. LGTM now.

@Stylie777
Copy link
Contributor Author

Thanks @jthackray. The CI Failure seems to be unrelated.

Comment on lines 88 to 90
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with
AArch32 targets, however this will now disable NEON instructions being generated. The `simd` option is
also now printed when the `--print-supported-extensions` option is used.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it should be in the clang release notes, not the llvm part? It looks like it is talking about altering the clang interface via +nosimd and print-supported-extensions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I must have put it here as the change has to be made in LLVM but you are correct, +nosimd is a clang option rather than LLVM. I have updated this.

Comment on lines 337 to 338
ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these two special-cased to include AEK_SIMD, unlike all the other cpus where it comes from FK_NEON_FP_ARMV8? Am I right in saying that removing it would not affect how the command line for -mcpu=cortex-r52 behaves? Or does it change the defaults in some way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support NEON on Cortex-R52 by default, rather than as an optional feature. When +simd/+nosimd was linked to +neon and -neon, I found that once the link between simd and neon was made, it was not passing -target-feature +neon as part of the cc1 command.

Let me investigate and see if we need this change or if it can determine +neon from the FPU, if not we may need to add AEK_SIMD to everything that should support it as it may not just be cortex-r52 that is affected here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, yes this is needed. Because simd is now linked to neon, if it is not explicitly defined for each Architecture/CPU that supports NEON, -target-feature -neon is passed by the compiler, disabling the feature. This is because when it parses the .def file, that CPU/Architecture is told SIMD is not supported as it is not included. I think it was a fluke it worked before, and it picked it up from the FPU in the backend, but the responsibility for determining if Neon is supported now lies with the front end with this change, so we need to explicitly define it. I will provide an update in due course that defines it for the Architecture's and CPU's that support NEON and tests to support it. I will also update the description of the PR to state this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would have expected neon vs non-neon to come from the fpu option / default, under Arm. We (essentially) default to -mfpu=auto but it can get quite complex with all the different ways of specifying it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am posing this a question rather than a statement as I am not sure of the answer, but could LLVM be taking the target features defined as a -target-feature option over the features that come from the fpu? Reason I ask is before this change, -target-feature +neon never appeared for targets when I added -### to the command, but now they are. If neon is not defined for that target, it passes -target-feature -neon because the ID of ARM::AEK_SIMD is not included in the features from the ARMTargetFeatures.def file.

That could explain why we need this change.

@Stylie777 Stylie777 force-pushed the nosimd_aarch32_fix branch from 8d28d13 to ece4287 Compare March 12, 2025 08:47
Copy link
Contributor Author

@Stylie777 Stylie777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments @davemgreen. I need to do some more investigation here as I think there is more going wrong than I initially thought after this change is introduced.

Comment on lines 88 to 90
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with
AArch32 targets, however this will now disable NEON instructions being generated. The `simd` option is
also now printed when the `--print-supported-extensions` option is used.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I must have put it here as the change has to be made in LLVM but you are correct, +nosimd is a clang option rather than LLVM. I have updated this.

Comment on lines 337 to 338
ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support NEON on Cortex-R52 by default, rather than as an optional feature. When +simd/+nosimd was linked to +neon and -neon, I found that once the link between simd and neon was made, it was not passing -target-feature +neon as part of the cc1 command.

Let me investigate and see if we need this change or if it can determine +neon from the FPU, if not we may need to add AEK_SIMD to everything that should support it as it may not just be cortex-r52 that is affected here.

@Stylie777
Copy link
Contributor Author

I have just pushed an update for the Unit Tests that should turn the CI green. This is now ready for re-review @davemgreen

@davemgreen
Copy link
Collaborator

Sorry for the delay, my computer got very slow at building things. - What goes wrong if ARM::AEK_SIMD is removed from the CPU and architecture definitions? If it is needed then there are some other cpu's where it might need to be added too. But I'm not sure what needs it. (Target parsing gets quite complex and isn't nearly as well documented as it needs to be. We could do with something that explains how it is meant to work, at a high level so we can have a coherent design).

@Stylie777
Copy link
Contributor Author

Stylie777 commented Mar 14, 2025

No worries 😀

If we remove ARM::AEK_SIMD the target will not be able to process or generate NEON instructions. I have added it to all ArmV8-a (and beyond) architectures and the Cortex-R52 CPU. This should cover all the Cortex-A and Cortex-R targets that support it by default. As ArmV7-A has NEON as an optional feature, I have not added it here. The only thing I may have missed is if Arm V7-A CPU's Support NEON as a mandatory feature, I would need to check this.

Looking at the TableGen for ArmV7, we include FeatureNEON for ArmV7-a and ArmV7-ve. Is this correct? I know I its an optional feature for V7 but are we supporting it by default? If so the .def will need updating to reflect this.

@davemgreen
Copy link
Collaborator

NEON is never mandatory AFAIU in the architecture (FP too). We might assume it to be present though, as I believe it comes from the default -mfpu. (For example FK_CRYPTO_NEON_FP_ARMV8 from armv8-a).

Using something like this: https://godbolt.org/z/EKEMsaMdW. If I take this patch and remove the ARM::AEK_SIMD from the ARM_ARCH and ARM_CPU_NAME definitions it still seems to do OK. Is there some other reason that AEK_SIMD needs to be added to them? I tried some combinations of -march=..[+[no]fp] and -mfpu=... but they seemed OK and the tests passed. ARM::getFPUFeatures gets used in quite a few places though so something else might be going wrong.

@Stylie777
Copy link
Contributor Author

Ok, thats interesting to know. To develop this, I have been using an assembly file with various AdvSIMD instructions, when ARM::AEK_SIMD is removed from the definitions I have found that it can no longer parse the instructions, giving an error that NEON is required. Once ARM::AEK_SIMD has been added to the definitions this keeps working.

It's entirely possible I have missed a change that needs making to the AsmParser, and this may be a better solution, but I found this change was required for the AsmParser to correctly determine that NEON was supported for that target.

@Stylie777 Stylie777 force-pushed the nosimd_aarch32_fix branch from 33d7771 to 98ae944 Compare April 2, 2025 13:04
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - this looks good.

if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) &&
getARMSubArchVersionNumber(Triple) >= 7) {
FPUKind = llvm::ARM::parseFPU("neon");
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
} else if ((!Generic) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this now be else if (!ForAS || !Generic) {. I wasn't sure why armv7 is special? This should mean that -march=armv8-m.main for example works as it did before, and as far as I can tell the versions with cpus remain the same too. (i.e. clang/test/Driver/armv8.1m.main.s should no longer change).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a change made in #122095 that ensured NEON was enabled for generic CPU's on those Triple's as this was not the case previously. Originally I had it as else if (!Generic) as having !ForAS is what was stopping the Driver from collecting the features from the FPU but I found it was not collecting all expected features for specific targets and tests were failing. Let me go back and try a different approach here to get it to be just (!Generic)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, this should have been an else block as the FPU was always handled so this did change behaviour.

cc1as previously never handled the FPU at all, either in the Driver stage or once it had been called, so this is why clang/test/Driver/armv8.1m.main.s has changed, as previously the FPU's features were missed completely. They would be used in cc1 as that considers the FPU.

// used and tested for VFP and NEON Support

// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null -### 2> %t | FileCheck --check-prefix=CHECK-TARGET-FEATURES < %t %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might need to either pipe &2 into stdout, or cat %t | FileCheck..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 41 to 43
- For ARM targets, when using cc1as, the features included in the selected CPU or
Arch's FPU are now loaded and utilized. If you wish not to use a specific feature,
this will need appending to the command line used.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What, from a user facing point of view, changes with the new patch? My understanding is that we pass the fpu features from the driver->cc1as now as opposed to calulcating them in cc1as, but it doesn't alter the behaviour otherwise? The examples I tried all work as before, but there are a lot of different possibilities so I might be missing one again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there were some cases such as NEON which now are detected when using cc1as that were not before as I am still not 100% sure if cc1as got the features from the FPU once inside cc1as. I could not find anywhere that this was done anyway.

This change could lead to a user needing to disable features if they wish to do so. However, with assembler I don't think that is as important as if this was cc1 as the assembly instructions will already be defined. This entry was more to ensure people were aware incase there was some confusion as to what this change was and its potential impact.

Copy link
Contributor Author

@Stylie777 Stylie777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davemgreen for the re-review. I have made some updated based on your comments.

if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) &&
getARMSubArchVersionNumber(Triple) >= 7) {
FPUKind = llvm::ARM::parseFPU("neon");
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
} else if ((!Generic) ||
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a change made in #122095 that ensured NEON was enabled for generic CPU's on those Triple's as this was not the case previously. Originally I had it as else if (!Generic) as having !ForAS is what was stopping the Driver from collecting the features from the FPU but I found it was not collecting all expected features for specific targets and tests were failing. Let me go back and try a different approach here to get it to be just (!Generic)

if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) &&
getARMSubArchVersionNumber(Triple) >= 7) {
FPUKind = llvm::ARM::parseFPU("neon");
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
} else if ((!Generic) ||
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, this should have been an else block as the FPU was always handled so this did change behaviour.

cc1as previously never handled the FPU at all, either in the Driver stage or once it had been called, so this is why clang/test/Driver/armv8.1m.main.s has changed, as previously the FPU's features were missed completely. They would be used in cc1 as that considers the FPU.

// used and tested for VFP and NEON Support

// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null -### 2> %t | FileCheck --check-prefix=CHECK-TARGET-FEATURES < %t %s
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 41 to 43
- For ARM targets, when using cc1as, the features included in the selected CPU or
Arch's FPU are now loaded and utilized. If you wish not to use a specific feature,
this will need appending to the command line used.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there were some cases such as NEON which now are detected when using cc1as that were not before as I am still not 100% sure if cc1as got the features from the FPU once inside cc1as. I could not find anywhere that this was done anyway.

This change could lead to a user needing to disable features if they wish to do so. However, with assembler I don't think that is as important as if this was cc1 as the assembly instructions will already be defined. This entry was more to ensure people were aware incase there was some confusion as to what this change was and its potential impact.

// activation of NEON for supported targets. The Cortex-R52 will be
// used and tested for VFP and NEON Support

// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires a // REQUIRES: arm-registered-target (like armv8.1.m.main.s has), as this actually does try to compile the code.

Actually compiling code in Clang driver tests would ideally not be done (we should ideally only inspect the generated command lines), but we do seem to have some precedent for it already in armv8.1.m.main.s, and especially for the implicit behaviours that aren't necessarily visible in the command line, I guess there's no other good alternative.

So I guess this kind of test is fine, but add the REQUIRES line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

mstorsjo added a commit to mstorsjo/llvm-project that referenced this pull request Apr 4, 2025
…ts (llvm#122095)"

This reverts commit 8fa0f0e.

This change broke assembling for e.g. "armv7s-apple-darwin" triples,
which should enable VFPv4 by default (and did that before this
change), but after this change, only NEON/VFPv3 were available.

This is being fixed properly in latest git main as part of
llvm#130623 (possibly as a
split out change), but any proper fix here seems to have too
much potential surprises for an existing release branch.
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Apr 11, 2025
…ts (llvm#122095)"

This reverts commit 8fa0f0e.

This change broke assembling for e.g. "armv7s-apple-darwin" triples,
which should enable VFPv4 by default (and did that before this
change), but after this change, only NEON/VFPv3 were available.

This is being fixed properly in latest git main as part of
llvm#130623 (possibly as a
split out change), but any proper fix here seems to have too
much potential surprises for an existing release branch.
Stylie777 added a commit that referenced this pull request Apr 14, 2025
…134612)

Previously, `cc1as` did not consider the Features that can be included
from a target's FPU. This could lead to a situation where assembly files
could not compile as cc1as did not know if a feature was supported.

With this change, all the features for the FPU will be passed to `cc1as`
as `-target-feature` lines. By making this change, it will enable
`+nosimd` to be functional, worked on in #130623, and fix a regression
introduced in 8fa0f0e so
armv7s-apple-darwin targets can utilise VFPv4 correctly.

---------

Co-authored-by: Martin Storsjö <[email protected]>
Previously, the use of MVE or MVE.FP would be defined
by using the `ARM::AEK_SIMD` identifier. SIMD relates to
the Cortex-A and Cortex-R extension that enables NEON
instructions, which is called MVE for Cortex-M. To enable the
linking of `+simd` and `+nosimd` to `+neon` and `-neon`
when using clang, MVE and MVE.FP can now be defined
using a unique enum identifier, rather than using the existing
identifier for SIMD.

This was originally planned to be merged as part of llvm#130296 but
the changes made `+nosimd` an invalid argument, which, while not
having any functionality, was allowed in previous versions of
LLVM. To avoid regressions being introduced, this has been
combined with the fix for `+nosimd` on AArch32.
`+simd` and `+nosimd` are used to enable or disable NEON Instructions
when compiling for AArch32 Targets. However, up until now, using these
has not been possible. To enable this, these options are mapped to the
relevant LLVM backend option (`+neon` and `-neon`) so it can be both
enabled and disabled successfully by the user.

Tests have been added to ensure this behaviour is maintained in the future,
along with updates to existing tests as behaviour has now changed relating
to the use of `+simd` and `+nosimd`.

As `simd` has been mapped within the ARMTargetParser.def, support for this
extension is also added for the `--print-support-extensions` command when
the target is AArch32. This will print the `simd` option, along with the
description that relates to the Neon feature. This previously was not
possible as `simd` did not have a related Feature or Negative Feature.
To ensure NEON is activated for the ArmV8-A cores and
beyond, ARM::AEK_SIMD needs to be added for the
appropriate architectures. Appropriate tests have been
added to support this change.
Previously, FPU features were not collected when forming a list
of features for the Assembler. This also allows NEON to be pulled
from the FPU support rather than having it hardcoded into the
Target Parser Definitions.
@Stylie777 Stylie777 force-pushed the nosimd_aarch32_fix branch from 9690bf6 to 0e0b4b5 Compare April 14, 2025 08:31
@Stylie777
Copy link
Contributor Author

I have rebased this now after #134612 has been merged. This PR is only now looking at enabling +nosimd for ARM targets.

@Stylie777 Stylie777 requested a review from davemgreen April 14, 2025 08:33
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. LGTM

@@ -419,6 +419,9 @@ Bug Fixes to Attribute Support
- No longer crashing on ``__attribute__((align_value(N)))`` during template
instantiation when the function parameter type is not a pointer or reference.
(#GH26612)
- The ``+nosimd`` attribute is now fully supported for AArch32. Previously, this had no effect when being used with
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a comment be added to the arm section too / instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved it to the ARM section.

Copy link
Contributor Author

@Stylie777 Stylie777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davemgreen for the review.

@@ -419,6 +419,9 @@ Bug Fixes to Attribute Support
- No longer crashing on ``__attribute__((align_value(N)))`` during template
instantiation when the function parameter type is not a pointer or reference.
(#GH26612)
- The ``+nosimd`` attribute is now fully supported for AArch32. Previously, this had no effect when being used with
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved it to the ARM section.

@Stylie777 Stylie777 merged commit 06da00a into llvm:main Apr 15, 2025
11 of 12 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 15, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-bootstrap-hwasan running on sanitizer-buildbot11 while building clang,llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/9905

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 87743 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 
FAIL: LLVM :: ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s (53457 of 87743)
******************** TEST 'LLVM :: ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-windows-msvc /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp # RUN: at line 1
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-windows-msvc /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp
not /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-jitlink -noexec /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp 2>&1 | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s # RUN: at line 2
+ not /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-jitlink -noexec /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
56.53s: Clang :: Driver/fsanitize.c
40.75s: Clang :: Preprocessor/riscv-target-features.c
38.17s: Clang :: Driver/arm-cortex-cpus-2.c
37.66s: Clang :: Driver/arm-cortex-cpus-1.c
36.05s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
33.83s: Clang :: OpenMP/target_defaultmap_codegen_01.cpp
32.35s: Clang :: OpenMP/target_update_codegen.cpp
29.51s: Clang :: Preprocessor/arm-target-features.c
29.16s: LLVM :: CodeGen/RISCV/attributes.ll
29.01s: Clang :: Preprocessor/aarch64-target-features.c
27.12s: Clang :: Driver/clang_f_opts.c
24.86s: Clang :: Driver/linux-ld.c
23.92s: Clang :: Preprocessor/predefined-arch-macros.c
23.74s: LLVM :: tools/llvm-reduce/parallel-workitem-kill.ll
23.71s: LLVM :: CodeGen/ARM/build-attributes.ll
23.01s: Clang :: Driver/cl-options.c
21.59s: Clang :: Driver/x86-target-features.c
20.03s: Clang :: CodeGen/AArch64/sve-intrinsics/acle_sve_reinterpret.c
19.17s: Clang :: Analysis/a_flaky_crash.cpp
18.79s: Clang :: Driver/debug-options.c

Step 11 (stage2/hwasan check) failure: stage2/hwasan check (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 87743 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 
FAIL: LLVM :: ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s (53457 of 87743)
******************** TEST 'LLVM :: ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-windows-msvc /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp # RUN: at line 1
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-windows-msvc /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp
not /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-jitlink -noexec /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp 2>&1 | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s # RUN: at line 2
+ not /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-jitlink -noexec /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test/ExecutionEngine/JITLink/x86-64/Output/COFF_directive_alternatename_fail.s.tmp
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/COFF_directive_alternatename_fail.s

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
56.53s: Clang :: Driver/fsanitize.c
40.75s: Clang :: Preprocessor/riscv-target-features.c
38.17s: Clang :: Driver/arm-cortex-cpus-2.c
37.66s: Clang :: Driver/arm-cortex-cpus-1.c
36.05s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
33.83s: Clang :: OpenMP/target_defaultmap_codegen_01.cpp
32.35s: Clang :: OpenMP/target_update_codegen.cpp
29.51s: Clang :: Preprocessor/arm-target-features.c
29.16s: LLVM :: CodeGen/RISCV/attributes.ll
29.01s: Clang :: Preprocessor/aarch64-target-features.c
27.12s: Clang :: Driver/clang_f_opts.c
24.86s: Clang :: Driver/linux-ld.c
23.92s: Clang :: Preprocessor/predefined-arch-macros.c
23.74s: LLVM :: tools/llvm-reduce/parallel-workitem-kill.ll
23.71s: LLVM :: CodeGen/ARM/build-attributes.ll
23.01s: Clang :: Driver/cl-options.c
21.59s: Clang :: Driver/x86-target-features.c
20.03s: Clang :: CodeGen/AArch64/sve-intrinsics/acle_sve_reinterpret.c
19.17s: Clang :: Analysis/a_flaky_crash.cpp
18.79s: Clang :: Driver/debug-options.c


var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…lvm#134612)

Previously, `cc1as` did not consider the Features that can be included
from a target's FPU. This could lead to a situation where assembly files
could not compile as cc1as did not know if a feature was supported.

With this change, all the features for the FPU will be passed to `cc1as`
as `-target-feature` lines. By making this change, it will enable
`+nosimd` to be functional, worked on in llvm#130623, and fix a regression
introduced in 8fa0f0e so
armv7s-apple-darwin targets can utilise VFPv4 correctly.

---------

Co-authored-by: Martin Storsjö <[email protected]>
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
`+simd` and `+nosimd` are used to enable or disable NEON Instructions
when compiling for ARM Targets. However, up until now, using these
has not been possible. To enable this, these options are mapped to the
relevant LLVM backend option (`+neon` and `-neon`) so it can be both
enabled and disabled successfully by the user.

Tests have been added to ensure this behaviour is maintained in the
future, along with updates to existing tests as behaviour has now changed
relating to the use of `+simd` and `+nosimd`.

As `simd` has been mapped within the ARMTargetParser.def, support for
this extension is also added for the `--print-support-extensions` command
when the target is AArch32. This will print the `simd` option, along with the
description that relates to the Neon feature. This previously was not
possible as `simd` did not have a related Feature or Negative Feature.

To make this functional as intended, MVE and MVE.FP now rely on their
own Enum identifier, rather than `AEK_SIMD`. While SIMD does refer to
both Neon and Helium technologies, in terms of command line options,
SIMD relates to Neon. Helium relates to MVE and MVE.FP. The Enum
now reflects this too.
Stylie777 added a commit to Stylie777/llvm-project that referenced this pull request Apr 28, 2025
In llvm#130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEONS is disabled in
the command line, it did not disable features that have NEON as
a requirement, such as Crypto or AES. To achieve this, clang will
now check if SIMD has been disabled when using a NEON supported
FPU. If this is the case, it will disable all features that depend
on NEON.

While working on this Patch, I spotted that for features that rely
on NEON, when used, do not enable NEON in the Driver. This meant
that when using AES for example, NEON would not be activated. NEON
is required when using features such as AES, so this is now enabled
when using such features.
Stylie777 added a commit that referenced this pull request Apr 29, 2025
In #130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).

This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
gizmondo pushed a commit to gizmondo/llvm-project that referenced this pull request Apr 29, 2025
)

In llvm#130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).

This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
)

In llvm#130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).

This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
)

In llvm#130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).

This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
)

In llvm#130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).

This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
)

In llvm#130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).

This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
)

In llvm#130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).

This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 15, 2025
…arwin targets (#122095)"

This reverts commit 8fa0f0e.

This change broke assembling for e.g. "armv7s-apple-darwin" triples,
which should enable VFPv4 by default (and did that before this
change), but after this change, only NEON/VFPv3 were available.

This is being fixed properly in latest git main as part of
llvm/llvm-project#130623 (possibly as a
split out change), but any proper fix here seems to have too
much potential surprises for an existing release branch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants