-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[ARM][Clang] Make +nosimd
functional for AArch32 Targets
#130623
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-arm @llvm/pr-subscribers-clang Author: Jack Styles (Stylie777) Changes
Tests have been added to ensure this behaviour is maintained in the future, As To make this functional as intended, MVE and MVE.FP now rely on their own Full diff: https://github.com/llvm/llvm-project/pull/130623.diff 9 Files Affected:
diff --git a/clang/test/Driver/print-supported-extensions-arm.c b/clang/test/Driver/print-supported-extensions-arm.c
index 0dc2e9fc69738..407adc5e9c384 100644
--- a/clang/test/Driver/print-supported-extensions-arm.c
+++ b/clang/test/Driver/print-supported-extensions-arm.c
@@ -12,6 +12,7 @@
// CHECK-NEXT: dsp Supports DSP instructions in ARM and/or Thumb2
// CHECK-NEXT: mve Support M-Class Vector Extension with integer ops
// CHECK-NEXT: mve.fp Support M-Class Vector Extension with integer and floating ops
+// CHECK-NEXT: simd Enable NEON instructions
// CHECK-NEXT: fp16 Enable half-precision floating point
// CHECK-NEXT: ras Enable Reliability, Availability and Serviceability extensions
// CHECK-NEXT: fp16fml Enable full half-precision floating point fml instructions
diff --git a/clang/test/Preprocessor/arm-target-features.c b/clang/test/Preprocessor/arm-target-features.c
index 94dcfc2424bb1..b36dd1765b6de 100644
--- a/clang/test/Preprocessor/arm-target-features.c
+++ b/clang/test/Preprocessor/arm-target-features.c
@@ -1027,3 +1027,11 @@
// CHECK-R52-NEXT: #define __ARM_VFPV4__ 1
// CHECK-R52-NOT: #define __ARM_NEON 1
// CHECK-R52-NOT: #define __ARM_NEON__
+
+// Check that on AArch32 appropriate targets, +nosimd correctly disables NEON instructions.
+// RUN: %clang -target arm-none-eabi -march=armv8-a+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
+// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
+// RUN: %clang -target arm-none-eabi -mcpu=cortex-a57+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
+// CHECK-NOSIMD-NOT: #define __ARM_NEON 1
+// CHECK-NOSIMD-NOT: #define __ARM_NEON_FP 0x6
+// CHECK-NOSIMD-NOT: #define __ARM_NEON__ 1
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index d7a80ae93aa34..43ed754183be7 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -85,6 +85,9 @@ Changes to the AMDGPU Backend
Changes to the ARM Backend
--------------------------
+* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with
+AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now
+printed when the `--print-supported-extensions` option is used..
Changes to the AVR Backend
--------------------------
diff --git a/llvm/include/llvm/TargetParser/ARMTargetParser.def b/llvm/include/llvm/TargetParser/ARMTargetParser.def
index 6b96c3e83c8c4..e8837cb89ed29 100644
--- a/llvm/include/llvm/TargetParser/ARMTargetParser.def
+++ b/llvm/include/llvm/TargetParser/ARMTargetParser.def
@@ -224,12 +224,12 @@ ARM_ARCH_EXT_NAME("dotprod", ARM::AEK_DOTPROD, "+dotprod", "-dotprod")
ARM_ARCH_EXT_NAME("dsp", ARM::AEK_DSP, "+dsp", "-dsp")
ARM_ARCH_EXT_NAME("fp", ARM::AEK_FP, {}, {})
ARM_ARCH_EXT_NAME("fp.dp", ARM::AEK_FP_DP, {}, {})
-ARM_ARCH_EXT_NAME("mve", (ARM::AEK_DSP | ARM::AEK_SIMD), "+mve", "-mve")
-ARM_ARCH_EXT_NAME("mve.fp", (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP),
+ARM_ARCH_EXT_NAME("mve", (ARM::AEK_DSP | ARM::AEK_MVE), "+mve", "-mve")
+ARM_ARCH_EXT_NAME("mve.fp", (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP),
"+mve.fp", "-mve.fp")
ARM_ARCH_EXT_NAME("idiv", (ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB), {}, {})
ARM_ARCH_EXT_NAME("mp", ARM::AEK_MP, {}, {})
-ARM_ARCH_EXT_NAME("simd", ARM::AEK_SIMD, {}, {})
+ARM_ARCH_EXT_NAME("simd", ARM::AEK_SIMD, "+neon", "-neon")
ARM_ARCH_EXT_NAME("sec", ARM::AEK_SEC, {}, {})
ARM_ARCH_EXT_NAME("virt", ARM::AEK_VIRT, {}, {})
ARM_ARCH_EXT_NAME("fp16", ARM::AEK_FP16, "+fullfp16", "-fullfp16")
@@ -334,8 +334,8 @@ ARM_CPU_NAME("cortex-r7", ARMV7R, FK_VFPV3_D16_FP16, false,
(ARM::AEK_MP | ARM::AEK_HWDIVARM))
ARM_CPU_NAME("cortex-r8", ARMV7R, FK_VFPV3_D16_FP16, false,
(ARM::AEK_MP | ARM::AEK_HWDIVARM))
-ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_NONE)
-ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_NONE)
+ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
+ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD)
ARM_CPU_NAME("sc300", ARMV7M, FK_NONE, false, ARM::AEK_NONE)
ARM_CPU_NAME("cortex-m3", ARMV7M, FK_NONE, true, ARM::AEK_NONE)
ARM_CPU_NAME("cortex-m4", ARMV7EM, FK_FPV4_SP_D16, true, ARM::AEK_NONE)
@@ -345,12 +345,12 @@ ARM_CPU_NAME("cortex-m33", ARMV8MMainline, FK_FPV5_SP_D16, false, ARM::AEK_DSP)
ARM_CPU_NAME("star-mc1", ARMV8MMainline, FK_FPV5_SP_D16, false, ARM::AEK_DSP)
ARM_CPU_NAME("cortex-m35p", ARMV8MMainline, FK_FPV5_SP_D16, false, ARM::AEK_DSP)
ARM_CPU_NAME("cortex-m55", ARMV8_1MMainline, FK_FP_ARMV8_FULLFP16_D16, false,
- (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP | ARM::AEK_FP16))
+ (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP | ARM::AEK_FP16))
ARM_CPU_NAME("cortex-m85", ARMV8_1MMainline, FK_FP_ARMV8_FULLFP16_D16, false,
- (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP | ARM::AEK_FP16 |
+ (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP | ARM::AEK_FP16 |
ARM::AEK_RAS | ARM::AEK_PACBTI))
ARM_CPU_NAME("cortex-m52", ARMV8_1MMainline, FK_FP_ARMV8_FULLFP16_D16, false,
- (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP | ARM::AEK_FP16 |
+ (ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP | ARM::AEK_FP16 |
ARM::AEK_RAS | ARM::AEK_PACBTI))
ARM_CPU_NAME("cortex-a32", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false, ARM::AEK_CRC)
ARM_CPU_NAME("cortex-a35", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false, ARM::AEK_CRC)
diff --git a/llvm/include/llvm/TargetParser/ARMTargetParser.h b/llvm/include/llvm/TargetParser/ARMTargetParser.h
index 5dbcfd3d2d693..b2403f42f1b79 100644
--- a/llvm/include/llvm/TargetParser/ARMTargetParser.h
+++ b/llvm/include/llvm/TargetParser/ARMTargetParser.h
@@ -61,6 +61,7 @@ enum ArchExtKind : uint64_t {
AEK_CDECP6 = 1 << 28,
AEK_CDECP7 = 1 << 29,
AEK_PACBTI = 1 << 30,
+ AEK_MVE = 1ULL << 31,
// Unsupported extensions.
AEK_OS = 1ULL << 59,
AEK_IWMMXT = 1ULL << 60,
diff --git a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
index d6a586e1f247b..79b54e6b04330 100644
--- a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
+++ b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
@@ -12971,7 +12971,7 @@ bool ARMAsmParser::enableArchExtFeature(StringRef Name, SMLoc &ExtLoc) {
{ARM::AEK_CRYPTO,
{Feature_HasV8Bit},
{ARM::FeatureCrypto, ARM::FeatureNEON, ARM::FeatureFPARMv8}},
- {(ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP),
+ {(ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP),
{Feature_HasV8_1MMainlineBit},
{ARM::HasMVEFloatOps}},
{ARM::AEK_FP,
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
index b0fa03a35ec04..632dbebf58f04 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
@@ -248,7 +248,7 @@ void ARMTargetStreamer::emitTargetAttributes(const MCSubtargetInfo &STI) {
emitFPU(STI.hasFeature(ARM::FeatureFP64) ? ARM::FK_FPV5_D16
: ARM::FK_FPV5_SP_D16);
if (STI.hasFeature(ARM::HasMVEFloatOps))
- emitArchExtension(ARM::AEK_SIMD | ARM::AEK_DSP | ARM::AEK_FP);
+ emitArchExtension(ARM::AEK_MVE | ARM::AEK_DSP | ARM::AEK_FP);
}
} else if (STI.hasFeature(ARM::FeatureVFP4_D16_SP))
emitFPU(STI.hasFeature(ARM::FeatureD32)
diff --git a/llvm/lib/TargetParser/ARMTargetParser.cpp b/llvm/lib/TargetParser/ARMTargetParser.cpp
index 8f9753775c204..a7a895d872668 100644
--- a/llvm/lib/TargetParser/ARMTargetParser.cpp
+++ b/llvm/lib/TargetParser/ARMTargetParser.cpp
@@ -657,6 +657,10 @@ void ARM::PrintSupportedExtensions(StringMap<StringRef> DescMap) {
// Extensions without a feature cannot be used with -march.
if (!Ext.Feature.empty()) {
std::string Description = DescMap[Ext.Name].str();
+ // With SIMD, this links to the NEON feature, so the description should be
+ // taken from here, as SIMD does not exist in TableGen.
+ if (Ext.Name == "simd")
+ Description = DescMap["neon"].str();
outs() << " "
<< format(Description.empty() ? "%s\n" : "%-20s%s\n",
Ext.Name.str().c_str(), Description.c_str());
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 5d771a1a153f7..2a97d1c80ce8d 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -318,14 +318,14 @@ INSTANTIATE_TEST_SUITE_P(
ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
"7-R"),
ARMCPUTestParams<uint64_t>("cortex-r52", "armv8-r", "neon-fp-armv8",
- ARM::AEK_NONE | ARM::AEK_CRC | ARM::AEK_MP |
- ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
- ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
+ ARM::AEK_CRC | ARM::AEK_MP | ARM::AEK_VIRT |
+ ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
+ ARM::AEK_DSP | ARM::AEK_SIMD,
"8-R"),
ARMCPUTestParams<uint64_t>("cortex-r52plus", "armv8-r", "neon-fp-armv8",
- ARM::AEK_NONE | ARM::AEK_CRC | ARM::AEK_MP |
- ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
- ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
+ ARM::AEK_CRC | ARM::AEK_MP | ARM::AEK_VIRT |
+ ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
+ ARM::AEK_DSP | ARM::AEK_SIMD,
"8-R"),
ARMCPUTestParams<uint64_t>("sc300", "armv7-m", "none",
ARM::AEK_NONE | ARM::AEK_HWDIVTHUMB, "7-M"),
@@ -507,17 +507,17 @@ INSTANTIATE_TEST_SUITE_P(
"8-M.Mainline"),
ARMCPUTestParams<uint64_t>(
"cortex-m55", "armv8.1-m.main", "fp-armv8-fullfp16-d16",
- ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP |
ARM::AEK_RAS | ARM::AEK_LOB | ARM::AEK_FP16,
"8.1-M.Mainline"),
ARMCPUTestParams<uint64_t>(
"cortex-m85", "armv8.1-m.main", "fp-armv8-fullfp16-d16",
- ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP |
ARM::AEK_RAS | ARM::AEK_LOB | ARM::AEK_FP16 | ARM::AEK_PACBTI,
"8.1-M.Mainline"),
ARMCPUTestParams<uint64_t>(
"cortex-m52", "armv8.1-m.main", "fp-armv8-fullfp16-d16",
- ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_MVE | ARM::AEK_FP |
ARM::AEK_RAS | ARM::AEK_LOB | ARM::AEK_FP16 | ARM::AEK_PACBTI,
"8.1-M.Mainline"),
ARMCPUTestParams<uint64_t>("iwmmxt", "iwmmxt", "none", ARM::AEK_NONE,
@@ -801,7 +801,7 @@ TEST(TargetParserTest, ARMArchExtFeature) {
{"fp", "nofp", nullptr, nullptr},
{"idiv", "noidiv", nullptr, nullptr},
{"mp", "nomp", nullptr, nullptr},
- {"simd", "nosimd", nullptr, nullptr},
+ {"simd", "nosimd", "+neon", "-neon"},
{"sec", "nosec", nullptr, nullptr},
{"virt", "novirt", nullptr, nullptr},
{"fp16", "nofp16", "+fullfp16", "-fullfp16"},
@@ -1046,7 +1046,6 @@ TEST(TargetParserTest, ARMPrintSupportedExtensions) {
EXPECT_EQ(std::string::npos, captured.find("invalid"));
// Should not include anything that lacks a feature name. Checking a few here
// but not all as if one is hidden correctly the rest should be.
- EXPECT_EQ(std::string::npos, captured.find("simd"));
EXPECT_EQ(std::string::npos, captured.find("maverick"));
EXPECT_EQ(std::string::npos, captured.find("xscale"));
}
|
typo in commit message: s/emable/enable/ |
llvm/docs/ReleaseNotes.md
Outdated
@@ -85,6 +85,9 @@ Changes to the AMDGPU Backend | |||
|
|||
Changes to the ARM Backend | |||
-------------------------- | |||
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with | |||
AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now | |
AArch32 targets, however this will now disable NEON instructions being generated. The `simd` option is also now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
llvm/docs/ReleaseNotes.md
Outdated
@@ -85,6 +85,9 @@ Changes to the AMDGPU Backend | |||
|
|||
Changes to the ARM Backend | |||
-------------------------- | |||
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with | |||
AArch32 targets, however will now disable NEON instructions being generated. The `simd` is also now | |||
printed when the `--print-supported-extensions` option is used.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra trailing period character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
enable |
The description is fixed. Thanks @DavidSpickett and @jthackray (I will publish a commit to your improvements once I have a green CI so I know all is good). |
4e50bd3
to
8d28d13
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fixes. LGTM now.
Thanks @jthackray. The CI Failure seems to be unrelated. |
llvm/docs/ReleaseNotes.md
Outdated
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with | ||
AArch32 targets, however this will now disable NEON instructions being generated. The `simd` option is | ||
also now printed when the `--print-supported-extensions` option is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it should be in the clang release notes, not the llvm part? It looks like it is talking about altering the clang interface via +nosimd and print-supported-extensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I must have put it here as the change has to be made in LLVM but you are correct, +nosimd
is a clang option rather than LLVM. I have updated this.
ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD) | ||
ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these two special-cased to include AEK_SIMD, unlike all the other cpus where it comes from FK_NEON_FP_ARMV8? Am I right in saying that removing it would not affect how the command line for -mcpu=cortex-r52 behaves? Or does it change the defaults in some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We support NEON on Cortex-R52 by default, rather than as an optional feature. When +simd
/+nosimd
was linked to +neon
and -neon
, I found that once the link between simd
and neon
was made, it was not passing -target-feature +neon
as part of the cc1 command.
Let me investigate and see if we need this change or if it can determine +neon
from the FPU, if not we may need to add AEK_SIMD
to everything that should support it as it may not just be cortex-r52 that is affected here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, yes this is needed. Because simd
is now linked to neon
, if it is not explicitly defined for each Architecture/CPU that supports NEON, -target-feature -neon
is passed by the compiler, disabling the feature. This is because when it parses the .def
file, that CPU/Architecture is told SIMD is not supported as it is not included. I think it was a fluke it worked before, and it picked it up from the FPU in the backend, but the responsibility for determining if Neon is supported now lies with the front end with this change, so we need to explicitly define it. I will provide an update in due course that defines it for the Architecture's and CPU's that support NEON and tests to support it. I will also update the description of the PR to state this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would have expected neon vs non-neon to come from the fpu option / default, under Arm. We (essentially) default to -mfpu=auto but it can get quite complex with all the different ways of specifying it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am posing this a question rather than a statement as I am not sure of the answer, but could LLVM be taking the target features defined as a -target-feature
option over the features that come from the fpu? Reason I ask is before this change, -target-feature +neon
never appeared for targets when I added -###
to the command, but now they are. If neon is not defined for that target, it passes -target-feature -neon
because the ID of ARM::AEK_SIMD
is not included in the features from the ARMTargetFeatures.def
file.
That could explain why we need this change.
8d28d13
to
ece4287
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments @davemgreen. I need to do some more investigation here as I think there is more going wrong than I initially thought after this change is introduced.
llvm/docs/ReleaseNotes.md
Outdated
* The `+nosimd` attribute is now fully supported. Previously, this had no effect when being used with | ||
AArch32 targets, however this will now disable NEON instructions being generated. The `simd` option is | ||
also now printed when the `--print-supported-extensions` option is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I must have put it here as the change has to be made in LLVM but you are correct, +nosimd
is a clang option rather than LLVM. I have updated this.
ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD) | ||
ARM_CPU_NAME("cortex-r52plus", ARMV8R, FK_NEON_FP_ARMV8, false, ARM::AEK_SIMD) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We support NEON on Cortex-R52 by default, rather than as an optional feature. When +simd
/+nosimd
was linked to +neon
and -neon
, I found that once the link between simd
and neon
was made, it was not passing -target-feature +neon
as part of the cc1 command.
Let me investigate and see if we need this change or if it can determine +neon
from the FPU, if not we may need to add AEK_SIMD
to everything that should support it as it may not just be cortex-r52 that is affected here.
I have just pushed an update for the Unit Tests that should turn the CI green. This is now ready for re-review @davemgreen |
Sorry for the delay, my computer got very slow at building things. - What goes wrong if ARM::AEK_SIMD is removed from the CPU and architecture definitions? If it is needed then there are some other cpu's where it might need to be added too. But I'm not sure what needs it. (Target parsing gets quite complex and isn't nearly as well documented as it needs to be. We could do with something that explains how it is meant to work, at a high level so we can have a coherent design). |
No worries 😀 If we remove Looking at the TableGen for ArmV7, we include |
NEON is never mandatory AFAIU in the architecture (FP too). We might assume it to be present though, as I believe it comes from the default -mfpu. (For example FK_CRYPTO_NEON_FP_ARMV8 from armv8-a). Using something like this: https://godbolt.org/z/EKEMsaMdW. If I take this patch and remove the ARM::AEK_SIMD from the ARM_ARCH and ARM_CPU_NAME definitions it still seems to do OK. Is there some other reason that AEK_SIMD needs to be added to them? I tried some combinations of -march=..[+[no]fp] and -mfpu=... but they seemed OK and the tests passed. ARM::getFPUFeatures gets used in quite a few places though so something else might be going wrong. |
Ok, thats interesting to know. To develop this, I have been using an assembly file with various AdvSIMD instructions, when ARM::AEK_SIMD is removed from the definitions I have found that it can no longer parse the instructions, giving an error that NEON is required. Once ARM::AEK_SIMD has been added to the definitions this keeps working. It's entirely possible I have missed a change that needs making to the AsmParser, and this may be a better solution, but I found this change was required for the AsmParser to correctly determine that NEON was supported for that target. |
33d7771
to
98ae944
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - this looks good.
if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) && | ||
getARMSubArchVersionNumber(Triple) >= 7) { | ||
FPUKind = llvm::ARM::parseFPU("neon"); | ||
(void)llvm::ARM::getFPUFeatures(FPUKind, Features); | ||
} else if ((!Generic) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this now be else if (!ForAS || !Generic) {
. I wasn't sure why armv7 is special? This should mean that -march=armv8-m.main
for example works as it did before, and as far as I can tell the versions with cpus remain the same too. (i.e. clang/test/Driver/armv8.1m.main.s should no longer change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a change made in #122095 that ensured NEON was enabled for generic CPU's on those Triple's as this was not the case previously. Originally I had it as else if (!Generic)
as having !ForAS
is what was stopping the Driver from collecting the features from the FPU but I found it was not collecting all expected features for specific targets and tests were failing. Let me go back and try a different approach here to get it to be just (!Generic)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection, this should have been an else block as the FPU was always handled so this did change behaviour.
cc1as previously never handled the FPU at all, either in the Driver stage or once it had been called, so this is why clang/test/Driver/armv8.1m.main.s
has changed, as previously the FPU's features were missed completely. They would be used in cc1 as that considers the FPU.
// used and tested for VFP and NEON Support | ||
|
||
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0 | ||
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null -### 2> %t | FileCheck --check-prefix=CHECK-TARGET-FEATURES < %t %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might need to either pipe &2 into stdout, or cat %t | FileCheck..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
clang/docs/ReleaseNotes.rst
Outdated
- For ARM targets, when using cc1as, the features included in the selected CPU or | ||
Arch's FPU are now loaded and utilized. If you wish not to use a specific feature, | ||
this will need appending to the command line used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What, from a user facing point of view, changes with the new patch? My understanding is that we pass the fpu features from the driver->cc1as now as opposed to calulcating them in cc1as, but it doesn't alter the behaviour otherwise? The examples I tried all work as before, but there are a lot of different possibilities so I might be missing one again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there were some cases such as NEON which now are detected when using cc1as
that were not before as I am still not 100% sure if cc1as got the features from the FPU once inside cc1as
. I could not find anywhere that this was done anyway.
This change could lead to a user needing to disable features if they wish to do so. However, with assembler I don't think that is as important as if this was cc1
as the assembly instructions will already be defined. This entry was more to ensure people were aware incase there was some confusion as to what this change was and its potential impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @davemgreen for the re-review. I have made some updated based on your comments.
if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) && | ||
getARMSubArchVersionNumber(Triple) >= 7) { | ||
FPUKind = llvm::ARM::parseFPU("neon"); | ||
(void)llvm::ARM::getFPUFeatures(FPUKind, Features); | ||
} else if ((!Generic) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a change made in #122095 that ensured NEON was enabled for generic CPU's on those Triple's as this was not the case previously. Originally I had it as else if (!Generic)
as having !ForAS
is what was stopping the Driver from collecting the features from the FPU but I found it was not collecting all expected features for specific targets and tests were failing. Let me go back and try a different approach here to get it to be just (!Generic)
if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) && | ||
getARMSubArchVersionNumber(Triple) >= 7) { | ||
FPUKind = llvm::ARM::parseFPU("neon"); | ||
(void)llvm::ARM::getFPUFeatures(FPUKind, Features); | ||
} else if ((!Generic) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection, this should have been an else block as the FPU was always handled so this did change behaviour.
cc1as previously never handled the FPU at all, either in the Driver stage or once it had been called, so this is why clang/test/Driver/armv8.1m.main.s
has changed, as previously the FPU's features were missed completely. They would be used in cc1 as that considers the FPU.
// used and tested for VFP and NEON Support | ||
|
||
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0 | ||
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null -### 2> %t | FileCheck --check-prefix=CHECK-TARGET-FEATURES < %t %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
clang/docs/ReleaseNotes.rst
Outdated
- For ARM targets, when using cc1as, the features included in the selected CPU or | ||
Arch's FPU are now loaded and utilized. If you wish not to use a specific feature, | ||
this will need appending to the command line used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there were some cases such as NEON which now are detected when using cc1as
that were not before as I am still not 100% sure if cc1as got the features from the FPU once inside cc1as
. I could not find anywhere that this was done anyway.
This change could lead to a user needing to disable features if they wish to do so. However, with assembler I don't think that is as important as if this was cc1
as the assembly instructions will already be defined. This entry was more to ensure people were aware incase there was some confusion as to what this change was and its potential impact.
// activation of NEON for supported targets. The Cortex-R52 will be | ||
// used and tested for VFP and NEON Support | ||
|
||
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires a // REQUIRES: arm-registered-target
(like armv8.1.m.main.s
has), as this actually does try to compile the code.
Actually compiling code in Clang driver tests would ideally not be done (we should ideally only inspect the generated command lines), but we do seem to have some precedent for it already in armv8.1.m.main.s
, and especially for the implicit behaviours that aren't necessarily visible in the command line, I guess there's no other good alternative.
So I guess this kind of test is fine, but add the REQUIRES
line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
…ts (llvm#122095)" This reverts commit 8fa0f0e. This change broke assembling for e.g. "armv7s-apple-darwin" triples, which should enable VFPv4 by default (and did that before this change), but after this change, only NEON/VFPv3 were available. This is being fixed properly in latest git main as part of llvm#130623 (possibly as a split out change), but any proper fix here seems to have too much potential surprises for an existing release branch.
…ts (llvm#122095)" This reverts commit 8fa0f0e. This change broke assembling for e.g. "armv7s-apple-darwin" triples, which should enable VFPv4 by default (and did that before this change), but after this change, only NEON/VFPv3 were available. This is being fixed properly in latest git main as part of llvm#130623 (possibly as a split out change), but any proper fix here seems to have too much potential surprises for an existing release branch.
…134612) Previously, `cc1as` did not consider the Features that can be included from a target's FPU. This could lead to a situation where assembly files could not compile as cc1as did not know if a feature was supported. With this change, all the features for the FPU will be passed to `cc1as` as `-target-feature` lines. By making this change, it will enable `+nosimd` to be functional, worked on in #130623, and fix a regression introduced in 8fa0f0e so armv7s-apple-darwin targets can utilise VFPv4 correctly. --------- Co-authored-by: Martin Storsjö <[email protected]>
Previously, the use of MVE or MVE.FP would be defined by using the `ARM::AEK_SIMD` identifier. SIMD relates to the Cortex-A and Cortex-R extension that enables NEON instructions, which is called MVE for Cortex-M. To enable the linking of `+simd` and `+nosimd` to `+neon` and `-neon` when using clang, MVE and MVE.FP can now be defined using a unique enum identifier, rather than using the existing identifier for SIMD. This was originally planned to be merged as part of llvm#130296 but the changes made `+nosimd` an invalid argument, which, while not having any functionality, was allowed in previous versions of LLVM. To avoid regressions being introduced, this has been combined with the fix for `+nosimd` on AArch32.
`+simd` and `+nosimd` are used to enable or disable NEON Instructions when compiling for AArch32 Targets. However, up until now, using these has not been possible. To enable this, these options are mapped to the relevant LLVM backend option (`+neon` and `-neon`) so it can be both enabled and disabled successfully by the user. Tests have been added to ensure this behaviour is maintained in the future, along with updates to existing tests as behaviour has now changed relating to the use of `+simd` and `+nosimd`. As `simd` has been mapped within the ARMTargetParser.def, support for this extension is also added for the `--print-support-extensions` command when the target is AArch32. This will print the `simd` option, along with the description that relates to the Neon feature. This previously was not possible as `simd` did not have a related Feature or Negative Feature.
To ensure NEON is activated for the ArmV8-A cores and beyond, ARM::AEK_SIMD needs to be added for the appropriate architectures. Appropriate tests have been added to support this change.
Previously, FPU features were not collected when forming a list of features for the Assembler. This also allows NEON to be pulled from the FPU support rather than having it hardcoded into the Target Parser Definitions.
9690bf6
to
0e0b4b5
Compare
I have rebased this now after #134612 has been merged. This PR is only now looking at enabling +nosimd for ARM targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. LGTM
clang/docs/ReleaseNotes.rst
Outdated
@@ -419,6 +419,9 @@ Bug Fixes to Attribute Support | |||
- No longer crashing on ``__attribute__((align_value(N)))`` during template | |||
instantiation when the function parameter type is not a pointer or reference. | |||
(#GH26612) | |||
- The ``+nosimd`` attribute is now fully supported for AArch32. Previously, this had no effect when being used with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a comment be added to the arm section too / instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have moved it to the ARM section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @davemgreen for the review.
clang/docs/ReleaseNotes.rst
Outdated
@@ -419,6 +419,9 @@ Bug Fixes to Attribute Support | |||
- No longer crashing on ``__attribute__((align_value(N)))`` during template | |||
instantiation when the function parameter type is not a pointer or reference. | |||
(#GH26612) | |||
- The ``+nosimd`` attribute is now fully supported for AArch32. Previously, this had no effect when being used with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have moved it to the ARM section.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/9905 Here is the relevant piece of the build log for the reference
|
…lvm#134612) Previously, `cc1as` did not consider the Features that can be included from a target's FPU. This could lead to a situation where assembly files could not compile as cc1as did not know if a feature was supported. With this change, all the features for the FPU will be passed to `cc1as` as `-target-feature` lines. By making this change, it will enable `+nosimd` to be functional, worked on in llvm#130623, and fix a regression introduced in 8fa0f0e so armv7s-apple-darwin targets can utilise VFPv4 correctly. --------- Co-authored-by: Martin Storsjö <[email protected]>
`+simd` and `+nosimd` are used to enable or disable NEON Instructions when compiling for ARM Targets. However, up until now, using these has not been possible. To enable this, these options are mapped to the relevant LLVM backend option (`+neon` and `-neon`) so it can be both enabled and disabled successfully by the user. Tests have been added to ensure this behaviour is maintained in the future, along with updates to existing tests as behaviour has now changed relating to the use of `+simd` and `+nosimd`. As `simd` has been mapped within the ARMTargetParser.def, support for this extension is also added for the `--print-support-extensions` command when the target is AArch32. This will print the `simd` option, along with the description that relates to the Neon feature. This previously was not possible as `simd` did not have a related Feature or Negative Feature. To make this functional as intended, MVE and MVE.FP now rely on their own Enum identifier, rather than `AEK_SIMD`. While SIMD does refer to both Neon and Helium technologies, in terms of command line options, SIMD relates to Neon. Helium relates to MVE and MVE.FP. The Enum now reflects this too.
In llvm#130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEONS is disabled in the command line, it did not disable features that have NEON as a requirement, such as Crypto or AES. To achieve this, clang will now check if SIMD has been disabled when using a NEON supported FPU. If this is the case, it will disable all features that depend on NEON. While working on this Patch, I spotted that for features that rely on NEON, when used, do not enable NEON in the Driver. This meant that when using AES for example, NEON would not be activated. NEON is required when using features such as AES, so this is now enabled when using such features.
In #130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEON is disabled in the command line, it did not disable features that depend on this, such as Crypto or AES. To achieve this, This PR does the following: - Ensure that disabling NEON (e.g., via +nosimd) also disables dependent features like Crypto and AES. - Update the driver to automatically enable NEON when enabling features that require it (e.g., AES). This fixes inconsistent behavior where features relying on NEON could be enabled without NEON itself being active, or where disabling NEON left dependent features incorrectly enabled.
) In llvm#130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEON is disabled in the command line, it did not disable features that depend on this, such as Crypto or AES. To achieve this, This PR does the following: - Ensure that disabling NEON (e.g., via +nosimd) also disables dependent features like Crypto and AES. - Update the driver to automatically enable NEON when enabling features that require it (e.g., AES). This fixes inconsistent behavior where features relying on NEON could be enabled without NEON itself being active, or where disabling NEON left dependent features incorrectly enabled.
) In llvm#130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEON is disabled in the command line, it did not disable features that depend on this, such as Crypto or AES. To achieve this, This PR does the following: - Ensure that disabling NEON (e.g., via +nosimd) also disables dependent features like Crypto and AES. - Update the driver to automatically enable NEON when enabling features that require it (e.g., AES). This fixes inconsistent behavior where features relying on NEON could be enabled without NEON itself being active, or where disabling NEON left dependent features incorrectly enabled.
) In llvm#130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEON is disabled in the command line, it did not disable features that depend on this, such as Crypto or AES. To achieve this, This PR does the following: - Ensure that disabling NEON (e.g., via +nosimd) also disables dependent features like Crypto and AES. - Update the driver to automatically enable NEON when enabling features that require it (e.g., AES). This fixes inconsistent behavior where features relying on NEON could be enabled without NEON itself being active, or where disabling NEON left dependent features incorrectly enabled.
) In llvm#130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEON is disabled in the command line, it did not disable features that depend on this, such as Crypto or AES. To achieve this, This PR does the following: - Ensure that disabling NEON (e.g., via +nosimd) also disables dependent features like Crypto and AES. - Update the driver to automatically enable NEON when enabling features that require it (e.g., AES). This fixes inconsistent behavior where features relying on NEON could be enabled without NEON itself being active, or where disabling NEON left dependent features incorrectly enabled.
) In llvm#130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEON is disabled in the command line, it did not disable features that depend on this, such as Crypto or AES. To achieve this, This PR does the following: - Ensure that disabling NEON (e.g., via +nosimd) also disables dependent features like Crypto and AES. - Update the driver to automatically enable NEON when enabling features that require it (e.g., AES). This fixes inconsistent behavior where features relying on NEON could be enabled without NEON itself being active, or where disabling NEON left dependent features incorrectly enabled.
) In llvm#130623 support was added for `+nosimd` in the clang driver. Following this PR, it was discovered that, if NEON is disabled in the command line, it did not disable features that depend on this, such as Crypto or AES. To achieve this, This PR does the following: - Ensure that disabling NEON (e.g., via +nosimd) also disables dependent features like Crypto and AES. - Update the driver to automatically enable NEON when enabling features that require it (e.g., AES). This fixes inconsistent behavior where features relying on NEON could be enabled without NEON itself being active, or where disabling NEON left dependent features incorrectly enabled.
…arwin targets (#122095)" This reverts commit 8fa0f0e. This change broke assembling for e.g. "armv7s-apple-darwin" triples, which should enable VFPv4 by default (and did that before this change), but after this change, only NEON/VFPv3 were available. This is being fixed properly in latest git main as part of llvm/llvm-project#130623 (possibly as a split out change), but any proper fix here seems to have too much potential surprises for an existing release branch.
+simd
and+nosimd
are used to enable or disable NEON Instructionswhen compiling for AArch32 Targets. However, up until now, using these
has not been possible. To enable this, these options are mapped to the
relevant LLVM backend option (
+neon
and-neon
) so it can be bothenabled and disabled successfully by the user.
Tests have been added to ensure this behaviour is maintained in the future,
along with updates to existing tests as behaviour has now changed relating
to the use of
+simd
and+nosimd
.As
simd
has been mapped within the ARMTargetParser.def, support for thisextension is also added for the
--print-support-extensions
command whenthe target is AArch32. This will print the
simd
option, along with thedescription that relates to the Neon feature. This previously was not
possible as
simd
did not have a related Feature or Negative Feature.To make this functional as intended, MVE and MVE.FP now rely on their own
Enum identifier, rather than
AEK_SIMD
. While SIMD does refer to bothNeon and Helium technologies, in terms of command line options, SIMD relates
to Neon. Helium relates to MVE and MVE.FP. The Enum now reflects this too.