Skip to content

[AArch64] Implement __builtin_cpu_supports, compiler-rt tests. #82378

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 22, 2024

Conversation

ilinpv
Copy link
Contributor

@ilinpv ilinpv commented Feb 20, 2024

The patch complements #68919 and adds AArch64 support for builtin
__builtin_cpu_supports("feature1+...+featureN")
which return true if all specified CPU features in argument are detected. Also compiler-rt aarch64 native run tests for features detection mechanism were added and 'cpu_model' check was fixed after its refactor merged #75635 Original RFC was https://reviews.llvm.org/D153153

@llvmbot llvmbot added clang Clang issues not falling into any other category compiler-rt backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. compiler-rt:builtins labels Feb 20, 2024
@llvmbot
Copy link
Member

llvmbot commented Feb 20, 2024

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Pavel Iliin (ilinpv)

Changes

The patch complements #68919 and adds AArch64 support for builtin
__builtin_cpu_supports("feature1+...+featureN")
which return true if all specified CPU features in argument are detected. Also compiler-rt aarch64 native run tests for features detection mechanism were added and 'cpu_model' check was fixed after its refactor merged #75635 Original RFC was https://reviews.llvm.org/D153153


Full diff: https://github.com/llvm/llvm-project/pull/82378.diff

11 Files Affected:

  • (modified) clang/lib/Basic/Targets/AArch64.cpp (+7-1)
  • (modified) clang/lib/Basic/Targets/AArch64.h (+1-1)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+16)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+1-1)
  • (added) clang/test/CodeGen/aarch64-cpu-supports-target.c (+52)
  • (added) clang/test/CodeGen/aarch64-cpu-supports.c (+54)
  • (modified) clang/test/Preprocessor/has_builtin_cpuid.c (+1-6)
  • (added) clang/test/Sema/aarch64-cpu-supports.c (+26)
  • (modified) clang/test/Sema/builtin-cpu-supports.c (+1-1)
  • (added) compiler-rt/test/builtins/Unit/aarch64_cpu_features_test.c (+17)
  • (modified) compiler-rt/test/builtins/Unit/cpu_model_test.c (+1-1)
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 68032961451d90..5abb060073c517 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -667,7 +667,13 @@ StringRef AArch64TargetInfo::getFeatureDependencies(StringRef Name) const {
 }
 
 bool AArch64TargetInfo::validateCpuSupports(StringRef FeatureStr) const {
-  return llvm::AArch64::parseArchExtension(FeatureStr).has_value();
+  // CPU features might be separated by '+', extract them and check
+  llvm::SmallVector<StringRef, 8> Features;
+  FeatureStr.split(Features, "+");
+  for (auto &Feature : Features)
+    if (!llvm::AArch64::parseArchExtension(Feature.trim()).has_value())
+      return false;
+  return true;
 }
 
 bool AArch64TargetInfo::hasFeature(StringRef Feature) const {
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 26ee7fa1978256..c1ba156860a122 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -165,7 +165,7 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
                             DiagnosticsEngine &Diags) override;
   ParsedTargetAttr parseTargetAttr(StringRef Str) const override;
   bool supportsTargetAttributeTune() const override { return true; }
-
+  bool supportsCpuSupports() const override { return true; }
   bool checkArithmeticFenceSupported() const override { return true; }
 
   bool hasBFloat16Type() const override;
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index d454ccc1dd8613..b2af45719d00ec 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -10638,6 +10638,9 @@ Value *CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID,
       BuiltinID <= clang::AArch64::LastSMEBuiltin)
     return EmitAArch64SMEBuiltinExpr(BuiltinID, E);
 
+  if (BuiltinID == Builtin::BI__builtin_cpu_supports)
+    return EmitAArch64CpuSupports(E);
+
   unsigned HintID = static_cast<unsigned>(-1);
   switch (BuiltinID) {
   default: break;
@@ -14025,6 +14028,19 @@ Value *CodeGenFunction::EmitX86CpuInit() {
   return Builder.CreateCall(Func);
 }
 
+Value *CodeGenFunction::EmitAArch64CpuSupports(const CallExpr *E) {
+  const Expr *ArgExpr = E->getArg(0)->IgnoreParenCasts();
+  StringRef ArgStr = cast<StringLiteral>(ArgExpr)->getString();
+  llvm::SmallVector<StringRef, 8> Features;
+  ArgStr.split(Features, "+");
+  for (auto &Feature : Features) {
+    Feature = Feature.trim();
+    if (Feature != "default")
+      Features.push_back(Feature);
+  }
+  return EmitAArch64CpuSupports(Features);
+}
+
 llvm::Value *
 CodeGenFunction::EmitAArch64CpuSupports(ArrayRef<StringRef> FeaturesStrs) {
   uint64_t FeaturesMask = llvm::AArch64::getCpuSupportsMask(FeaturesStrs);
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index caa6a327550baa..92ce0edeaf9e9c 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -5013,10 +5013,10 @@ class CodeGenFunction : public CodeGenTypeCache {
   llvm::Value *EmitAArch64CpuInit();
   llvm::Value *
   FormAArch64ResolverCondition(const MultiVersionResolverOption &RO);
+  llvm::Value *EmitAArch64CpuSupports(const CallExpr *E);
   llvm::Value *EmitAArch64CpuSupports(ArrayRef<StringRef> FeatureStrs);
 };
 
-
 inline DominatingLLVMValue::saved_type
 DominatingLLVMValue::save(CodeGenFunction &CGF, llvm::Value *value) {
   if (!needsSaving(value)) return saved_type(value, false);
diff --git a/clang/test/CodeGen/aarch64-cpu-supports-target.c b/clang/test/CodeGen/aarch64-cpu-supports-target.c
new file mode 100644
index 00000000000000..e023944b24e53a
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-cpu-supports-target.c
@@ -0,0 +1,52 @@
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -emit-llvm -o - %s | FileCheck %s
+
+int check_all_feature() {
+  if (__builtin_cpu_supports("rng+flagm+flagm2+fp16fml+dotprod+sm4"))
+    return 1;
+  else if (__builtin_cpu_supports("rdm+lse+fp+simd+crc+sha1+sha2+sha3"))
+    return 2;
+  else if (__builtin_cpu_supports("aes+pmull+fp16+dit+dpb+dpb2+jscvt"))
+    return 3;
+  else if (__builtin_cpu_supports("fcma+rcpc+rcpc2+rcpc3+frintts+dgh"))
+    return 4;
+  else if (__builtin_cpu_supports("i8mm+bf16+ebf16+rpres+sve+sve-bf16"))
+    return 5;
+  else if (__builtin_cpu_supports("sve-ebf16+sve-i8mm+f32mm+f64mm"))
+    return 6;
+  else if (__builtin_cpu_supports("sve2+sve2-aes+sve2-pmull128"))
+    return 7;
+  else if (__builtin_cpu_supports("sve2-bitperm+sve2-sha3+sve2-sm4"))
+    return 8;
+  else if (__builtin_cpu_supports("sme+memtag+memtag2+memtag3+sb"))
+    return 9;
+  else if (__builtin_cpu_supports("predres+ssbs+ssbs2+bti+ls64+ls64_v"))
+    return 10;
+  else if (__builtin_cpu_supports("ls64_accdata+wfxt+sme-f64f64"))
+    return 11;
+  else if (__builtin_cpu_supports("sme-i16i64+sme2"))
+    return 12;
+  else
+    return 0;
+}
+
+// CHECK-LABEL: define dso_local i32 @neon_code() #1
+int __attribute__((target("simd"))) neon_code() { return 1; }
+
+// CHECK-LABEL: define dso_local i32 @sve_code() #2
+int __attribute__((target("sve"))) sve_code() { return 2; }
+
+// CHECK-LABEL: define dso_local i32 @code() #0
+int code() { return 3; }
+
+// CHECK-LABEL: define dso_local i32 @test_versions() #0
+int test_versions() {
+  if (__builtin_cpu_supports("sve"))
+    return sve_code();
+  else if (__builtin_cpu_supports("simd"))
+    return neon_code();
+  else
+    return code();
+}
+// CHECK: attributes #0 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
+// CHECK: attributes #1 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+neon" }
+// CHECK: attributes #2 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve" }
diff --git a/clang/test/CodeGen/aarch64-cpu-supports.c b/clang/test/CodeGen/aarch64-cpu-supports.c
new file mode 100644
index 00000000000000..872fec6827ef11
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-cpu-supports.c
@@ -0,0 +1,54 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-globals --version 2
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -emit-llvm -o - %s | FileCheck %s
+
+// CHECK: @__aarch64_cpu_features = external dso_local global { i64 }
+// CHECK-LABEL: define dso_local i32 @main
+// CHECK-SAME: () #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[RETVAL:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 0, ptr [[RETVAL]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = and i64 [[TMP0]], 70368744177664
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 70368744177664
+// CHECK-NEXT:    [[TMP3:%.*]] = and i1 true, [[TMP2]]
+// CHECK-NEXT:    br i1 [[TMP3]], label [[IF_THEN:%.*]], label [[IF_END:%.*]]
+// CHECK:       if.then:
+// CHECK-NEXT:    store i32 1, ptr [[RETVAL]], align 4
+// CHECK-NEXT:    br label [[RETURN:%.*]]
+// CHECK:       if.end:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
+// CHECK-NEXT:    [[TMP5:%.*]] = and i64 [[TMP4]], 9070970929152
+// CHECK-NEXT:    [[TMP6:%.*]] = icmp eq i64 [[TMP5]], 9070970929152
+// CHECK-NEXT:    [[TMP7:%.*]] = and i1 true, [[TMP6]]
+// CHECK-NEXT:    br i1 [[TMP7]], label [[IF_THEN1:%.*]], label [[IF_END2:%.*]]
+// CHECK:       if.then1:
+// CHECK-NEXT:    store i32 2, ptr [[RETVAL]], align 4
+// CHECK-NEXT:    br label [[RETURN]]
+// CHECK:       if.end2:
+// CHECK-NEXT:    [[TMP8:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
+// CHECK-NEXT:    [[TMP9:%.*]] = and i64 [[TMP8]], 166633186212708352
+// CHECK-NEXT:    [[TMP10:%.*]] = icmp eq i64 [[TMP9]], 166633186212708352
+// CHECK-NEXT:    [[TMP11:%.*]] = and i1 true, [[TMP10]]
+// CHECK-NEXT:    br i1 [[TMP11]], label [[IF_THEN3:%.*]], label [[IF_END4:%.*]]
+// CHECK:       if.then3:
+// CHECK-NEXT:    store i32 3, ptr [[RETVAL]], align 4
+// CHECK-NEXT:    br label [[RETURN]]
+// CHECK:       if.end4:
+// CHECK-NEXT:    store i32 0, ptr [[RETVAL]], align 4
+// CHECK-NEXT:    br label [[RETURN]]
+// CHECK:       return:
+// CHECK-NEXT:    [[TMP12:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:    ret i32 [[TMP12]]
+//
+int main(void) {
+  if (__builtin_cpu_supports("sb"))
+    return 1;
+
+  if (__builtin_cpu_supports("sve2-pmull128+memtag"))
+    return 2;
+
+  if (__builtin_cpu_supports("sme2+ls64_v+wfxt"))
+    return 3;
+
+  return 0;
+}
diff --git a/clang/test/Preprocessor/has_builtin_cpuid.c b/clang/test/Preprocessor/has_builtin_cpuid.c
index 8de6331e62d6e7..5204220ca92864 100644
--- a/clang/test/Preprocessor/has_builtin_cpuid.c
+++ b/clang/test/Preprocessor/has_builtin_cpuid.c
@@ -12,9 +12,4 @@
 # if defined(ARM) || defined(PPC)
 #   error "ARM/PPC shouldn't have __builtin_cpu_init"
 # endif
-#endif
-#if __has_builtin(__builtin_cpu_supports)
-# ifdef ARM
-#   error "ARM shouldn't have __builtin_cpu_supports"
-# endif
-#endif
+#endif
\ No newline at end of file
diff --git a/clang/test/Sema/aarch64-cpu-supports.c b/clang/test/Sema/aarch64-cpu-supports.c
new file mode 100644
index 00000000000000..24aae9542dbc42
--- /dev/null
+++ b/clang/test/Sema/aarch64-cpu-supports.c
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 -fsyntax-only -triple aarch64-linux-gnu -verify %s
+
+int test_aarch64_features(void) {
+  char * ssbs2;
+  // expected-error@+1 {{expression is not a string literal}}
+  if (__builtin_cpu_supports(ssbs2))
+    return 1;
+  // expected-error@+1 {{invalid cpu feature string}}
+  if (__builtin_cpu_supports(""))
+    return 2;
+  // expected-error@+1 {{invalid cpu feature string}}
+  if (__builtin_cpu_supports("pmull128"))
+    return 3;
+  // expected-error@+1 {{invalid cpu feature string}}
+  if (__builtin_cpu_supports("sve2,rpres"))
+    return 4;
+  // expected-error@+1 {{invalid cpu feature string}}
+  if (__builtin_cpu_supports("dgh+sve2-pmull"))
+    return 5;
+  // expected-error@+1 {{invalid cpu feature string}}
+  if (__builtin_cpu_supports("default"))
+    return 6;
+  if (__builtin_cpu_supports(" ssbs + bti "))
+    return 7;
+  return 0;
+}
diff --git a/clang/test/Sema/builtin-cpu-supports.c b/clang/test/Sema/builtin-cpu-supports.c
index cc6f1beb5d8a7c..733d797f3ff8f8 100644
--- a/clang/test/Sema/builtin-cpu-supports.c
+++ b/clang/test/Sema/builtin-cpu-supports.c
@@ -27,7 +27,7 @@ int main(void) {
   (void)__builtin_cpu_supports("x86-64-v4");
   (void)__builtin_cpu_supports("x86-64-v5"); // expected-error {{invalid cpu feature string for builtin}}
 #else
-  if (__builtin_cpu_supports("aes")) // expected-error {{builtin is not supported on this target}}
+  if (__builtin_cpu_supports("neon")) // expected-error {{invalid cpu feature string for builtin}}
     a("vsx");
 
   if (__builtin_cpu_is("cortex-x3")) // expected-error {{builtin is not supported on this target}}
diff --git a/compiler-rt/test/builtins/Unit/aarch64_cpu_features_test.c b/compiler-rt/test/builtins/Unit/aarch64_cpu_features_test.c
new file mode 100644
index 00000000000000..7ca2710ea27567
--- /dev/null
+++ b/compiler-rt/test/builtins/Unit/aarch64_cpu_features_test.c
@@ -0,0 +1,17 @@
+// REQUIRES: aarch64-target-arch
+// REQUIRES: native-run
+// RUN: %clang_builtins %s %librt -o %t && %run %t
+// REQUIRES: librt_has_aarch64
+int main(void) {
+  if (__builtin_cpu_supports("fp+simd+pmull+sha2+crc")) {
+    if (__builtin_cpu_supports("fp") && __builtin_cpu_supports("simd") &&
+        __builtin_cpu_supports("pmull") && __builtin_cpu_supports("sha2") &&
+        __builtin_cpu_supports("crc")) {
+      return 0;
+    } else {
+      // Something wrong in feature detection
+      return 1;
+    }
+  }
+  return 0;
+}
diff --git a/compiler-rt/test/builtins/Unit/cpu_model_test.c b/compiler-rt/test/builtins/Unit/cpu_model_test.c
index a8b736802f67be..6d5f17aa125657 100644
--- a/compiler-rt/test/builtins/Unit/cpu_model_test.c
+++ b/compiler-rt/test/builtins/Unit/cpu_model_test.c
@@ -1,6 +1,6 @@
 // REQUIRES: x86-target-arch
 // RUN: %clang_builtins %s %librt -o %t && %run %t
-// REQUIRES: librt_has_cpu_model
+// REQUIRES: librt_has_x86
 
 // FIXME: XFAIL the test because it is expected to return non-zero value.
 // XFAIL: *

@ilinpv ilinpv force-pushed the builtin_cpu_supports branch from 7e4c00d to 7ada935 Compare February 20, 2024 17:28
The patch complements llvm#68919
and adds AArch64 support for builtin
__builtin_cpu_supports("feature1+...+featureN")
which return true if all specified CPU features in argument are
detected. Also compiler-rt aarch64 native run tests for features
detection mechanism were added and 'cpu_model' check was fixed after its
refactor merged llvm#75635
Original RFC was https://reviews.llvm.org/D153153
End of line added.
@ilinpv ilinpv force-pushed the builtin_cpu_supports branch from 0cf1ec0 to cb16d3a Compare February 22, 2024 22:21
@ilinpv ilinpv merged commit 568baba into llvm:main Feb 22, 2024
@asmok-g
Copy link

asmok-g commented Feb 28, 2024

Hello,

This patch makes the second line fail to compile

#elif ABSL_HAVE_BUILTIN(__builtin_cpu_supports)
  if (__builtin_cpu_supports("avx2")) {

error: invalid cpu feature string for builtin when built on arm

@DanielKristofKiss
Copy link
Member

Hello,

This patch makes the second line fail to compile

#elif ABSL_HAVE_BUILTIN(__builtin_cpu_supports)
  if (__builtin_cpu_supports("avx2")) {

error: invalid cpu feature string for builtin when built on arm

expected to fail as __builtin_cpu_supports now supported on Arm too, something like this would be better:
#elif ABSL_HAVE_BUILTIN(__builtin_cpu_supports) && (defined(__x86_64__) || defined(__i386__)

@alexfh
Copy link
Contributor

alexfh commented Feb 28, 2024

This makes the clang diverge from the GCC behavior, for example, GCC happily compiles (void)__builtin_cpu_supports("neon"); on x86, but clang doesn't after this patch: https://gcc.godbolt.org/z/a367Pd7sE

@ilinpv
Copy link
Contributor Author

ilinpv commented Feb 28, 2024

@alexfh thanks for example, clang doesn't compile that on x86 before patch as well. It looks like a bug in GCC that it happily compiles the builtin above on x86.

@eaeltsin
Copy link
Contributor

@ilinpv the problem is that a lot of third party code out there might fail to compile now because it assumes GCC's behavior for invalid __builtin_cpu_supports. At least we can see many breakages in the third party that we use in our codebase.

By intentionally diverging from GCC this way you assume third party code owners will have to add architecture-specific guards around each use of __builtin_cpu_supports, which might be pretty disruptive.

Is there any document that specifies how builtins should behave when they are not supported/their parameter is not supported? I assume there is a reason behind failing the compilation in these cases, as not every builtin can simply answer "no", but I think this should be documented very explicitly.

@DanielKristofKiss
Copy link
Member

Backward\forward compatibility force us to support any parameter in __builtin_cpu_supports .
A given codebase and toolchain add support for a new feature and everything works but we can't assume all cases the toolchain is up to date where the project is used so an older version of the compiler needs to accept the new feature string when it doesn't know about it.
Also target and such a preprocessor checks will not help in this case.

@eaeltsin
Copy link
Contributor

Thanks @DanielKristofKiss - considering compilations with older toolchains is a good reason to choose the particular behavior. It would be very valuable if all these reasons and the selected behavior for unsupported builtins/parameters are documented, i.e. here - https://clang.llvm.org/docs/LanguageExtensions.html#builtin-functions

@bgra8
Copy link
Contributor

bgra8 commented Feb 29, 2024

@DanielKristofKiss do I understand correctly that you confirm the current behavior for __builtin_cpu_supports("unsupported-features") -- does not compile no matter the CPU -- is a clang defect?

If that is the case let's file a bug to have this fixed.

@DanielKristofKiss
Copy link
Member

@bgra8 Correct, submitted #83407.

@ilinpv
Copy link
Contributor Author

ilinpv commented Mar 1, 2024

Fix on review #83515
The documentation is coming next.

@eaeltsin
Copy link
Contributor

eaeltsin commented Mar 1, 2024

Thanks @ilinpv !

@ilinpv
Copy link
Contributor Author

ilinpv commented Mar 6, 2024

Documentation update #84098

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category compiler-rt:builtins compiler-rt
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants