Skip to content

[AArch64][SME] Warn when using a streaming builtin from a non-streaming function #75487

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 18, 2023

Conversation

SamTebbs33
Copy link
Collaborator

This PR adds a warning that's emitted when a non-streaming or non-streaming-compatible builtin is called in an unsuitable function.

Uses work by Kerry McLaughlin.

This is a re-upload of #74064 and fixes a compile time increase.

…ng function

This PR adds a warning that's emitted when a non-streaming or
non-streaming-compatible builtin is called in an unsuitable function.

Uses work by Kerry McLaughlin.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Dec 14, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 14, 2023

@llvm/pr-subscribers-clang

Author: Sam Tebbs (SamTebbs33)

Changes

This PR adds a warning that's emitted when a non-streaming or non-streaming-compatible builtin is called in an unsuitable function.

Uses work by Kerry McLaughlin.

This is a re-upload of #74064 and fixes a compile time increase.


Patch is 308.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/75487.diff

19 Files Affected:

  • (modified) clang/include/clang/Basic/CMakeLists.txt (+6)
  • (modified) clang/include/clang/Basic/arm_sve.td (+582-582)
  • (modified) clang/include/clang/Sema/Sema.h (+1)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+47-4)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i32.c (+8-8)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i64.c (+8-8)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za32.c (+7-7)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za64.c (+5-5)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za32.c (+7-7)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za64.c (+5-5)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c (+96-96)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_write.c (+96-96)
  • (modified) clang/test/Sema/aarch64-incompat-sm-builtin-calls.c (+77)
  • (modified) clang/test/Sema/aarch64-sme-intrinsics/acle_sme_imm.cpp (+7-7)
  • (modified) clang/test/Sema/aarch64-sme-intrinsics/acle_sme_target.c (+5-4)
  • (modified) clang/utils/TableGen/NeonEmitter.cpp (+28)
  • (modified) clang/utils/TableGen/SveEmitter.cpp (+81)
  • (modified) clang/utils/TableGen/TableGen.cpp (+12)
  • (modified) clang/utils/TableGen/TableGenBackends.h (+2)
diff --git a/clang/include/clang/Basic/CMakeLists.txt b/clang/include/clang/Basic/CMakeLists.txt
index 085e316fcc671d..73fd521aeeec31 100644
--- a/clang/include/clang/Basic/CMakeLists.txt
+++ b/clang/include/clang/Basic/CMakeLists.txt
@@ -88,6 +88,9 @@ clang_tablegen(arm_sve_typeflags.inc -gen-arm-sve-typeflags
 clang_tablegen(arm_sve_sema_rangechecks.inc -gen-arm-sve-sema-rangechecks
   SOURCE arm_sve.td
   TARGET ClangARMSveSemaRangeChecks)
+clang_tablegen(arm_sve_streaming_attrs.inc -gen-arm-sve-streaming-attrs
+  SOURCE arm_sve.td
+  TARGET ClangARMSveStreamingAttrs)
 clang_tablegen(arm_sme_builtins.inc -gen-arm-sme-builtins
   SOURCE arm_sme.td
   TARGET ClangARMSmeBuiltins)
@@ -97,6 +100,9 @@ clang_tablegen(arm_sme_builtin_cg.inc -gen-arm-sme-builtin-codegen
 clang_tablegen(arm_sme_sema_rangechecks.inc -gen-arm-sme-sema-rangechecks
   SOURCE arm_sme.td
   TARGET ClangARMSmeSemaRangeChecks)
+clang_tablegen(arm_sme_streaming_attrs.inc -gen-arm-sme-streaming-attrs
+  SOURCE arm_sme.td
+  TARGET ClangARMSmeStreamingAttrs)
 clang_tablegen(arm_cde_builtins.inc -gen-arm-cde-builtin-def
   SOURCE arm_cde.td
   TARGET ClangARMCdeBuiltinsDef)
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index db6f17d1c493af..3df77c931998f6 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -19,27 +19,27 @@ include "arm_sve_sme_incl.td"
 // Loads
 
 // Load one vector (scalar base)
-def SVLD1   : MInst<"svld1[_{2}]", "dPc", "csilUcUsUiUlhfd", [IsLoad],               MemEltTyDefault, "aarch64_sve_ld1">;
-def SVLD1SB : MInst<"svld1sb_{d}", "dPS", "silUsUiUl",       [IsLoad],               MemEltTyInt8,    "aarch64_sve_ld1">;
-def SVLD1UB : MInst<"svld1ub_{d}", "dPW", "silUsUiUl",       [IsLoad, IsZExtReturn], MemEltTyInt8,    "aarch64_sve_ld1">;
-def SVLD1SH : MInst<"svld1sh_{d}", "dPT", "ilUiUl",          [IsLoad],               MemEltTyInt16,   "aarch64_sve_ld1">;
-def SVLD1UH : MInst<"svld1uh_{d}", "dPX", "ilUiUl",          [IsLoad, IsZExtReturn], MemEltTyInt16,   "aarch64_sve_ld1">;
-def SVLD1SW : MInst<"svld1sw_{d}", "dPU", "lUl",             [IsLoad],               MemEltTyInt32,   "aarch64_sve_ld1">;
-def SVLD1UW : MInst<"svld1uw_{d}", "dPY", "lUl",             [IsLoad, IsZExtReturn], MemEltTyInt32,   "aarch64_sve_ld1">;
+def SVLD1   : MInst<"svld1[_{2}]", "dPc", "csilUcUsUiUlhfd", [IsLoad, IsStreamingCompatible],               MemEltTyDefault, "aarch64_sve_ld1">;
+def SVLD1SB : MInst<"svld1sb_{d}", "dPS", "silUsUiUl",       [IsLoad, IsStreamingCompatible],               MemEltTyInt8,    "aarch64_sve_ld1">;
+def SVLD1UB : MInst<"svld1ub_{d}", "dPW", "silUsUiUl",       [IsLoad, IsZExtReturn, IsStreamingCompatible], MemEltTyInt8,    "aarch64_sve_ld1">;
+def SVLD1SH : MInst<"svld1sh_{d}", "dPT", "ilUiUl",          [IsLoad, IsStreamingCompatible],               MemEltTyInt16,   "aarch64_sve_ld1">;
+def SVLD1UH : MInst<"svld1uh_{d}", "dPX", "ilUiUl",          [IsLoad, IsZExtReturn, IsStreamingCompatible], MemEltTyInt16,   "aarch64_sve_ld1">;
+def SVLD1SW : MInst<"svld1sw_{d}", "dPU", "lUl",             [IsLoad, IsStreamingCompatible],               MemEltTyInt32,   "aarch64_sve_ld1">;
+def SVLD1UW : MInst<"svld1uw_{d}", "dPY", "lUl",             [IsLoad, IsZExtReturn, IsStreamingCompatible], MemEltTyInt32,   "aarch64_sve_ld1">;
 
 let TargetGuard = "sve,bf16" in {
-  def SVLD1_BF      : MInst<"svld1[_{2}]",      "dPc",  "b", [IsLoad], MemEltTyDefault, "aarch64_sve_ld1">;
-  def SVLD1_VNUM_BF : MInst<"svld1_vnum[_{2}]", "dPcl", "b", [IsLoad], MemEltTyDefault, "aarch64_sve_ld1">;
+  def SVLD1_BF      : MInst<"svld1[_{2}]",      "dPc",  "b", [IsLoad, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_ld1">;
+  def SVLD1_VNUM_BF : MInst<"svld1_vnum[_{2}]", "dPcl", "b", [IsLoad, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_ld1">;
 }
 
 // Load one vector (scalar base, VL displacement)
-def SVLD1_VNUM   : MInst<"svld1_vnum[_{2}]", "dPcl", "csilUcUsUiUlhfd", [IsLoad],               MemEltTyDefault, "aarch64_sve_ld1">;
-def SVLD1SB_VNUM : MInst<"svld1sb_vnum_{d}", "dPSl", "silUsUiUl",       [IsLoad],               MemEltTyInt8,    "aarch64_sve_ld1">;
-def SVLD1UB_VNUM : MInst<"svld1ub_vnum_{d}", "dPWl", "silUsUiUl",       [IsLoad, IsZExtReturn], MemEltTyInt8,    "aarch64_sve_ld1">;
-def SVLD1SH_VNUM : MInst<"svld1sh_vnum_{d}", "dPTl", "ilUiUl",          [IsLoad],               MemEltTyInt16,   "aarch64_sve_ld1">;
-def SVLD1UH_VNUM : MInst<"svld1uh_vnum_{d}", "dPXl", "ilUiUl",          [IsLoad, IsZExtReturn], MemEltTyInt16,   "aarch64_sve_ld1">;
-def SVLD1SW_VNUM : MInst<"svld1sw_vnum_{d}", "dPUl", "lUl",             [IsLoad],               MemEltTyInt32,   "aarch64_sve_ld1">;
-def SVLD1UW_VNUM : MInst<"svld1uw_vnum_{d}", "dPYl", "lUl",             [IsLoad, IsZExtReturn], MemEltTyInt32,   "aarch64_sve_ld1">;
+def SVLD1_VNUM   : MInst<"svld1_vnum[_{2}]", "dPcl", "csilUcUsUiUlhfd", [IsLoad, IsStreamingCompatible],               MemEltTyDefault, "aarch64_sve_ld1">;
+def SVLD1SB_VNUM : MInst<"svld1sb_vnum_{d}", "dPSl", "silUsUiUl",       [IsLoad, IsStreamingCompatible],               MemEltTyInt8,    "aarch64_sve_ld1">;
+def SVLD1UB_VNUM : MInst<"svld1ub_vnum_{d}", "dPWl", "silUsUiUl",       [IsLoad, IsZExtReturn, IsStreamingCompatible], MemEltTyInt8,    "aarch64_sve_ld1">;
+def SVLD1SH_VNUM : MInst<"svld1sh_vnum_{d}", "dPTl", "ilUiUl",          [IsLoad, IsStreamingCompatible],               MemEltTyInt16,   "aarch64_sve_ld1">;
+def SVLD1UH_VNUM : MInst<"svld1uh_vnum_{d}", "dPXl", "ilUiUl",          [IsLoad, IsZExtReturn, IsStreamingCompatible], MemEltTyInt16,   "aarch64_sve_ld1">;
+def SVLD1SW_VNUM : MInst<"svld1sw_vnum_{d}", "dPUl", "lUl",             [IsLoad, IsStreamingCompatible],               MemEltTyInt32,   "aarch64_sve_ld1">;
+def SVLD1UW_VNUM : MInst<"svld1uw_vnum_{d}", "dPYl", "lUl",             [IsLoad, IsZExtReturn, IsStreamingCompatible], MemEltTyInt32,   "aarch64_sve_ld1">;
 
 // Load one vector (vector base)
 def SVLD1_GATHER_BASES_U   : MInst<"svld1_gather[_{2}base]_{d}",   "dPu", "ilUiUlfd", [IsGatherLoad],               MemEltTyDefault, "aarch64_sve_ld1_gather_scalar_offset">;
@@ -243,27 +243,27 @@ let TargetGuard = "sve,bf16" in {
 }
 
 // Load one vector, unextended load, non-temporal (scalar base)
-def SVLDNT1 : MInst<"svldnt1[_{2}]", "dPc", "csilUcUsUiUlhfd", [IsLoad], MemEltTyDefault, "aarch64_sve_ldnt1">;
+def SVLDNT1 : MInst<"svldnt1[_{2}]", "dPc", "csilUcUsUiUlhfd", [IsLoad, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_ldnt1">;
 
 // Load one vector, unextended load, non-temporal (scalar base, VL displacement)
-def SVLDNT1_VNUM : MInst<"svldnt1_vnum[_{2}]", "dPcl", "csilUcUsUiUlhfd", [IsLoad], MemEltTyDefault, "aarch64_sve_ldnt1">;
+def SVLDNT1_VNUM : MInst<"svldnt1_vnum[_{2}]", "dPcl", "csilUcUsUiUlhfd", [IsLoad, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_ldnt1">;
 
 let TargetGuard = "sve,bf16" in {
-  def SVLDNT1_BF      : MInst<"svldnt1[_{2}]",      "dPc",  "b", [IsLoad], MemEltTyDefault, "aarch64_sve_ldnt1">;
-  def SVLDNT1_VNUM_BF : MInst<"svldnt1_vnum[_{2}]", "dPcl", "b", [IsLoad], MemEltTyDefault, "aarch64_sve_ldnt1">;
+  def SVLDNT1_BF      : MInst<"svldnt1[_{2}]",      "dPc",  "b", [IsLoad, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_ldnt1">;
+  def SVLDNT1_VNUM_BF : MInst<"svldnt1_vnum[_{2}]", "dPcl", "b", [IsLoad, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_ldnt1">;
 }
 
 // Load one quadword and replicate (scalar base)
-def SVLD1RQ : SInst<"svld1rq[_{2}]", "dPc", "csilUcUsUiUlhfd", MergeNone, "aarch64_sve_ld1rq">;
+def SVLD1RQ : SInst<"svld1rq[_{2}]", "dPc", "csilUcUsUiUlhfd", MergeNone, "aarch64_sve_ld1rq", [IsStreamingCompatible]>;
 
 let TargetGuard = "sve,bf16" in {
-  def SVLD1RQ_BF : SInst<"svld1rq[_{2}]", "dPc",  "b", MergeNone, "aarch64_sve_ld1rq">;
+  def SVLD1RQ_BF : SInst<"svld1rq[_{2}]", "dPc",  "b", MergeNone, "aarch64_sve_ld1rq", [IsStreamingCompatible]>;
 }
 
 multiclass StructLoad<string name, string proto, string i> {
-  def : SInst<name, proto, "csilUcUsUiUlhfd", MergeNone, i, [IsStructLoad]>;
+  def : SInst<name, proto, "csilUcUsUiUlhfd", MergeNone, i, [IsStructLoad, IsStreamingCompatible]>;
   let TargetGuard = "sve,bf16" in {
-    def: SInst<name, proto, "b", MergeNone, i, [IsStructLoad]>;
+    def: SInst<name, proto, "b", MergeNone, i, [IsStructLoad, IsStreamingCompatible]>;
   }
 }
 
@@ -286,16 +286,16 @@ let TargetGuard = "sve,f64mm,bf16" in {
 }
 
 let TargetGuard = "sve,bf16" in {
-  def SVBFDOT        : SInst<"svbfdot[_{0}]",        "MMdd",  "b", MergeNone, "aarch64_sve_bfdot",        [IsOverloadNone]>;
-  def SVBFMLALB      : SInst<"svbfmlalb[_{0}]",      "MMdd",  "b", MergeNone, "aarch64_sve_bfmlalb",      [IsOverloadNone]>;
-  def SVBFMLALT      : SInst<"svbfmlalt[_{0}]",      "MMdd",  "b", MergeNone, "aarch64_sve_bfmlalt",      [IsOverloadNone]>;
-  def SVBFMMLA       : SInst<"svbfmmla[_{0}]",       "MMdd",  "b", MergeNone, "aarch64_sve_bfmmla",       [IsOverloadNone]>;
-  def SVBFDOT_N      : SInst<"svbfdot[_n_{0}]",      "MMda",  "b", MergeNone, "aarch64_sve_bfdot",        [IsOverloadNone]>;
-  def SVBFMLAL_N     : SInst<"svbfmlalb[_n_{0}]",    "MMda",  "b", MergeNone, "aarch64_sve_bfmlalb",      [IsOverloadNone]>;
-  def SVBFMLALT_N    : SInst<"svbfmlalt[_n_{0}]",    "MMda",  "b", MergeNone, "aarch64_sve_bfmlalt",      [IsOverloadNone]>;
-  def SVBFDOT_LANE   : SInst<"svbfdot_lane[_{0}]",   "MMddi", "b", MergeNone, "aarch64_sve_bfdot_lane_v2",   [IsOverloadNone], [ImmCheck<3, ImmCheck0_3>]>;
-  def SVBFMLALB_LANE : SInst<"svbfmlalb_lane[_{0}]", "MMddi", "b", MergeNone, "aarch64_sve_bfmlalb_lane_v2", [IsOverloadNone], [ImmCheck<3, ImmCheck0_7>]>;
-  def SVBFMLALT_LANE : SInst<"svbfmlalt_lane[_{0}]", "MMddi", "b", MergeNone, "aarch64_sve_bfmlalt_lane_v2", [IsOverloadNone], [ImmCheck<3, ImmCheck0_7>]>;
+  def SVBFDOT        : SInst<"svbfdot[_{0}]",        "MMdd",  "b", MergeNone, "aarch64_sve_bfdot",        [IsOverloadNone, IsStreamingCompatible]>;
+  def SVBFMLALB      : SInst<"svbfmlalb[_{0}]",      "MMdd",  "b", MergeNone, "aarch64_sve_bfmlalb",      [IsOverloadNone, IsStreamingCompatible]>;
+  def SVBFMLALT      : SInst<"svbfmlalt[_{0}]",      "MMdd",  "b", MergeNone, "aarch64_sve_bfmlalt",      [IsOverloadNone, IsStreamingCompatible]>;
+  def SVBFMMLA       : SInst<"svbfmmla[_{0}]",       "MMdd",  "b", MergeNone, "aarch64_sve_bfmmla",       [IsOverloadNone, IsStreamingCompatible]>;
+  def SVBFDOT_N      : SInst<"svbfdot[_n_{0}]",      "MMda",  "b", MergeNone, "aarch64_sve_bfdot",        [IsOverloadNone, IsStreamingCompatible]>;
+  def SVBFMLAL_N     : SInst<"svbfmlalb[_n_{0}]",    "MMda",  "b", MergeNone, "aarch64_sve_bfmlalb",      [IsOverloadNone, IsStreamingCompatible]>;
+  def SVBFMLALT_N    : SInst<"svbfmlalt[_n_{0}]",    "MMda",  "b", MergeNone, "aarch64_sve_bfmlalt",      [IsOverloadNone, IsStreamingCompatible]>;
+  def SVBFDOT_LANE   : SInst<"svbfdot_lane[_{0}]",   "MMddi", "b", MergeNone, "aarch64_sve_bfdot_lane_v2",   [IsOverloadNone, IsStreamingCompatible], [ImmCheck<3, ImmCheck0_3>]>;
+  def SVBFMLALB_LANE : SInst<"svbfmlalb_lane[_{0}]", "MMddi", "b", MergeNone, "aarch64_sve_bfmlalb_lane_v2", [IsOverloadNone, IsStreamingCompatible], [ImmCheck<3, ImmCheck0_7>]>;
+  def SVBFMLALT_LANE : SInst<"svbfmlalt_lane[_{0}]", "MMddi", "b", MergeNone, "aarch64_sve_bfmlalt_lane_v2", [IsOverloadNone, IsStreamingCompatible], [ImmCheck<3, ImmCheck0_7>]>;
 }
 
 let TargetGuard = "sve2p1" in {
@@ -334,26 +334,26 @@ let TargetGuard = "sve2p1" in {
 // Stores
 
 // Store one vector (scalar base)
-def SVST1    : MInst<"svst1[_{d}]",  "vPpd", "csilUcUsUiUlhfd", [IsStore], MemEltTyDefault, "aarch64_sve_st1">;
-def SVST1B_S : MInst<"svst1b[_{d}]", "vPAd", "sil",             [IsStore], MemEltTyInt8,    "aarch64_sve_st1">;
-def SVST1B_U : MInst<"svst1b[_{d}]", "vPEd", "UsUiUl",          [IsStore], MemEltTyInt8,    "aarch64_sve_st1">;
-def SVST1H_S : MInst<"svst1h[_{d}]", "vPBd", "il",              [IsStore], MemEltTyInt16,   "aarch64_sve_st1">;
-def SVST1H_U : MInst<"svst1h[_{d}]", "vPFd", "UiUl",            [IsStore], MemEltTyInt16,   "aarch64_sve_st1">;
-def SVST1W_S : MInst<"svst1w[_{d}]", "vPCd", "l",               [IsStore], MemEltTyInt32,   "aarch64_sve_st1">;
-def SVST1W_U : MInst<"svst1w[_{d}]", "vPGd", "Ul",              [IsStore], MemEltTyInt32,   "aarch64_sve_st1">;
+def SVST1    : MInst<"svst1[_{d}]",  "vPpd", "csilUcUsUiUlhfd", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_st1">;
+def SVST1B_S : MInst<"svst1b[_{d}]", "vPAd", "sil",             [IsStore, IsStreamingCompatible], MemEltTyInt8,    "aarch64_sve_st1">;
+def SVST1B_U : MInst<"svst1b[_{d}]", "vPEd", "UsUiUl",          [IsStore, IsStreamingCompatible], MemEltTyInt8,    "aarch64_sve_st1">;
+def SVST1H_S : MInst<"svst1h[_{d}]", "vPBd", "il",              [IsStore, IsStreamingCompatible], MemEltTyInt16,   "aarch64_sve_st1">;
+def SVST1H_U : MInst<"svst1h[_{d}]", "vPFd", "UiUl",            [IsStore, IsStreamingCompatible], MemEltTyInt16,   "aarch64_sve_st1">;
+def SVST1W_S : MInst<"svst1w[_{d}]", "vPCd", "l",               [IsStore, IsStreamingCompatible], MemEltTyInt32,   "aarch64_sve_st1">;
+def SVST1W_U : MInst<"svst1w[_{d}]", "vPGd", "Ul",              [IsStore, IsStreamingCompatible], MemEltTyInt32,   "aarch64_sve_st1">;
 
 // Store one vector (scalar base, VL displacement)
-def SVST1_VNUM    : MInst<"svst1_vnum[_{d}]",  "vPpld", "csilUcUsUiUlhfd", [IsStore], MemEltTyDefault, "aarch64_sve_st1">;
-def SVST1B_VNUM_S : MInst<"svst1b_vnum[_{d}]", "vPAld", "sil",             [IsStore], MemEltTyInt8,    "aarch64_sve_st1">;
-def SVST1B_VNUM_U : MInst<"svst1b_vnum[_{d}]", "vPEld", "UsUiUl",          [IsStore], MemEltTyInt8,    "aarch64_sve_st1">;
-def SVST1H_VNUM_S : MInst<"svst1h_vnum[_{d}]", "vPBld", "il",              [IsStore], MemEltTyInt16,   "aarch64_sve_st1">;
-def SVST1H_VNUM_U : MInst<"svst1h_vnum[_{d}]", "vPFld", "UiUl",            [IsStore], MemEltTyInt16,   "aarch64_sve_st1">;
-def SVST1W_VNUM_S : MInst<"svst1w_vnum[_{d}]", "vPCld", "l",               [IsStore], MemEltTyInt32,   "aarch64_sve_st1">;
-def SVST1W_VNUM_U : MInst<"svst1w_vnum[_{d}]", "vPGld", "Ul",              [IsStore], MemEltTyInt32,   "aarch64_sve_st1">;
+def SVST1_VNUM    : MInst<"svst1_vnum[_{d}]",  "vPpld", "csilUcUsUiUlhfd", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_st1">;
+def SVST1B_VNUM_S : MInst<"svst1b_vnum[_{d}]", "vPAld", "sil",             [IsStore, IsStreamingCompatible], MemEltTyInt8,    "aarch64_sve_st1">;
+def SVST1B_VNUM_U : MInst<"svst1b_vnum[_{d}]", "vPEld", "UsUiUl",          [IsStore, IsStreamingCompatible], MemEltTyInt8,    "aarch64_sve_st1">;
+def SVST1H_VNUM_S : MInst<"svst1h_vnum[_{d}]", "vPBld", "il",              [IsStore, IsStreamingCompatible], MemEltTyInt16,   "aarch64_sve_st1">;
+def SVST1H_VNUM_U : MInst<"svst1h_vnum[_{d}]", "vPFld", "UiUl",            [IsStore, IsStreamingCompatible], MemEltTyInt16,   "aarch64_sve_st1">;
+def SVST1W_VNUM_S : MInst<"svst1w_vnum[_{d}]", "vPCld", "l",               [IsStore, IsStreamingCompatible], MemEltTyInt32,   "aarch64_sve_st1">;
+def SVST1W_VNUM_U : MInst<"svst1w_vnum[_{d}]", "vPGld", "Ul",              [IsStore, IsStreamingCompatible], MemEltTyInt32,   "aarch64_sve_st1">;
 
 let TargetGuard = "sve,bf16" in {
-  def SVST1_BF      : MInst<"svst1[_{d}]",      "vPpd",  "b", [IsStore], MemEltTyDefault, "aarch64_sve_st1">;
-  def SVST1_VNUM_BF : MInst<"svst1_vnum[_{d}]", "vPpld", "b", [IsStore], MemEltTyDefault, "aarch64_sve_st1">;
+  def SVST1_BF      : MInst<"svst1[_{d}]",      "vPpd",  "b", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_st1">;
+  def SVST1_VNUM_BF : MInst<"svst1_vnum[_{d}]", "vPpld", "b", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_st1">;
 }
 
 // Store one vector (vector base)
@@ -426,9 +426,9 @@ def SVST1H_SCATTER_INDEX_S    : MInst<"svst1h_scatter[_{2}base]_index[_{d}]", "v
 def SVST1W_SCATTER_INDEX_S    : MInst<"svst1w_scatter[_{2}base]_index[_{d}]", "vPuld", "lUl",      [IsScatterStore], MemEltTyInt32,   "aarch64_sve_st1_scatter_scalar_offset">;
 
 multiclass StructStore<string name, string proto, string i> {
-  def : SInst<name, proto, "csilUcUsUiUlhfd", MergeNone, i, [IsStructStore]>;
+  def : SInst<name, proto, "csilUcUsUiUlhfd", MergeNone, i, [IsStructStore, IsStreamingCompatible]>;
   let TargetGuard = "sve,bf16" in {
-    def: SInst<name, proto, "b", MergeNone, i, [IsStructStore]>;
+    def: SInst<name, proto, "b", MergeNone, i, [IsStructStore, IsStreamingCompatible]>;
   }
 }
 // Store N vectors into N-element structure (scalar base)
@@ -442,14 +442,14 @@ defm SVST3_VNUM : StructStore<"svst3_vnum[_{d}]", "vPpl3", "aarch64_sve_st3">;
 defm SVST4_VNUM : StructStore<"svst4_vnum[_{d}]", "vPpl4", "aarch64_sve_st4">;
 
 // Store one vector, with no truncation, non-temporal (scalar base)
-def SVSTNT1 : MInst<"svstnt1[_{d}]", "vPpd", "csilUcUsUiUlhfd", [IsStore], MemEltTyDefault, "aarch64_sve_stnt1">;
+def SVSTNT1 : MInst<"svstnt1[_{d}]", "vPpd", "csilUcUsUiUlhfd", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_stnt1">;
 
 // Store one vector, with no truncation, non-temporal (scalar base, VL displacement)
-def SVSTNT1_VNUM : MInst<"svstnt1_vnum[_{d}]", "vPpld", "csilUcUsUiUlhfd", [IsStore], MemEltTyDefault, "aarch64_sve_stnt1">;
+def SVSTNT1_VNUM : MInst<"svstnt1_vnum[_{d}]", "vPpld", "csilUcUsUiUlhfd", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_stnt1">;
 
 let TargetGuard = "sve,bf16" in {
-  def SVSTNT1_BF      : MInst<"svstnt1[_{d}]",      "vPpd",  "b", [IsStore], MemEltTyDefault, "aarch64_sve_stnt1">;
-  def SVSTNT1_VNUM_BF : MInst<"svstnt1_vnum[_{d}]", "vPpld", "b", [IsStore], MemEltTyDefault, "aarch64_sve_stnt1">;
+  def SVSTNT1_BF      : MInst<"svstnt1[_{d}]",      "vPpd",  "b", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_stnt1">;
+  def SVSTNT1_VNUM_BF : MInst<"svstnt1_vnum[_{d}]", "vPpld", "b", [IsStore, IsStreamingCompatible], MemEltTyDefault, "aarch64_sve_stnt1">;
 }
 
 let TargetGuard = "sve2p1" in {
@@ -488,16 +488,16 @@ let TargetGuard = "sve2p1" in {
 // Prefetches
 
 // Prefetch (Scalar base)
-def SVPRFB : MInst<"svprfb", "vPQJ", "c", [IsPrefetch], MemEltTyInt8,  "aarch64_sve_prf">;
-def SVPRFH : MInst<"svprfh", "vPQJ", "s", [IsPrefetch], MemEltTyInt16, "aarch64_sve_prf">;
-def SVPRFW : MInst<"svprfw", "vPQJ", "i", [IsPrefetch], MemEltTyInt32, "aarch64_sve_prf">;
-def SVPRFD : MInst<"svprfd", "vPQJ", "l", [IsPrefetch], MemEltTyInt64, "aarch64_sve_prf">;
+def SVPRFB : MInst<"svprfb", "vPQJ", "c", [IsPrefetch, IsStreamingCompatible], MemEltTyInt8,  "aarch64_sve_prf">;
+def SVPRFH : MInst<"svprfh", "vPQJ", "s", [IsPrefetch, IsStreamingCompatible], MemEltTyInt16, "aarch64_sve_prf">;
+def SVPRFW : MInst<"svprfw", "vPQJ", "i", [IsPrefetch, IsStreamingCompatible], MemEltTyInt32, "aarch64_sve_prf">;
+def SVPRFD : MInst<"svprfd", "vPQJ", "l", [IsPrefetch, IsStreamingCompatible], MemEltTyInt64, "aarch64_sve_prf">;
 
 // Prefetch (Scalar base, VL displacement)
-def SVPRFB_VNUM : MInst<"svprfb_vnum", "vPQlJ", "c", [IsPrefetch], MemEltTyInt8,  "aarch64_sve_prf">;
-def SVPRFH_VNUM : MInst<"svprfh_vnum", "vPQlJ", "s", [IsPrefetch], MemEltTyInt16, "aarch64_sve_prf">;
-def SVPRFW_VNUM : MInst<"svprfw_vnum", "vPQlJ", "i", [IsPrefetch], MemEltTyInt32, "aarch64_sve_prf">;
-def SVPRFD_VNUM : MInst<"svprfd_vnum", "vPQlJ", "l", [IsPrefetch], MemEltTyInt64, "aarch64_sve_prf">;
+def SVPRFB_VNUM : MInst<"svprfb_vnum", "vPQlJ", "c", [IsPrefetch, IsStreamingCompatible], MemEltTyInt8,  "aarch64_sve_prf">;
+def SVPRFH_VNUM : MInst<"svprfh_vnum", "vPQlJ", "s", [IsPrefetch, IsStreamingCompatible], MemEltTyInt16, "aarch64_sve_prf">;
+def SVPRFW_VNUM : MInst<"svprfw_vnum", "vPQlJ", "i", [IsPrefetch, IsStreamingCompatible], MemEltTyInt32, "aarch64_sve_prf">;
+def SVPRFD_VNUM : MInst<"svprfd_vnum", "vPQlJ", "l", [IsPrefetch, IsStreamingCompatible], MemEltTyInt64, "aarch64_sve_prf">;
 
 // Prefetch (Vector bases)
 def SVPRFB_GATHER_BASES : MInst<"svprfb_gather[_{2}base]", "vPdJ", "UiUl", [IsGatherPrefetch], MemEltTyInt8,  "aarch64_sve_prfb_gather_scalar_offset">;
@@ -552,9 +552,9 @@ def SVDUPQ_32 : SInst<"svdupq[_n]_{d}", "dssss",  "iUif", MergeNone>;
 def SVDUPQ_64 : SInst<"svdupq[_n]_{d}", "dss",  "lUld", MergeNone>;
 
 multiclass svdup_base<string n, string p, MergeType mt, string i> {
-  def NAME : SInst<n, p, "csilUcUsUiUlhfd", mt, i>;
+  def NAME : SInst<n, p, "csilUcUsU...
[truncated]

@SamTebbs33
Copy link
Collaborator Author

SamTebbs33 commented Dec 14, 2023

Commit 6dc8fa2 is the only one with new content. d9a8da1 is the exact same as what was reverted. The compile time fix reduces the size of the generated file from 623kb to 99kb.

Copy link

github-actions bot commented Dec 14, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 2952bc3384412ca67fd1dcd2eac595088d692802 99b2d00b07b15832c8b0ed5a2523fa69624bd098 -- clang/include/clang/Sema/Sema.h clang/lib/Sema/SemaChecking.cpp clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i32.c clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i64.c clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za32.c clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za64.c clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za32.c clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za64.c clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_write.c clang/test/Sema/aarch64-incompat-sm-builtin-calls.c clang/test/Sema/aarch64-sme-intrinsics/acle_sme_imm.cpp clang/test/Sema/aarch64-sme-intrinsics/acle_sme_target.c clang/utils/TableGen/NeonEmitter.cpp clang/utils/TableGen/SveEmitter.cpp clang/utils/TableGen/TableGen.cpp clang/utils/TableGen/TableGenBackends.h
View the diff from clang-format here.
diff --git a/clang/utils/TableGen/TableGen.cpp b/clang/utils/TableGen/TableGen.cpp
index 9043d90d7c..24f4676557 100644
--- a/clang/utils/TableGen/TableGen.cpp
+++ b/clang/utils/TableGen/TableGen.cpp
@@ -288,11 +288,14 @@ cl::opt<ActionType> Action(
                    "Generate riscv_vector_builtin_cg.inc for clang"),
         clEnumValN(GenRISCVVectorBuiltinSema, "gen-riscv-vector-builtin-sema",
                    "Generate riscv_vector_builtin_sema.inc for clang"),
-        clEnumValN(GenRISCVSiFiveVectorBuiltins, "gen-riscv-sifive-vector-builtins",
+        clEnumValN(GenRISCVSiFiveVectorBuiltins,
+                   "gen-riscv-sifive-vector-builtins",
                    "Generate riscv_sifive_vector_builtins.inc for clang"),
-        clEnumValN(GenRISCVSiFiveVectorBuiltinCG, "gen-riscv-sifive-vector-builtin-codegen",
+        clEnumValN(GenRISCVSiFiveVectorBuiltinCG,
+                   "gen-riscv-sifive-vector-builtin-codegen",
                    "Generate riscv_sifive_vector_builtin_cg.inc for clang"),
-        clEnumValN(GenRISCVSiFiveVectorBuiltinSema, "gen-riscv-sifive-vector-builtin-sema",
+        clEnumValN(GenRISCVSiFiveVectorBuiltinSema,
+                   "gen-riscv-sifive-vector-builtin-sema",
                    "Generate riscv_sifive_vector_builtin_sema.inc for clang"),
         clEnumValN(GenAttrDocs, "gen-attr-docs",
                    "Generate attribute documentation"),

@nico
Copy link
Contributor

nico commented Dec 14, 2023

If it's not too much trouble, could you git cherry-pick c60663d128f8e0dccd418bdf16ecc403b96aa74a into this? (Cool if not, ofc.)

@SamTebbs33
Copy link
Collaborator Author

If it's not too much trouble, could you git cherry-pick c60663d128f8e0dccd418bdf16ecc403b96aa74a into this? (Cool if not, ofc.)

Sure, I can do that. Would you also mind attempting to reproduce the compile time with the latest commit, just to make sure it fixes the issue on your end? I saw a 36 -> 31 second (compared to 30 on main) decrease which seems OK but isn't on the same scale as the numbers you're seeing.

@sdesmalen-arm
Copy link
Collaborator

If it's not too much trouble, could you git cherry-pick c60663d128f8e0dccd418bdf16ecc403b96aa74a into this? (Cool if not, ofc.)

Sure, I can do that. Would you also mind attempting to reproduce the compile time with the latest commit, just to make sure it fixes the issue on your end? I saw a 36 -> 31 second (compared to 30 on main) decrease which seems OK but isn't on the same scale as the numbers you're seeing.

FWIW, on my machine (using -O3) compile time of SemaChecking.cpp goes from 25s (before this patch) -> 25.4s (after this patch), whereas before I saw a similar order of magnitude slowdown as @nico.

Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the tests pass after changing BuiltinType to a std::optional, the patch looks fine to me.

@SamTebbs33 SamTebbs33 merged commit 945c645 into llvm:main Dec 18, 2023
@SamTebbs33 SamTebbs33 deleted the sme-attrs branch December 18, 2023 09:32
@SamTebbs33
Copy link
Collaborator Author

@nico I tried to cherry-pick your commit (c60663d) into my branch but for some reason it was always empty. It might be best for you to try recommitting it now.

@nikic
Copy link
Contributor

nikic commented Dec 18, 2023

When building clang with clang, the regression on SemaChecking.cpp is now "only" 60% in terms of instructions retired (plus 0.4% during thin link, which is another ~50% in terms of SemaChecking).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants