Skip to content

[cuda][HIP] __constant__ should imply constant #110182

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Sep 29, 2024

Conversation

AlexVlx
Copy link
Contributor

@AlexVlx AlexVlx commented Sep 26, 2024

Currently, __constant__ variables do not get unconditionally marked as constant in IR, which seems a bit odd given their definition. This is generally inconsequential for NVPTX/AMDGPU, since said variables get emitted in the constant address space for those BEs. However, it is potentially significant for e.g. HIP-on-SPIR-V cases, as SPIR-V does not allow casts to/from the constant AS (UniformConstant), which forces __constant__ variables to be emitted in the global AS, thus making IR constness meaningful.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU labels Sep 26, 2024
@AlexVlx AlexVlx requested a review from yxsamliu September 26, 2024 22:30
@llvmbot llvmbot added the clang:codegen IR generation bugs: mangling, exceptions, etc. label Sep 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 26, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Alex Voicu (AlexVlx)

Changes

Currently, __constant__ variables do not get unconditionally marked as constant in IR, which seems a bit odd given their definition. This is generally inconsequential for NVPTX/AMDGPU, since said variables get emitted in the constant address space for those BEs. However, it is potentially significant for e.g. HIP-on-SPIR-V cases, as SPIR-V does not allow casts to/from the constant AS (UniformConstant), which forces __constant__ variables to be emitted in the global AS, thus making IR constness meaningful.


Full diff: https://github.com/llvm/llvm-project/pull/110182.diff

11 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+3-2)
  • (modified) clang/test/CodeGenCUDA/address-spaces.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/amdgpu-visibility.cu (+3-3)
  • (modified) clang/test/CodeGenCUDA/anon-ns.cu (+2-2)
  • (modified) clang/test/CodeGenCUDA/device-var-init.cu (+12-12)
  • (modified) clang/test/CodeGenCUDA/device-var-linkage.cu (+4-4)
  • (modified) clang/test/CodeGenCUDA/filter-decl.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/static-device-var-no-rdc.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/static-device-var-rdc.cu (+2-2)
  • (modified) clang/test/CodeGenCUDA/template-class-static-member.cu (+1-1)
  • (modified) clang/test/CodeGenHIP/hipspv-addr-spaces.cpp (+1-1)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 2381fa93e23fea..25c1c496a4f27f 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -5622,8 +5622,9 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D,
     emitter->finalize(GV);
 
   // If it is safe to mark the global 'constant', do so now.
-  GV->setConstant(!NeedsGlobalCtor && !NeedsGlobalDtor &&
-                  D->getType().isConstantStorage(getContext(), true, true));
+  GV->setConstant((D->hasAttr<CUDAConstantAttr>() && LangOpts.CUDAIsDevice) ||
+                  (!NeedsGlobalCtor && !NeedsGlobalDtor &&
+                   D->getType().isConstantStorage(getContext(), true, true)));
 
   // If it is in a read-only section, mark it 'constant'.
   if (const SectionAttr *SA = D->getAttr<SectionAttr>()) {
diff --git a/clang/test/CodeGenCUDA/address-spaces.cu b/clang/test/CodeGenCUDA/address-spaces.cu
index 0608c9cabd0489..66903c81b93339 100644
--- a/clang/test/CodeGenCUDA/address-spaces.cu
+++ b/clang/test/CodeGenCUDA/address-spaces.cu
@@ -9,7 +9,7 @@
 // CHECK: @i ={{.*}} addrspace(1) externally_initialized global
 __device__ int i;
 
-// CHECK: @j ={{.*}} addrspace(4) externally_initialized global
+// CHECK: @j ={{.*}} addrspace(4) externally_initialized constant
 __constant__ int j;
 
 // CHECK: @k ={{.*}} addrspace(3) global
diff --git a/clang/test/CodeGenCUDA/amdgpu-visibility.cu b/clang/test/CodeGenCUDA/amdgpu-visibility.cu
index d7dbab112a68c6..ef74d932ee8c8f 100644
--- a/clang/test/CodeGenCUDA/amdgpu-visibility.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-visibility.cu
@@ -4,11 +4,11 @@
 
 #include "Inputs/cuda.h"
 
-// CHECK-DEFAULT: @c ={{.*}} addrspace(4) externally_initialized global
+// CHECK-DEFAULT: @c ={{.*}} addrspace(4) externally_initialized constant
 // CHECK-DEFAULT: @g ={{.*}} addrspace(1) externally_initialized global
-// CHECK-PROTECTED: @c = protected addrspace(4) externally_initialized global
+// CHECK-PROTECTED: @c = protected addrspace(4) externally_initialized constant
 // CHECK-PROTECTED: @g = protected addrspace(1) externally_initialized global
-// CHECK-HIDDEN: @c = protected addrspace(4) externally_initialized global
+// CHECK-HIDDEN: @c = protected addrspace(4) externally_initialized constant
 // CHECK-HIDDEN: @g = protected addrspace(1) externally_initialized global
 __constant__ int c;
 __device__ int g;
diff --git a/clang/test/CodeGenCUDA/anon-ns.cu b/clang/test/CodeGenCUDA/anon-ns.cu
index 3c55e9907dd6c1..d931f31d0207c5 100644
--- a/clang/test/CodeGenCUDA/anon-ns.cu
+++ b/clang/test/CodeGenCUDA/anon-ns.cu
@@ -28,13 +28,13 @@
 // HIP-DAG: define weak_odr {{.*}}void @[[KTX:_Z2ktIN12_GLOBAL__N_11XEEvT_\.intern\.b04fd23c98500190]](
 // HIP-DAG: define weak_odr {{.*}}void @[[KTL:_Z2ktIN12_GLOBAL__N_1UlvE_EEvT_\.intern\.b04fd23c98500190]](
 // HIP-DAG: @[[VM:_ZN12_GLOBAL__N_12vmE\.static\.b04fd23c98500190]] = addrspace(1) externally_initialized global
-// HIP-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE\.static\.b04fd23c98500190]] = addrspace(4) externally_initialized global
+// HIP-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE\.static\.b04fd23c98500190]] = addrspace(4) externally_initialized constant
 // HIP-DAG: @[[VT:_Z2vtIN12_GLOBAL__N_11XEE\.static\.b04fd23c98500190]] = addrspace(1) externally_initialized global
 
 // CUDA-DAG: define weak_odr {{.*}}void @[[KERN:_ZN12_GLOBAL__N_16kernelEv__intern__b04fd23c98500190]](
 // CUDA-DAG: define weak_odr {{.*}}void @[[KTX:_Z2ktIN12_GLOBAL__N_11XEEvT___intern__b04fd23c98500190]](
 // CUDA-DAG: define weak_odr {{.*}}void @[[KTL:_Z2ktIN12_GLOBAL__N_1UlvE_EEvT___intern__b04fd23c98500190]](
-// CUDA-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE__static__b04fd23c98500190]] = addrspace(4) externally_initialized global
+// CUDA-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE__static__b04fd23c98500190]] = addrspace(4) externally_initialized constant
 // CUDA-DAG: @[[VT:_Z2vtIN12_GLOBAL__N_11XEE__static__b04fd23c98500190]] = addrspace(1) externally_initialized global
 
 // COMMON-DAG: @_ZN12_GLOBAL__N_12vdE = internal addrspace(1) global
diff --git a/clang/test/CodeGenCUDA/device-var-init.cu b/clang/test/CodeGenCUDA/device-var-init.cu
index 226b7e295f4b45..9d62e4126b430d 100644
--- a/clang/test/CodeGenCUDA/device-var-init.cu
+++ b/clang/test/CodeGenCUDA/device-var-init.cu
@@ -26,7 +26,7 @@ __shared__ int s_v;
 // DEVICE: @s_v ={{.*}} addrspace(3) global i32 undef,
 // HOST:   @s_v = internal global i32 undef,
 __constant__ int c_v;
-// DEVICE: addrspace(4) externally_initialized global i32 0,
+// DEVICE: addrspace(4) externally_initialized constant i32 0,
 // HOST:   @c_v = internal global i32 undef,
 
 __device__ int d_v_i = 1;
@@ -51,14 +51,14 @@ __shared__ T s_t;
 // DEVICE: @s_t ={{.*}} addrspace(3) global %struct.T undef,
 // HOST:   @s_t = internal global %struct.T undef,
 __constant__ T c_t;
-// DEVICE: @c_t ={{.*}} addrspace(4) externally_initialized global %struct.T zeroinitializer,
+// DEVICE: @c_t ={{.*}} addrspace(4) externally_initialized constant %struct.T zeroinitializer,
 // HOST:   @c_t = internal global %struct.T undef,
 
 __device__ T d_t_i = {2};
 // DEVICE: @d_t_i ={{.*}} addrspace(1) externally_initialized global %struct.T { i32 2 },
 // HOST:   @d_t_i = internal global %struct.T undef,
 __constant__ T c_t_i = {2};
-// DEVICE: @c_t_i ={{.*}} addrspace(4) externally_initialized global %struct.T { i32 2 },
+// DEVICE: @c_t_i ={{.*}} addrspace(4) externally_initialized constant %struct.T { i32 2 },
 // HOST:   @c_t_i = internal global %struct.T undef,
 
 // empty constructor
@@ -69,7 +69,7 @@ __shared__ EC s_ec;
 // DEVICE: @s_ec ={{.*}} addrspace(3) global %struct.EC undef,
 // HOST:   @s_ec = internal global %struct.EC undef,
 __constant__ EC c_ec;
-// DEVICE: @c_ec ={{.*}} addrspace(4) externally_initialized global %struct.EC zeroinitializer,
+// DEVICE: @c_ec ={{.*}} addrspace(4) externally_initialized constant %struct.EC zeroinitializer,
 // HOST:   @c_ec = internal global %struct.EC undef
 
 // empty destructor
@@ -80,7 +80,7 @@ __shared__ ED s_ed;
 // DEVICE: @s_ed ={{.*}} addrspace(3) global %struct.ED undef,
 // HOST:   @s_ed = internal global %struct.ED undef,
 __constant__ ED c_ed;
-// DEVICE: @c_ed ={{.*}} addrspace(4) externally_initialized global %struct.ED zeroinitializer,
+// DEVICE: @c_ed ={{.*}} addrspace(4) externally_initialized constant %struct.ED zeroinitializer,
 // HOST:   @c_ed = internal global %struct.ED undef,
 
 __device__ ECD d_ecd;
@@ -90,7 +90,7 @@ __shared__ ECD s_ecd;
 // DEVICE: @s_ecd ={{.*}} addrspace(3) global %struct.ECD undef,
 // HOST:   @s_ecd = internal global %struct.ECD undef,
 __constant__ ECD c_ecd;
-// DEVICE: @c_ecd ={{.*}} addrspace(4) externally_initialized global %struct.ECD zeroinitializer,
+// DEVICE: @c_ecd ={{.*}} addrspace(4) externally_initialized constant %struct.ECD zeroinitializer,
 // HOST:   @c_ecd = internal global %struct.ECD undef,
 
 // empty templated constructor -- allowed with no arguments
@@ -101,14 +101,14 @@ __shared__ ETC s_etc;
 // DEVICE: @s_etc ={{.*}} addrspace(3) global %struct.ETC undef,
 // HOST:   @s_etc = internal global %struct.ETC undef,
 __constant__ ETC c_etc;
-// DEVICE: @c_etc ={{.*}} addrspace(4) externally_initialized global %struct.ETC zeroinitializer,
+// DEVICE: @c_etc ={{.*}} addrspace(4) externally_initialized constant %struct.ETC zeroinitializer,
 // HOST:   @c_etc = internal global %struct.ETC undef,
 
 __device__ NCFS d_ncfs;
 // DEVICE: @d_ncfs ={{.*}} addrspace(1) externally_initialized global %struct.NCFS { i32 3 }
 // HOST:   @d_ncfs = internal global %struct.NCFS undef,
 __constant__ NCFS c_ncfs;
-// DEVICE: @c_ncfs ={{.*}} addrspace(4) externally_initialized global %struct.NCFS { i32 3 }
+// DEVICE: @c_ncfs ={{.*}} addrspace(4) externally_initialized constant %struct.NCFS { i32 3 }
 // HOST:   @c_ncfs = internal global %struct.NCFS undef,
 
 // Regular base class -- allowed
@@ -119,7 +119,7 @@ __shared__ T_B_T s_t_b_t;
 // DEVICE: @s_t_b_t ={{.*}} addrspace(3) global %struct.T_B_T undef,
 // HOST:   @s_t_b_t = internal global %struct.T_B_T undef,
 __constant__ T_B_T c_t_b_t;
-// DEVICE: @c_t_b_t ={{.*}} addrspace(4) externally_initialized global %struct.T_B_T zeroinitializer,
+// DEVICE: @c_t_b_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_B_T zeroinitializer,
 // HOST:   @c_t_b_t = internal global %struct.T_B_T undef,
 
 // Incapsulated object of allowed class -- allowed
@@ -130,7 +130,7 @@ __shared__ T_F_T s_t_f_t;
 // DEVICE: @s_t_f_t ={{.*}} addrspace(3) global %struct.T_F_T undef,
 // HOST:   @s_t_f_t = internal global %struct.T_F_T undef,
 __constant__ T_F_T c_t_f_t;
-// DEVICE: @c_t_f_t ={{.*}} addrspace(4) externally_initialized global %struct.T_F_T zeroinitializer,
+// DEVICE: @c_t_f_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_F_T zeroinitializer,
 // HOST:   @c_t_f_t = internal global %struct.T_F_T undef,
 
 // array of allowed objects -- allowed
@@ -141,7 +141,7 @@ __shared__ T_FA_T s_t_fa_t;
 // DEVICE: @s_t_fa_t ={{.*}} addrspace(3) global %struct.T_FA_T undef,
 // HOST:   @s_t_fa_t = internal global %struct.T_FA_T undef,
 __constant__ T_FA_T c_t_fa_t;
-// DEVICE: @c_t_fa_t ={{.*}} addrspace(4) externally_initialized global %struct.T_FA_T zeroinitializer,
+// DEVICE: @c_t_fa_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_FA_T zeroinitializer,
 // HOST:   @c_t_fa_t = internal global %struct.T_FA_T undef,
 
 
@@ -153,7 +153,7 @@ __shared__ EC_I_EC s_ec_i_ec;
 // DEVICE: @s_ec_i_ec ={{.*}} addrspace(3) global %struct.EC_I_EC undef,
 // HOST:   @s_ec_i_ec = internal global %struct.EC_I_EC undef,
 __constant__ EC_I_EC c_ec_i_ec;
-// DEVICE: @c_ec_i_ec ={{.*}} addrspace(4) externally_initialized global %struct.EC_I_EC zeroinitializer,
+// DEVICE: @c_ec_i_ec ={{.*}} addrspace(4) externally_initialized constant %struct.EC_I_EC zeroinitializer,
 // HOST:   @c_ec_i_ec = internal global %struct.EC_I_EC undef,
 
 // DEVICE: @_ZZ2dfvE4s_ec = internal addrspace(3) global %struct.EC undef
diff --git a/clang/test/CodeGenCUDA/device-var-linkage.cu b/clang/test/CodeGenCUDA/device-var-linkage.cu
index 3c2efb57525c9c..4c57323d85f9dd 100644
--- a/clang/test/CodeGenCUDA/device-var-linkage.cu
+++ b/clang/test/CodeGenCUDA/device-var-linkage.cu
@@ -20,7 +20,7 @@
 // NORDC-H-DAG: @v1 = internal global i32 undef
 // RDC-H-DAG: @v1 = global i32 undef
 __device__ int v1;
-// DEV-DAG: @v2 = addrspace(4) externally_initialized global i32 0
+// DEV-DAG: @v2 = addrspace(4) externally_initialized constant i32 0
 // NORDC-H-DAG: @v2 = internal global i32 undef
 // RDC-H-DAG: @v2 = global i32 undef
 __constant__ int v2;
@@ -48,10 +48,10 @@ extern __managed__ int ev3;
 // HOST-DAG: @_ZL3sv1 = internal global i32 undef
 // CUDA-DAG: @_ZL3sv1__static__[[HASH:.*]] = addrspace(1) externally_initialized global i32 0
 static __device__ int sv1;
-// NORDC-DAG: @_ZL3sv2 = addrspace(4) externally_initialized global i32 0
-// RDC-DAG: @_ZL3sv2.static.[[HASH]] = addrspace(4) externally_initialized global i32 0
+// NORDC-DAG: @_ZL3sv2 = addrspace(4) externally_initialized constant i32 0
+// RDC-DAG: @_ZL3sv2.static.[[HASH]] = addrspace(4) externally_initialized constant i32 0
 // HOST-DAG: @_ZL3sv2 = internal global i32 undef
-// CUDA-DAG: @_ZL3sv2__static__[[HASH]] = addrspace(4) externally_initialized global i32 0
+// CUDA-DAG: @_ZL3sv2__static__[[HASH]] = addrspace(4) externally_initialized constant i32 0
 static __constant__ int sv2;
 // NORDC-DAG: @_ZL3sv3 = addrspace(1) externally_initialized global ptr addrspace(1) null
 // RDC-DAG: @_ZL3sv3.static.[[HASH]] = addrspace(1) externally_initialized global ptr addrspace(1) null
diff --git a/clang/test/CodeGenCUDA/filter-decl.cu b/clang/test/CodeGenCUDA/filter-decl.cu
index 0f4691f7c8aa7c..02dacd0ad8ef41 100644
--- a/clang/test/CodeGenCUDA/filter-decl.cu
+++ b/clang/test/CodeGenCUDA/filter-decl.cu
@@ -10,7 +10,7 @@
 __asm__("file scope asm is host only");
 
 // CHECK-HOST: constantdata = internal global
-// CHECK-DEVICE: constantdata = {{(dso_local )?}}externally_initialized global
+// CHECK-DEVICE: constantdata = {{(dso_local )?}}externally_initialized constant
 __constant__ char constantdata[256];
 
 // CHECK-HOST: devicedata = internal global
diff --git a/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu b/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
index 80655c2d296047..e92b00345e00c2 100644
--- a/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
+++ b/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
@@ -50,7 +50,7 @@ static __device__ int x5;
 }
 
 // Check a static constant variable referenced by host is externalized.
-// DEV-DAG: @_ZL1y ={{.*}} addrspace(4) externally_initialized global i32 0
+// DEV-DAG: @_ZL1y ={{.*}} addrspace(4) externally_initialized constant i32 0
 // HOST-DAG: @_ZL1y = internal global i32 undef
 // HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y\00"
 
diff --git a/clang/test/CodeGenCUDA/static-device-var-rdc.cu b/clang/test/CodeGenCUDA/static-device-var-rdc.cu
index 16ec413397235a..9d2811f9385e1e 100644
--- a/clang/test/CodeGenCUDA/static-device-var-rdc.cu
+++ b/clang/test/CodeGenCUDA/static-device-var-rdc.cu
@@ -81,11 +81,11 @@ static __device__ int x;
 static __device__ int x2;
 
 // Test normal static device variables
-// INT-DEV-DAG: @_ZL1y[[FILEID:.*]] = addrspace(4) externally_initialized global i32 0
+// INT-DEV-DAG: @_ZL1y[[FILEID:.*]] = addrspace(4) externally_initialized constant i32 0
 // INT-HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y[[FILEID:.*]]\00"
 
 // Test externalized static device variables
-// EXT-DEV-DAG: @_ZL1y.static.[[HASH]] = addrspace(4) externally_initialized global i32 0
+// EXT-DEV-DAG: @_ZL1y.static.[[HASH]] = addrspace(4) externally_initialized constant i32 0
 // EXT-HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y.static.[[HASH]]\00"
 
 static __constant__ int y;
diff --git a/clang/test/CodeGenCUDA/template-class-static-member.cu b/clang/test/CodeGenCUDA/template-class-static-member.cu
index d790d2dea66bab..b614cd9dcbb14d 100644
--- a/clang/test/CodeGenCUDA/template-class-static-member.cu
+++ b/clang/test/CodeGenCUDA/template-class-static-member.cu
@@ -38,7 +38,7 @@ const int A<T>::const_member;
 template class A<int>;
 
 //DEV-DAG: @_ZN1AIiE8d_memberE = internal addrspace(1) global i32 0, comdat, align 4
-//DEV-DAG: @_ZN1AIiE8c_memberE = internal addrspace(4) global i32 0, comdat, align 4
+//DEV-DAG: @_ZN1AIiE8c_memberE = internal addrspace(4) constant i32 0, comdat, align 4
 //DEV-DAG: @_ZN1AIiE8m_memberE = internal addrspace(1) externally_initialized global ptr addrspace(1) null
 //DEV-DAG: @_ZN1AIiE12const_memberE = internal addrspace(4) constant i32 0, comdat, align 4
 //DEV-NEG-NOT: @_ZN1AIiE8h_memberE
diff --git a/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp b/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
index c575f49ff69716..05811bb7e1285d 100644
--- a/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
+++ b/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
@@ -12,7 +12,7 @@
 // CHECK: @d ={{.*}} addrspace(1) externally_initialized global
 __device__ int d;
 
-// CHECK: @c ={{.*}} addrspace(1) externally_initialized global
+// CHECK: @c ={{.*}} addrspace(1) externally_initialized constant
 __constant__ int c;
 
 // CHECK: @s ={{.*}} addrspace(3) global

@llvmbot
Copy link
Member

llvmbot commented Sep 26, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Alex Voicu (AlexVlx)

Changes

Currently, __constant__ variables do not get unconditionally marked as constant in IR, which seems a bit odd given their definition. This is generally inconsequential for NVPTX/AMDGPU, since said variables get emitted in the constant address space for those BEs. However, it is potentially significant for e.g. HIP-on-SPIR-V cases, as SPIR-V does not allow casts to/from the constant AS (UniformConstant), which forces __constant__ variables to be emitted in the global AS, thus making IR constness meaningful.


Full diff: https://github.com/llvm/llvm-project/pull/110182.diff

11 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+3-2)
  • (modified) clang/test/CodeGenCUDA/address-spaces.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/amdgpu-visibility.cu (+3-3)
  • (modified) clang/test/CodeGenCUDA/anon-ns.cu (+2-2)
  • (modified) clang/test/CodeGenCUDA/device-var-init.cu (+12-12)
  • (modified) clang/test/CodeGenCUDA/device-var-linkage.cu (+4-4)
  • (modified) clang/test/CodeGenCUDA/filter-decl.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/static-device-var-no-rdc.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/static-device-var-rdc.cu (+2-2)
  • (modified) clang/test/CodeGenCUDA/template-class-static-member.cu (+1-1)
  • (modified) clang/test/CodeGenHIP/hipspv-addr-spaces.cpp (+1-1)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 2381fa93e23fea..25c1c496a4f27f 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -5622,8 +5622,9 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D,
     emitter->finalize(GV);
 
   // If it is safe to mark the global 'constant', do so now.
-  GV->setConstant(!NeedsGlobalCtor && !NeedsGlobalDtor &&
-                  D->getType().isConstantStorage(getContext(), true, true));
+  GV->setConstant((D->hasAttr<CUDAConstantAttr>() && LangOpts.CUDAIsDevice) ||
+                  (!NeedsGlobalCtor && !NeedsGlobalDtor &&
+                   D->getType().isConstantStorage(getContext(), true, true)));
 
   // If it is in a read-only section, mark it 'constant'.
   if (const SectionAttr *SA = D->getAttr<SectionAttr>()) {
diff --git a/clang/test/CodeGenCUDA/address-spaces.cu b/clang/test/CodeGenCUDA/address-spaces.cu
index 0608c9cabd0489..66903c81b93339 100644
--- a/clang/test/CodeGenCUDA/address-spaces.cu
+++ b/clang/test/CodeGenCUDA/address-spaces.cu
@@ -9,7 +9,7 @@
 // CHECK: @i ={{.*}} addrspace(1) externally_initialized global
 __device__ int i;
 
-// CHECK: @j ={{.*}} addrspace(4) externally_initialized global
+// CHECK: @j ={{.*}} addrspace(4) externally_initialized constant
 __constant__ int j;
 
 // CHECK: @k ={{.*}} addrspace(3) global
diff --git a/clang/test/CodeGenCUDA/amdgpu-visibility.cu b/clang/test/CodeGenCUDA/amdgpu-visibility.cu
index d7dbab112a68c6..ef74d932ee8c8f 100644
--- a/clang/test/CodeGenCUDA/amdgpu-visibility.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-visibility.cu
@@ -4,11 +4,11 @@
 
 #include "Inputs/cuda.h"
 
-// CHECK-DEFAULT: @c ={{.*}} addrspace(4) externally_initialized global
+// CHECK-DEFAULT: @c ={{.*}} addrspace(4) externally_initialized constant
 // CHECK-DEFAULT: @g ={{.*}} addrspace(1) externally_initialized global
-// CHECK-PROTECTED: @c = protected addrspace(4) externally_initialized global
+// CHECK-PROTECTED: @c = protected addrspace(4) externally_initialized constant
 // CHECK-PROTECTED: @g = protected addrspace(1) externally_initialized global
-// CHECK-HIDDEN: @c = protected addrspace(4) externally_initialized global
+// CHECK-HIDDEN: @c = protected addrspace(4) externally_initialized constant
 // CHECK-HIDDEN: @g = protected addrspace(1) externally_initialized global
 __constant__ int c;
 __device__ int g;
diff --git a/clang/test/CodeGenCUDA/anon-ns.cu b/clang/test/CodeGenCUDA/anon-ns.cu
index 3c55e9907dd6c1..d931f31d0207c5 100644
--- a/clang/test/CodeGenCUDA/anon-ns.cu
+++ b/clang/test/CodeGenCUDA/anon-ns.cu
@@ -28,13 +28,13 @@
 // HIP-DAG: define weak_odr {{.*}}void @[[KTX:_Z2ktIN12_GLOBAL__N_11XEEvT_\.intern\.b04fd23c98500190]](
 // HIP-DAG: define weak_odr {{.*}}void @[[KTL:_Z2ktIN12_GLOBAL__N_1UlvE_EEvT_\.intern\.b04fd23c98500190]](
 // HIP-DAG: @[[VM:_ZN12_GLOBAL__N_12vmE\.static\.b04fd23c98500190]] = addrspace(1) externally_initialized global
-// HIP-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE\.static\.b04fd23c98500190]] = addrspace(4) externally_initialized global
+// HIP-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE\.static\.b04fd23c98500190]] = addrspace(4) externally_initialized constant
 // HIP-DAG: @[[VT:_Z2vtIN12_GLOBAL__N_11XEE\.static\.b04fd23c98500190]] = addrspace(1) externally_initialized global
 
 // CUDA-DAG: define weak_odr {{.*}}void @[[KERN:_ZN12_GLOBAL__N_16kernelEv__intern__b04fd23c98500190]](
 // CUDA-DAG: define weak_odr {{.*}}void @[[KTX:_Z2ktIN12_GLOBAL__N_11XEEvT___intern__b04fd23c98500190]](
 // CUDA-DAG: define weak_odr {{.*}}void @[[KTL:_Z2ktIN12_GLOBAL__N_1UlvE_EEvT___intern__b04fd23c98500190]](
-// CUDA-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE__static__b04fd23c98500190]] = addrspace(4) externally_initialized global
+// CUDA-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE__static__b04fd23c98500190]] = addrspace(4) externally_initialized constant
 // CUDA-DAG: @[[VT:_Z2vtIN12_GLOBAL__N_11XEE__static__b04fd23c98500190]] = addrspace(1) externally_initialized global
 
 // COMMON-DAG: @_ZN12_GLOBAL__N_12vdE = internal addrspace(1) global
diff --git a/clang/test/CodeGenCUDA/device-var-init.cu b/clang/test/CodeGenCUDA/device-var-init.cu
index 226b7e295f4b45..9d62e4126b430d 100644
--- a/clang/test/CodeGenCUDA/device-var-init.cu
+++ b/clang/test/CodeGenCUDA/device-var-init.cu
@@ -26,7 +26,7 @@ __shared__ int s_v;
 // DEVICE: @s_v ={{.*}} addrspace(3) global i32 undef,
 // HOST:   @s_v = internal global i32 undef,
 __constant__ int c_v;
-// DEVICE: addrspace(4) externally_initialized global i32 0,
+// DEVICE: addrspace(4) externally_initialized constant i32 0,
 // HOST:   @c_v = internal global i32 undef,
 
 __device__ int d_v_i = 1;
@@ -51,14 +51,14 @@ __shared__ T s_t;
 // DEVICE: @s_t ={{.*}} addrspace(3) global %struct.T undef,
 // HOST:   @s_t = internal global %struct.T undef,
 __constant__ T c_t;
-// DEVICE: @c_t ={{.*}} addrspace(4) externally_initialized global %struct.T zeroinitializer,
+// DEVICE: @c_t ={{.*}} addrspace(4) externally_initialized constant %struct.T zeroinitializer,
 // HOST:   @c_t = internal global %struct.T undef,
 
 __device__ T d_t_i = {2};
 // DEVICE: @d_t_i ={{.*}} addrspace(1) externally_initialized global %struct.T { i32 2 },
 // HOST:   @d_t_i = internal global %struct.T undef,
 __constant__ T c_t_i = {2};
-// DEVICE: @c_t_i ={{.*}} addrspace(4) externally_initialized global %struct.T { i32 2 },
+// DEVICE: @c_t_i ={{.*}} addrspace(4) externally_initialized constant %struct.T { i32 2 },
 // HOST:   @c_t_i = internal global %struct.T undef,
 
 // empty constructor
@@ -69,7 +69,7 @@ __shared__ EC s_ec;
 // DEVICE: @s_ec ={{.*}} addrspace(3) global %struct.EC undef,
 // HOST:   @s_ec = internal global %struct.EC undef,
 __constant__ EC c_ec;
-// DEVICE: @c_ec ={{.*}} addrspace(4) externally_initialized global %struct.EC zeroinitializer,
+// DEVICE: @c_ec ={{.*}} addrspace(4) externally_initialized constant %struct.EC zeroinitializer,
 // HOST:   @c_ec = internal global %struct.EC undef
 
 // empty destructor
@@ -80,7 +80,7 @@ __shared__ ED s_ed;
 // DEVICE: @s_ed ={{.*}} addrspace(3) global %struct.ED undef,
 // HOST:   @s_ed = internal global %struct.ED undef,
 __constant__ ED c_ed;
-// DEVICE: @c_ed ={{.*}} addrspace(4) externally_initialized global %struct.ED zeroinitializer,
+// DEVICE: @c_ed ={{.*}} addrspace(4) externally_initialized constant %struct.ED zeroinitializer,
 // HOST:   @c_ed = internal global %struct.ED undef,
 
 __device__ ECD d_ecd;
@@ -90,7 +90,7 @@ __shared__ ECD s_ecd;
 // DEVICE: @s_ecd ={{.*}} addrspace(3) global %struct.ECD undef,
 // HOST:   @s_ecd = internal global %struct.ECD undef,
 __constant__ ECD c_ecd;
-// DEVICE: @c_ecd ={{.*}} addrspace(4) externally_initialized global %struct.ECD zeroinitializer,
+// DEVICE: @c_ecd ={{.*}} addrspace(4) externally_initialized constant %struct.ECD zeroinitializer,
 // HOST:   @c_ecd = internal global %struct.ECD undef,
 
 // empty templated constructor -- allowed with no arguments
@@ -101,14 +101,14 @@ __shared__ ETC s_etc;
 // DEVICE: @s_etc ={{.*}} addrspace(3) global %struct.ETC undef,
 // HOST:   @s_etc = internal global %struct.ETC undef,
 __constant__ ETC c_etc;
-// DEVICE: @c_etc ={{.*}} addrspace(4) externally_initialized global %struct.ETC zeroinitializer,
+// DEVICE: @c_etc ={{.*}} addrspace(4) externally_initialized constant %struct.ETC zeroinitializer,
 // HOST:   @c_etc = internal global %struct.ETC undef,
 
 __device__ NCFS d_ncfs;
 // DEVICE: @d_ncfs ={{.*}} addrspace(1) externally_initialized global %struct.NCFS { i32 3 }
 // HOST:   @d_ncfs = internal global %struct.NCFS undef,
 __constant__ NCFS c_ncfs;
-// DEVICE: @c_ncfs ={{.*}} addrspace(4) externally_initialized global %struct.NCFS { i32 3 }
+// DEVICE: @c_ncfs ={{.*}} addrspace(4) externally_initialized constant %struct.NCFS { i32 3 }
 // HOST:   @c_ncfs = internal global %struct.NCFS undef,
 
 // Regular base class -- allowed
@@ -119,7 +119,7 @@ __shared__ T_B_T s_t_b_t;
 // DEVICE: @s_t_b_t ={{.*}} addrspace(3) global %struct.T_B_T undef,
 // HOST:   @s_t_b_t = internal global %struct.T_B_T undef,
 __constant__ T_B_T c_t_b_t;
-// DEVICE: @c_t_b_t ={{.*}} addrspace(4) externally_initialized global %struct.T_B_T zeroinitializer,
+// DEVICE: @c_t_b_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_B_T zeroinitializer,
 // HOST:   @c_t_b_t = internal global %struct.T_B_T undef,
 
 // Incapsulated object of allowed class -- allowed
@@ -130,7 +130,7 @@ __shared__ T_F_T s_t_f_t;
 // DEVICE: @s_t_f_t ={{.*}} addrspace(3) global %struct.T_F_T undef,
 // HOST:   @s_t_f_t = internal global %struct.T_F_T undef,
 __constant__ T_F_T c_t_f_t;
-// DEVICE: @c_t_f_t ={{.*}} addrspace(4) externally_initialized global %struct.T_F_T zeroinitializer,
+// DEVICE: @c_t_f_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_F_T zeroinitializer,
 // HOST:   @c_t_f_t = internal global %struct.T_F_T undef,
 
 // array of allowed objects -- allowed
@@ -141,7 +141,7 @@ __shared__ T_FA_T s_t_fa_t;
 // DEVICE: @s_t_fa_t ={{.*}} addrspace(3) global %struct.T_FA_T undef,
 // HOST:   @s_t_fa_t = internal global %struct.T_FA_T undef,
 __constant__ T_FA_T c_t_fa_t;
-// DEVICE: @c_t_fa_t ={{.*}} addrspace(4) externally_initialized global %struct.T_FA_T zeroinitializer,
+// DEVICE: @c_t_fa_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_FA_T zeroinitializer,
 // HOST:   @c_t_fa_t = internal global %struct.T_FA_T undef,
 
 
@@ -153,7 +153,7 @@ __shared__ EC_I_EC s_ec_i_ec;
 // DEVICE: @s_ec_i_ec ={{.*}} addrspace(3) global %struct.EC_I_EC undef,
 // HOST:   @s_ec_i_ec = internal global %struct.EC_I_EC undef,
 __constant__ EC_I_EC c_ec_i_ec;
-// DEVICE: @c_ec_i_ec ={{.*}} addrspace(4) externally_initialized global %struct.EC_I_EC zeroinitializer,
+// DEVICE: @c_ec_i_ec ={{.*}} addrspace(4) externally_initialized constant %struct.EC_I_EC zeroinitializer,
 // HOST:   @c_ec_i_ec = internal global %struct.EC_I_EC undef,
 
 // DEVICE: @_ZZ2dfvE4s_ec = internal addrspace(3) global %struct.EC undef
diff --git a/clang/test/CodeGenCUDA/device-var-linkage.cu b/clang/test/CodeGenCUDA/device-var-linkage.cu
index 3c2efb57525c9c..4c57323d85f9dd 100644
--- a/clang/test/CodeGenCUDA/device-var-linkage.cu
+++ b/clang/test/CodeGenCUDA/device-var-linkage.cu
@@ -20,7 +20,7 @@
 // NORDC-H-DAG: @v1 = internal global i32 undef
 // RDC-H-DAG: @v1 = global i32 undef
 __device__ int v1;
-// DEV-DAG: @v2 = addrspace(4) externally_initialized global i32 0
+// DEV-DAG: @v2 = addrspace(4) externally_initialized constant i32 0
 // NORDC-H-DAG: @v2 = internal global i32 undef
 // RDC-H-DAG: @v2 = global i32 undef
 __constant__ int v2;
@@ -48,10 +48,10 @@ extern __managed__ int ev3;
 // HOST-DAG: @_ZL3sv1 = internal global i32 undef
 // CUDA-DAG: @_ZL3sv1__static__[[HASH:.*]] = addrspace(1) externally_initialized global i32 0
 static __device__ int sv1;
-// NORDC-DAG: @_ZL3sv2 = addrspace(4) externally_initialized global i32 0
-// RDC-DAG: @_ZL3sv2.static.[[HASH]] = addrspace(4) externally_initialized global i32 0
+// NORDC-DAG: @_ZL3sv2 = addrspace(4) externally_initialized constant i32 0
+// RDC-DAG: @_ZL3sv2.static.[[HASH]] = addrspace(4) externally_initialized constant i32 0
 // HOST-DAG: @_ZL3sv2 = internal global i32 undef
-// CUDA-DAG: @_ZL3sv2__static__[[HASH]] = addrspace(4) externally_initialized global i32 0
+// CUDA-DAG: @_ZL3sv2__static__[[HASH]] = addrspace(4) externally_initialized constant i32 0
 static __constant__ int sv2;
 // NORDC-DAG: @_ZL3sv3 = addrspace(1) externally_initialized global ptr addrspace(1) null
 // RDC-DAG: @_ZL3sv3.static.[[HASH]] = addrspace(1) externally_initialized global ptr addrspace(1) null
diff --git a/clang/test/CodeGenCUDA/filter-decl.cu b/clang/test/CodeGenCUDA/filter-decl.cu
index 0f4691f7c8aa7c..02dacd0ad8ef41 100644
--- a/clang/test/CodeGenCUDA/filter-decl.cu
+++ b/clang/test/CodeGenCUDA/filter-decl.cu
@@ -10,7 +10,7 @@
 __asm__("file scope asm is host only");
 
 // CHECK-HOST: constantdata = internal global
-// CHECK-DEVICE: constantdata = {{(dso_local )?}}externally_initialized global
+// CHECK-DEVICE: constantdata = {{(dso_local )?}}externally_initialized constant
 __constant__ char constantdata[256];
 
 // CHECK-HOST: devicedata = internal global
diff --git a/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu b/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
index 80655c2d296047..e92b00345e00c2 100644
--- a/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
+++ b/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
@@ -50,7 +50,7 @@ static __device__ int x5;
 }
 
 // Check a static constant variable referenced by host is externalized.
-// DEV-DAG: @_ZL1y ={{.*}} addrspace(4) externally_initialized global i32 0
+// DEV-DAG: @_ZL1y ={{.*}} addrspace(4) externally_initialized constant i32 0
 // HOST-DAG: @_ZL1y = internal global i32 undef
 // HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y\00"
 
diff --git a/clang/test/CodeGenCUDA/static-device-var-rdc.cu b/clang/test/CodeGenCUDA/static-device-var-rdc.cu
index 16ec413397235a..9d2811f9385e1e 100644
--- a/clang/test/CodeGenCUDA/static-device-var-rdc.cu
+++ b/clang/test/CodeGenCUDA/static-device-var-rdc.cu
@@ -81,11 +81,11 @@ static __device__ int x;
 static __device__ int x2;
 
 // Test normal static device variables
-// INT-DEV-DAG: @_ZL1y[[FILEID:.*]] = addrspace(4) externally_initialized global i32 0
+// INT-DEV-DAG: @_ZL1y[[FILEID:.*]] = addrspace(4) externally_initialized constant i32 0
 // INT-HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y[[FILEID:.*]]\00"
 
 // Test externalized static device variables
-// EXT-DEV-DAG: @_ZL1y.static.[[HASH]] = addrspace(4) externally_initialized global i32 0
+// EXT-DEV-DAG: @_ZL1y.static.[[HASH]] = addrspace(4) externally_initialized constant i32 0
 // EXT-HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y.static.[[HASH]]\00"
 
 static __constant__ int y;
diff --git a/clang/test/CodeGenCUDA/template-class-static-member.cu b/clang/test/CodeGenCUDA/template-class-static-member.cu
index d790d2dea66bab..b614cd9dcbb14d 100644
--- a/clang/test/CodeGenCUDA/template-class-static-member.cu
+++ b/clang/test/CodeGenCUDA/template-class-static-member.cu
@@ -38,7 +38,7 @@ const int A<T>::const_member;
 template class A<int>;
 
 //DEV-DAG: @_ZN1AIiE8d_memberE = internal addrspace(1) global i32 0, comdat, align 4
-//DEV-DAG: @_ZN1AIiE8c_memberE = internal addrspace(4) global i32 0, comdat, align 4
+//DEV-DAG: @_ZN1AIiE8c_memberE = internal addrspace(4) constant i32 0, comdat, align 4
 //DEV-DAG: @_ZN1AIiE8m_memberE = internal addrspace(1) externally_initialized global ptr addrspace(1) null
 //DEV-DAG: @_ZN1AIiE12const_memberE = internal addrspace(4) constant i32 0, comdat, align 4
 //DEV-NEG-NOT: @_ZN1AIiE8h_memberE
diff --git a/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp b/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
index c575f49ff69716..05811bb7e1285d 100644
--- a/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
+++ b/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
@@ -12,7 +12,7 @@
 // CHECK: @d ={{.*}} addrspace(1) externally_initialized global
 __device__ int d;
 
-// CHECK: @c ={{.*}} addrspace(1) externally_initialized global
+// CHECK: @c ={{.*}} addrspace(1) externally_initialized constant
 __constant__ int c;
 
 // CHECK: @s ={{.*}} addrspace(3) global

@llvmbot
Copy link
Member

llvmbot commented Sep 26, 2024

@llvm/pr-subscribers-clang-codegen

Author: Alex Voicu (AlexVlx)

Changes

Currently, __constant__ variables do not get unconditionally marked as constant in IR, which seems a bit odd given their definition. This is generally inconsequential for NVPTX/AMDGPU, since said variables get emitted in the constant address space for those BEs. However, it is potentially significant for e.g. HIP-on-SPIR-V cases, as SPIR-V does not allow casts to/from the constant AS (UniformConstant), which forces __constant__ variables to be emitted in the global AS, thus making IR constness meaningful.


Full diff: https://github.com/llvm/llvm-project/pull/110182.diff

11 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+3-2)
  • (modified) clang/test/CodeGenCUDA/address-spaces.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/amdgpu-visibility.cu (+3-3)
  • (modified) clang/test/CodeGenCUDA/anon-ns.cu (+2-2)
  • (modified) clang/test/CodeGenCUDA/device-var-init.cu (+12-12)
  • (modified) clang/test/CodeGenCUDA/device-var-linkage.cu (+4-4)
  • (modified) clang/test/CodeGenCUDA/filter-decl.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/static-device-var-no-rdc.cu (+1-1)
  • (modified) clang/test/CodeGenCUDA/static-device-var-rdc.cu (+2-2)
  • (modified) clang/test/CodeGenCUDA/template-class-static-member.cu (+1-1)
  • (modified) clang/test/CodeGenHIP/hipspv-addr-spaces.cpp (+1-1)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 2381fa93e23fea..25c1c496a4f27f 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -5622,8 +5622,9 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D,
     emitter->finalize(GV);
 
   // If it is safe to mark the global 'constant', do so now.
-  GV->setConstant(!NeedsGlobalCtor && !NeedsGlobalDtor &&
-                  D->getType().isConstantStorage(getContext(), true, true));
+  GV->setConstant((D->hasAttr<CUDAConstantAttr>() && LangOpts.CUDAIsDevice) ||
+                  (!NeedsGlobalCtor && !NeedsGlobalDtor &&
+                   D->getType().isConstantStorage(getContext(), true, true)));
 
   // If it is in a read-only section, mark it 'constant'.
   if (const SectionAttr *SA = D->getAttr<SectionAttr>()) {
diff --git a/clang/test/CodeGenCUDA/address-spaces.cu b/clang/test/CodeGenCUDA/address-spaces.cu
index 0608c9cabd0489..66903c81b93339 100644
--- a/clang/test/CodeGenCUDA/address-spaces.cu
+++ b/clang/test/CodeGenCUDA/address-spaces.cu
@@ -9,7 +9,7 @@
 // CHECK: @i ={{.*}} addrspace(1) externally_initialized global
 __device__ int i;
 
-// CHECK: @j ={{.*}} addrspace(4) externally_initialized global
+// CHECK: @j ={{.*}} addrspace(4) externally_initialized constant
 __constant__ int j;
 
 // CHECK: @k ={{.*}} addrspace(3) global
diff --git a/clang/test/CodeGenCUDA/amdgpu-visibility.cu b/clang/test/CodeGenCUDA/amdgpu-visibility.cu
index d7dbab112a68c6..ef74d932ee8c8f 100644
--- a/clang/test/CodeGenCUDA/amdgpu-visibility.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-visibility.cu
@@ -4,11 +4,11 @@
 
 #include "Inputs/cuda.h"
 
-// CHECK-DEFAULT: @c ={{.*}} addrspace(4) externally_initialized global
+// CHECK-DEFAULT: @c ={{.*}} addrspace(4) externally_initialized constant
 // CHECK-DEFAULT: @g ={{.*}} addrspace(1) externally_initialized global
-// CHECK-PROTECTED: @c = protected addrspace(4) externally_initialized global
+// CHECK-PROTECTED: @c = protected addrspace(4) externally_initialized constant
 // CHECK-PROTECTED: @g = protected addrspace(1) externally_initialized global
-// CHECK-HIDDEN: @c = protected addrspace(4) externally_initialized global
+// CHECK-HIDDEN: @c = protected addrspace(4) externally_initialized constant
 // CHECK-HIDDEN: @g = protected addrspace(1) externally_initialized global
 __constant__ int c;
 __device__ int g;
diff --git a/clang/test/CodeGenCUDA/anon-ns.cu b/clang/test/CodeGenCUDA/anon-ns.cu
index 3c55e9907dd6c1..d931f31d0207c5 100644
--- a/clang/test/CodeGenCUDA/anon-ns.cu
+++ b/clang/test/CodeGenCUDA/anon-ns.cu
@@ -28,13 +28,13 @@
 // HIP-DAG: define weak_odr {{.*}}void @[[KTX:_Z2ktIN12_GLOBAL__N_11XEEvT_\.intern\.b04fd23c98500190]](
 // HIP-DAG: define weak_odr {{.*}}void @[[KTL:_Z2ktIN12_GLOBAL__N_1UlvE_EEvT_\.intern\.b04fd23c98500190]](
 // HIP-DAG: @[[VM:_ZN12_GLOBAL__N_12vmE\.static\.b04fd23c98500190]] = addrspace(1) externally_initialized global
-// HIP-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE\.static\.b04fd23c98500190]] = addrspace(4) externally_initialized global
+// HIP-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE\.static\.b04fd23c98500190]] = addrspace(4) externally_initialized constant
 // HIP-DAG: @[[VT:_Z2vtIN12_GLOBAL__N_11XEE\.static\.b04fd23c98500190]] = addrspace(1) externally_initialized global
 
 // CUDA-DAG: define weak_odr {{.*}}void @[[KERN:_ZN12_GLOBAL__N_16kernelEv__intern__b04fd23c98500190]](
 // CUDA-DAG: define weak_odr {{.*}}void @[[KTX:_Z2ktIN12_GLOBAL__N_11XEEvT___intern__b04fd23c98500190]](
 // CUDA-DAG: define weak_odr {{.*}}void @[[KTL:_Z2ktIN12_GLOBAL__N_1UlvE_EEvT___intern__b04fd23c98500190]](
-// CUDA-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE__static__b04fd23c98500190]] = addrspace(4) externally_initialized global
+// CUDA-DAG: @[[VC:_ZN12_GLOBAL__N_12vcE__static__b04fd23c98500190]] = addrspace(4) externally_initialized constant
 // CUDA-DAG: @[[VT:_Z2vtIN12_GLOBAL__N_11XEE__static__b04fd23c98500190]] = addrspace(1) externally_initialized global
 
 // COMMON-DAG: @_ZN12_GLOBAL__N_12vdE = internal addrspace(1) global
diff --git a/clang/test/CodeGenCUDA/device-var-init.cu b/clang/test/CodeGenCUDA/device-var-init.cu
index 226b7e295f4b45..9d62e4126b430d 100644
--- a/clang/test/CodeGenCUDA/device-var-init.cu
+++ b/clang/test/CodeGenCUDA/device-var-init.cu
@@ -26,7 +26,7 @@ __shared__ int s_v;
 // DEVICE: @s_v ={{.*}} addrspace(3) global i32 undef,
 // HOST:   @s_v = internal global i32 undef,
 __constant__ int c_v;
-// DEVICE: addrspace(4) externally_initialized global i32 0,
+// DEVICE: addrspace(4) externally_initialized constant i32 0,
 // HOST:   @c_v = internal global i32 undef,
 
 __device__ int d_v_i = 1;
@@ -51,14 +51,14 @@ __shared__ T s_t;
 // DEVICE: @s_t ={{.*}} addrspace(3) global %struct.T undef,
 // HOST:   @s_t = internal global %struct.T undef,
 __constant__ T c_t;
-// DEVICE: @c_t ={{.*}} addrspace(4) externally_initialized global %struct.T zeroinitializer,
+// DEVICE: @c_t ={{.*}} addrspace(4) externally_initialized constant %struct.T zeroinitializer,
 // HOST:   @c_t = internal global %struct.T undef,
 
 __device__ T d_t_i = {2};
 // DEVICE: @d_t_i ={{.*}} addrspace(1) externally_initialized global %struct.T { i32 2 },
 // HOST:   @d_t_i = internal global %struct.T undef,
 __constant__ T c_t_i = {2};
-// DEVICE: @c_t_i ={{.*}} addrspace(4) externally_initialized global %struct.T { i32 2 },
+// DEVICE: @c_t_i ={{.*}} addrspace(4) externally_initialized constant %struct.T { i32 2 },
 // HOST:   @c_t_i = internal global %struct.T undef,
 
 // empty constructor
@@ -69,7 +69,7 @@ __shared__ EC s_ec;
 // DEVICE: @s_ec ={{.*}} addrspace(3) global %struct.EC undef,
 // HOST:   @s_ec = internal global %struct.EC undef,
 __constant__ EC c_ec;
-// DEVICE: @c_ec ={{.*}} addrspace(4) externally_initialized global %struct.EC zeroinitializer,
+// DEVICE: @c_ec ={{.*}} addrspace(4) externally_initialized constant %struct.EC zeroinitializer,
 // HOST:   @c_ec = internal global %struct.EC undef
 
 // empty destructor
@@ -80,7 +80,7 @@ __shared__ ED s_ed;
 // DEVICE: @s_ed ={{.*}} addrspace(3) global %struct.ED undef,
 // HOST:   @s_ed = internal global %struct.ED undef,
 __constant__ ED c_ed;
-// DEVICE: @c_ed ={{.*}} addrspace(4) externally_initialized global %struct.ED zeroinitializer,
+// DEVICE: @c_ed ={{.*}} addrspace(4) externally_initialized constant %struct.ED zeroinitializer,
 // HOST:   @c_ed = internal global %struct.ED undef,
 
 __device__ ECD d_ecd;
@@ -90,7 +90,7 @@ __shared__ ECD s_ecd;
 // DEVICE: @s_ecd ={{.*}} addrspace(3) global %struct.ECD undef,
 // HOST:   @s_ecd = internal global %struct.ECD undef,
 __constant__ ECD c_ecd;
-// DEVICE: @c_ecd ={{.*}} addrspace(4) externally_initialized global %struct.ECD zeroinitializer,
+// DEVICE: @c_ecd ={{.*}} addrspace(4) externally_initialized constant %struct.ECD zeroinitializer,
 // HOST:   @c_ecd = internal global %struct.ECD undef,
 
 // empty templated constructor -- allowed with no arguments
@@ -101,14 +101,14 @@ __shared__ ETC s_etc;
 // DEVICE: @s_etc ={{.*}} addrspace(3) global %struct.ETC undef,
 // HOST:   @s_etc = internal global %struct.ETC undef,
 __constant__ ETC c_etc;
-// DEVICE: @c_etc ={{.*}} addrspace(4) externally_initialized global %struct.ETC zeroinitializer,
+// DEVICE: @c_etc ={{.*}} addrspace(4) externally_initialized constant %struct.ETC zeroinitializer,
 // HOST:   @c_etc = internal global %struct.ETC undef,
 
 __device__ NCFS d_ncfs;
 // DEVICE: @d_ncfs ={{.*}} addrspace(1) externally_initialized global %struct.NCFS { i32 3 }
 // HOST:   @d_ncfs = internal global %struct.NCFS undef,
 __constant__ NCFS c_ncfs;
-// DEVICE: @c_ncfs ={{.*}} addrspace(4) externally_initialized global %struct.NCFS { i32 3 }
+// DEVICE: @c_ncfs ={{.*}} addrspace(4) externally_initialized constant %struct.NCFS { i32 3 }
 // HOST:   @c_ncfs = internal global %struct.NCFS undef,
 
 // Regular base class -- allowed
@@ -119,7 +119,7 @@ __shared__ T_B_T s_t_b_t;
 // DEVICE: @s_t_b_t ={{.*}} addrspace(3) global %struct.T_B_T undef,
 // HOST:   @s_t_b_t = internal global %struct.T_B_T undef,
 __constant__ T_B_T c_t_b_t;
-// DEVICE: @c_t_b_t ={{.*}} addrspace(4) externally_initialized global %struct.T_B_T zeroinitializer,
+// DEVICE: @c_t_b_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_B_T zeroinitializer,
 // HOST:   @c_t_b_t = internal global %struct.T_B_T undef,
 
 // Incapsulated object of allowed class -- allowed
@@ -130,7 +130,7 @@ __shared__ T_F_T s_t_f_t;
 // DEVICE: @s_t_f_t ={{.*}} addrspace(3) global %struct.T_F_T undef,
 // HOST:   @s_t_f_t = internal global %struct.T_F_T undef,
 __constant__ T_F_T c_t_f_t;
-// DEVICE: @c_t_f_t ={{.*}} addrspace(4) externally_initialized global %struct.T_F_T zeroinitializer,
+// DEVICE: @c_t_f_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_F_T zeroinitializer,
 // HOST:   @c_t_f_t = internal global %struct.T_F_T undef,
 
 // array of allowed objects -- allowed
@@ -141,7 +141,7 @@ __shared__ T_FA_T s_t_fa_t;
 // DEVICE: @s_t_fa_t ={{.*}} addrspace(3) global %struct.T_FA_T undef,
 // HOST:   @s_t_fa_t = internal global %struct.T_FA_T undef,
 __constant__ T_FA_T c_t_fa_t;
-// DEVICE: @c_t_fa_t ={{.*}} addrspace(4) externally_initialized global %struct.T_FA_T zeroinitializer,
+// DEVICE: @c_t_fa_t ={{.*}} addrspace(4) externally_initialized constant %struct.T_FA_T zeroinitializer,
 // HOST:   @c_t_fa_t = internal global %struct.T_FA_T undef,
 
 
@@ -153,7 +153,7 @@ __shared__ EC_I_EC s_ec_i_ec;
 // DEVICE: @s_ec_i_ec ={{.*}} addrspace(3) global %struct.EC_I_EC undef,
 // HOST:   @s_ec_i_ec = internal global %struct.EC_I_EC undef,
 __constant__ EC_I_EC c_ec_i_ec;
-// DEVICE: @c_ec_i_ec ={{.*}} addrspace(4) externally_initialized global %struct.EC_I_EC zeroinitializer,
+// DEVICE: @c_ec_i_ec ={{.*}} addrspace(4) externally_initialized constant %struct.EC_I_EC zeroinitializer,
 // HOST:   @c_ec_i_ec = internal global %struct.EC_I_EC undef,
 
 // DEVICE: @_ZZ2dfvE4s_ec = internal addrspace(3) global %struct.EC undef
diff --git a/clang/test/CodeGenCUDA/device-var-linkage.cu b/clang/test/CodeGenCUDA/device-var-linkage.cu
index 3c2efb57525c9c..4c57323d85f9dd 100644
--- a/clang/test/CodeGenCUDA/device-var-linkage.cu
+++ b/clang/test/CodeGenCUDA/device-var-linkage.cu
@@ -20,7 +20,7 @@
 // NORDC-H-DAG: @v1 = internal global i32 undef
 // RDC-H-DAG: @v1 = global i32 undef
 __device__ int v1;
-// DEV-DAG: @v2 = addrspace(4) externally_initialized global i32 0
+// DEV-DAG: @v2 = addrspace(4) externally_initialized constant i32 0
 // NORDC-H-DAG: @v2 = internal global i32 undef
 // RDC-H-DAG: @v2 = global i32 undef
 __constant__ int v2;
@@ -48,10 +48,10 @@ extern __managed__ int ev3;
 // HOST-DAG: @_ZL3sv1 = internal global i32 undef
 // CUDA-DAG: @_ZL3sv1__static__[[HASH:.*]] = addrspace(1) externally_initialized global i32 0
 static __device__ int sv1;
-// NORDC-DAG: @_ZL3sv2 = addrspace(4) externally_initialized global i32 0
-// RDC-DAG: @_ZL3sv2.static.[[HASH]] = addrspace(4) externally_initialized global i32 0
+// NORDC-DAG: @_ZL3sv2 = addrspace(4) externally_initialized constant i32 0
+// RDC-DAG: @_ZL3sv2.static.[[HASH]] = addrspace(4) externally_initialized constant i32 0
 // HOST-DAG: @_ZL3sv2 = internal global i32 undef
-// CUDA-DAG: @_ZL3sv2__static__[[HASH]] = addrspace(4) externally_initialized global i32 0
+// CUDA-DAG: @_ZL3sv2__static__[[HASH]] = addrspace(4) externally_initialized constant i32 0
 static __constant__ int sv2;
 // NORDC-DAG: @_ZL3sv3 = addrspace(1) externally_initialized global ptr addrspace(1) null
 // RDC-DAG: @_ZL3sv3.static.[[HASH]] = addrspace(1) externally_initialized global ptr addrspace(1) null
diff --git a/clang/test/CodeGenCUDA/filter-decl.cu b/clang/test/CodeGenCUDA/filter-decl.cu
index 0f4691f7c8aa7c..02dacd0ad8ef41 100644
--- a/clang/test/CodeGenCUDA/filter-decl.cu
+++ b/clang/test/CodeGenCUDA/filter-decl.cu
@@ -10,7 +10,7 @@
 __asm__("file scope asm is host only");
 
 // CHECK-HOST: constantdata = internal global
-// CHECK-DEVICE: constantdata = {{(dso_local )?}}externally_initialized global
+// CHECK-DEVICE: constantdata = {{(dso_local )?}}externally_initialized constant
 __constant__ char constantdata[256];
 
 // CHECK-HOST: devicedata = internal global
diff --git a/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu b/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
index 80655c2d296047..e92b00345e00c2 100644
--- a/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
+++ b/clang/test/CodeGenCUDA/static-device-var-no-rdc.cu
@@ -50,7 +50,7 @@ static __device__ int x5;
 }
 
 // Check a static constant variable referenced by host is externalized.
-// DEV-DAG: @_ZL1y ={{.*}} addrspace(4) externally_initialized global i32 0
+// DEV-DAG: @_ZL1y ={{.*}} addrspace(4) externally_initialized constant i32 0
 // HOST-DAG: @_ZL1y = internal global i32 undef
 // HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y\00"
 
diff --git a/clang/test/CodeGenCUDA/static-device-var-rdc.cu b/clang/test/CodeGenCUDA/static-device-var-rdc.cu
index 16ec413397235a..9d2811f9385e1e 100644
--- a/clang/test/CodeGenCUDA/static-device-var-rdc.cu
+++ b/clang/test/CodeGenCUDA/static-device-var-rdc.cu
@@ -81,11 +81,11 @@ static __device__ int x;
 static __device__ int x2;
 
 // Test normal static device variables
-// INT-DEV-DAG: @_ZL1y[[FILEID:.*]] = addrspace(4) externally_initialized global i32 0
+// INT-DEV-DAG: @_ZL1y[[FILEID:.*]] = addrspace(4) externally_initialized constant i32 0
 // INT-HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y[[FILEID:.*]]\00"
 
 // Test externalized static device variables
-// EXT-DEV-DAG: @_ZL1y.static.[[HASH]] = addrspace(4) externally_initialized global i32 0
+// EXT-DEV-DAG: @_ZL1y.static.[[HASH]] = addrspace(4) externally_initialized constant i32 0
 // EXT-HOST-DAG: @[[DEVNAMEY:[0-9]+]] = {{.*}}c"_ZL1y.static.[[HASH]]\00"
 
 static __constant__ int y;
diff --git a/clang/test/CodeGenCUDA/template-class-static-member.cu b/clang/test/CodeGenCUDA/template-class-static-member.cu
index d790d2dea66bab..b614cd9dcbb14d 100644
--- a/clang/test/CodeGenCUDA/template-class-static-member.cu
+++ b/clang/test/CodeGenCUDA/template-class-static-member.cu
@@ -38,7 +38,7 @@ const int A<T>::const_member;
 template class A<int>;
 
 //DEV-DAG: @_ZN1AIiE8d_memberE = internal addrspace(1) global i32 0, comdat, align 4
-//DEV-DAG: @_ZN1AIiE8c_memberE = internal addrspace(4) global i32 0, comdat, align 4
+//DEV-DAG: @_ZN1AIiE8c_memberE = internal addrspace(4) constant i32 0, comdat, align 4
 //DEV-DAG: @_ZN1AIiE8m_memberE = internal addrspace(1) externally_initialized global ptr addrspace(1) null
 //DEV-DAG: @_ZN1AIiE12const_memberE = internal addrspace(4) constant i32 0, comdat, align 4
 //DEV-NEG-NOT: @_ZN1AIiE8h_memberE
diff --git a/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp b/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
index c575f49ff69716..05811bb7e1285d 100644
--- a/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
+++ b/clang/test/CodeGenHIP/hipspv-addr-spaces.cpp
@@ -12,7 +12,7 @@
 // CHECK: @d ={{.*}} addrspace(1) externally_initialized global
 __device__ int d;
 
-// CHECK: @c ={{.*}} addrspace(1) externally_initialized global
+// CHECK: @c ={{.*}} addrspace(1) externally_initialized constant
 __constant__ int c;
 
 // CHECK: @s ={{.*}} addrspace(3) global

@AlexVlx AlexVlx added cuda clang:codegen IR generation bugs: mangling, exceptions, etc. and removed clang Clang issues not falling into any other category backend:AMDGPU clang:codegen IR generation bugs: mangling, exceptions, etc. labels Sep 26, 2024
@AlexVlx AlexVlx requested a review from Artem-B September 26, 2024 22:32
@Artem-B
Copy link
Member

Artem-B commented Sep 26, 2024

__constant__ may not necessarily be const for IR purposes. I.e. IR may not rely on the 'known' values, as seen in IR, as the data may actually be populated by the host via CUDA API calls cudaMemcpyToSymbol before the GPU kernel launch.

@AlexVlx
Copy link
Contributor Author

AlexVlx commented Sep 26, 2024

__constant__ may not necessarily be const for IR purposes. I.e. IR may not rely on the 'known' values, as seen in IR, as the data may actually be populated by the host via CUDA API calls cudaMemcpyToSymbol before the GPU kernel launch.

But since this is marked externally_initialised on the device side, one would assume that those semantics still hold i.e. there's no known value to assume?

@AlexVlx
Copy link
Contributor Author

AlexVlx commented Sep 26, 2024

__constant__ may not necessarily be const for IR purposes. I.e. IR may not rely on the 'known' values, as seen in IR, as the data may actually be populated by the host via CUDA API calls cudaMemcpyToSymbol before the GPU kernel launch.

But since this is marked externally_initialised on the device side, one would assume that those semantics still hold i.e. there's no known value to assume?

Oh, I guess the concern might be around doing a cuda/hipMemcpyToSymbol during a kernel's execution (i.e. __global__ foo() is executing, host sets a constant, notifies foo, and expects foo to observe the updated constant)? I admit that I'm not certain if that is allowed, if it is then yes, this is problematic / not viable.

@Artem-B
Copy link
Member

Artem-B commented Sep 26, 2024

Well, it's certainly used that way in existing CUDA code and it's been around forever:
Here are few random examples:

@Artem-B
Copy link
Member

Artem-B commented Sep 27, 2024

I'm not 100% sure that externally_initialized is sufficient to deal with this use pattern.

IR manual says: https://llvm.org/docs/LangRef.html#global-variables

By default, global initializers are optimized by assuming that global variables defined within the module are not modified from their initial values before the start of the global initializer. This is true even for variables potentially accessible from outside the module, including those with external linkage or appearing in @llvm.used or dllexported variables. This assumption may be suppressed by marking the variable with externally_initialized.

It appears that externally_initialized is meant to suppress the assumption about the values only up to the point of the global initializer start. However, CUDA & NVIDIA GPUs do not have any global initializers, so it's either constant values or uninitialized. Considering that the values in the array may be changed by the host between kernel invocations, LLVM should never use the 'known' values in the global arrays, unless they explicitly made const by the user.

@AlexVlx
Copy link
Contributor Author

AlexVlx commented Sep 27, 2024

Well, it's certainly used that way in existing CUDA code and it's been around forever: Here are few random examples:

Hmm, neither of those illustrates what I was getting at / both would still work just the same even with this change, I think. Because the memcpyToSymbols would always be sequenced before the actual kernel dispatch expected to observe the modified values, and since these remain externally_initialized it should be fine (empirically, this appears to work as expected).

Also, turns out that the case I was considering (kernel executes concurrently with updates to a __constant__ variable it is accessing) is explicitly called out as UB by CUDA under 10.2.2: The behavior of modifying a constant from the host while there is a concurrent grid that access that constant at any point of this grid’s lifetime is undefined., so we're safe from that at least.

@Artem-B
Copy link
Member

Artem-B commented Sep 27, 2024

It has nothing to do with writing to those arrays while the kernel is running. That would indeed be UB.

both would still work just the same even with this change,

No, they will not. Here's the demonstration of the behavior change that const brings to the table:
https://cuda.godbolt.org/z/h6e5Wb4PT

It shows that adding const allows compiler to use the initializer values as written (it does not even bother instantiating cxxx), while without const it reads the array values, assuming that they may be changed externally.

@arsenm
Copy link
Contributor

arsenm commented Sep 27, 2024

If it's not legal for it to be marked as constant, it's also not legal to use constant address space

@yxsamliu
Copy link
Collaborator

yxsamliu commented Sep 27, 2024

It has nothing to do with writing to those arrays while the kernel is running. That would indeed be UB.

both would still work just the same even with this change,

No, they will not. Here's the demonstration of the behavior change that const brings to the table: https://cuda.godbolt.org/z/h6e5Wb4PT

It shows that adding const allows compiler to use the initializer values as written (it does not even bother instantiating cxxx), while without const it reads the array values, assuming that they may be changed externally.

EarlyCSE pass propagates the constant value https://cuda.godbolt.org/z/Ta8rKjMTo

However,

@xxx = dso_local addrspace(4) externally_initialized global [3 x i32] [i32 1, i32 2, i32 3], align 4
@_ZL4cxxx = internal addrspace(4) constant [3 x i32] [i32 1, i32 2, i32 3], align 4

In this case, _ZL4cxxx does not have externally_initialized . If this patch does not remove externally_initialized, probably this constant propagation won't happen.

Maybe Alex can add a lit test to prove that.

@AlexVlx
Copy link
Contributor Author

AlexVlx commented Sep 27, 2024

@yxsamliu beat me to it. The problem here is that the example illustrates language level const on a variable with internal linkage (which is mandatory for __constant__ variables in non-RDC compilation AFAICT from 17.5.24.1) and we don't set externally_initialized on that, which matches C/C++ expected semantics, I believe (there's an oblique mention as to the reasoning, which is that the loader should be allowed to place these in .rodata if they materialise at all). Regarding the lit test, I can definitely add another one, but I will note we already have Transforms/GlobalOpt/externally-initialized.ll, which probably can be extended?

Please note that this is something that wouldn't work (as apparently expected) today either. The user can write the code @Artem-B wrote (put const on a __constant__, add an initialiser because otherwise things would be broken), and they'd get exactly the potentially unexpected behaviour. The very fact that the godbolt example exists points that out (the example also helpfully points out that NVCC has the same behaviour). Unfortunately the CUDA spec does not appear to talk about this (a const qualified __constant__ variable) directly, but working back from 17.5.23.2 we can conclude that they're fine as long as the host compiler is not VC++.

@AlexVlx AlexVlx changed the title [cuda][[HIP] __constant__ should imply constant [cuda][HIP] __constant__ should imply constant Sep 27, 2024
@Artem-B
Copy link
Member

Artem-B commented Sep 27, 2024

In this case, _ZL4cxxx does not have externally_initialized . If this patch does not remove externally_initialized, probably this constant propagation won't happen.

Indeed, unoptimized code shows that cxxx has no externally_initialized, only constant.

If we keep externally_initialized, it appears to preserve the array and the loads from it, so it should be fine: https://godbolt.org/z/YeWE3z79r

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU labels Sep 28, 2024
@AlexVlx
Copy link
Contributor Author

AlexVlx commented Sep 28, 2024

I've extende the test under Transforms/GlobalOpt/externally-initialized.ll to also cover constants / ensure they don't get CSEd.

@AlexVlx AlexVlx merged commit e203a67 into llvm:main Sep 29, 2024
6 of 8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 29, 2024

LLVM Buildbot has detected a new failure on builder openmp-s390x-linux running on systemz-1 while building clang,llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/2915

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libomp :: tasking/issue-94260-2.c' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 1
/home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp   -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic && /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 29, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-android running on sanitizer-buildbot-android while building clang,llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/2761

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
PASS: AddressSanitizer-aarch64-android :: TestCases/initialization-nobug.cpp (226 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/Posix/asan-sigbus.cpp (227 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/Linux/clone_test.cpp (228 of 1709)
PASS: UBSan-Standalone-aarch64 :: TestCases/Float/cast-overflow.cpp (229 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/global-overflow.cpp (230 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/stack-buffer-overflow.cpp (231 of 1709)
PASS: UBSan-Standalone-aarch64 :: TestCases/Misc/nonnull-arg.cpp (232 of 1709)
PASS: HWAddressSanitizer-aarch64 :: TestCases/mem-intrinsics.c (233 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/Posix/halt_on_error-torture.cpp (234 of 1709)
XFAIL: AddressSanitizer-aarch64-android :: TestCases/Posix/asan-symbolize-sanity-test.cpp (235 of 1709)
FAIL: HWAddressSanitizer-aarch64 :: TestCases/hwasan_symbolize_stack_overflow.cpp (236 of 1709)
******************** TEST 'HWAddressSanitizer-aarch64 :: TestCases/hwasan_symbolize_stack_overflow.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: rm -rf /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp; mkdir /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
+ rm -rf /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
+ mkdir /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
RUN: at line 2: /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/sanitizer_common/android_commands/android_compile.py  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/bin/clang   --target=aarch64-linux-android24 --sysroot=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64/sysroot --gcc-toolchain=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64  -B/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64 -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta  -fuse-ld=lld  -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -Wl,--build-id -g /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp -o /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow
+ /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/sanitizer_common/android_commands/android_compile.py /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/bin/clang --target=aarch64-linux-android24 --sysroot=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64/sysroot --gcc-toolchain=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64 -B/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64 -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -fuse-ld=lld -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -Wl,--build-id -g /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp -o /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow
RUN: at line 3: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 16 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER0
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 16
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER0
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so
RUN: at line 4: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 17 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 17
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so
RUN: at line 5: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so
RUN: at line 6: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -17 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE17
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -17
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE17
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so
Step 31 (run lit tests [aarch64/bluejay-userdebug/TQ3A.230805.001]) failure: run lit tests [aarch64/bluejay-userdebug/TQ3A.230805.001] (failure)
...
PASS: AddressSanitizer-aarch64-android :: TestCases/initialization-nobug.cpp (226 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/Posix/asan-sigbus.cpp (227 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/Linux/clone_test.cpp (228 of 1709)
PASS: UBSan-Standalone-aarch64 :: TestCases/Float/cast-overflow.cpp (229 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/global-overflow.cpp (230 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/stack-buffer-overflow.cpp (231 of 1709)
PASS: UBSan-Standalone-aarch64 :: TestCases/Misc/nonnull-arg.cpp (232 of 1709)
PASS: HWAddressSanitizer-aarch64 :: TestCases/mem-intrinsics.c (233 of 1709)
PASS: AddressSanitizer-aarch64-android :: TestCases/Posix/halt_on_error-torture.cpp (234 of 1709)
XFAIL: AddressSanitizer-aarch64-android :: TestCases/Posix/asan-symbolize-sanity-test.cpp (235 of 1709)
FAIL: HWAddressSanitizer-aarch64 :: TestCases/hwasan_symbolize_stack_overflow.cpp (236 of 1709)
******************** TEST 'HWAddressSanitizer-aarch64 :: TestCases/hwasan_symbolize_stack_overflow.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: rm -rf /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp; mkdir /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
+ rm -rf /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
+ mkdir /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
RUN: at line 2: /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/sanitizer_common/android_commands/android_compile.py  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/bin/clang   --target=aarch64-linux-android24 --sysroot=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64/sysroot --gcc-toolchain=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64  -B/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64 -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta  -fuse-ld=lld  -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -Wl,--build-id -g /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp -o /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow
+ /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/sanitizer_common/android_commands/android_compile.py /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/bin/clang --target=aarch64-linux-android24 --sysroot=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64/sysroot --gcc-toolchain=/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64 -B/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/android_ndk/toolchains/llvm/prebuilt/linux-x86_64 -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -fuse-ld=lld -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -Wl,--build-id -g /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp -o /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow
RUN: at line 3: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 16 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER0
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 16
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER0
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so
RUN: at line 4: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 17 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 17
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so
RUN: at line 5: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so
RUN: at line 6: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not  /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -17 2>&1 | hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE17
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:abort_on_error=0:symbolize=0 not /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -17
+ hwasan_symbolize --symbols /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_aarch64/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE17
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/./bin/hwasan_symbolize:319: SyntaxWarning: invalid escape sequence '\['
  m = re.match(r'.*?(0x[0-9a-f]+):' + '([ ]*[\[ ][0-9a-f][0-9a-f]\]?)' * 16, line)
Could not find symbols for apex/com.android.runtime/lib64/bionic/libc.so

@AlexVlx AlexVlx deleted the constant_means_constant branch September 29, 2024 11:21
searlmc1 added a commit to ROCm/llvm-project that referenced this pull request Dec 5, 2024
Adds the following patches
AMDGPU: Remove wavefrontsize64 feature from dummy target llvm#117410
[LLVM][NFC] Use used's element type if available llvm#116804
[llvm][AMDGPU] Fold llvm.amdgcn.wavefrontsize early llvm#114481
[clang][Driver][HIP] Add support for mixing AMDGCNSPIRV & concrete offload-archs. llvm#113509
[clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V llvm#110695
[llvm][opt][Transforms] Replacement calloc should match replaced malloc llvm#110524
[clang][HIP] Don't use the OpenCLKernel CC when targeting AMDGCNSPIRV llvm#110447
[cuda][HIP] constant should imply constant llvm#110182
[llvm][SPIRV] Expose fast popcnt support for SPIR-V targets llvm#109845
[clang][CodeGen][SPIR-V] Fix incorrect SYCL usage, implement missing interface llvm#109415
[SPIRV][RFC] Rework / extend support for memory scopes llvm#106429
[clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. llvm#102776

Change-Id: I2b9ab54aba1c9345b9b0eb84409e6ed6c3cdb6cd
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category cuda llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants