AMDGPU: Mark workitem ID intrinsics with range attribute #136196

Merged

Conversation

arsenm
Contributor

@arsenm arsenm commented Apr 17, 2025

This avoids the need to have special handling at every use site.
Unfortunately this means we unnecessarily emit AssertZext in the DAG
(where we already directly understand the range of the intrinsic), and
we regress in undefined cases as we don't fold out asserts on undef.
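
To make the range claim concrete, here is a minimal, self-contained C++ sketch (plain C++, not LLVM code) of what a half-open return range of [0, 1024) implies. Every value in the range fits in 10 bits, so the upper 22 bits of the i32 result are zero, which is the kind of fact an AssertZext node records in the DAG. The names `HalfOpenRange`, `activeBits`, `knownZeroHighBits`, and `maskIsRedundant` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a half-open unsigned range [Lo, Hi).
// This is NOT llvm::ConstantRange; it only illustrates the same idea.
struct HalfOpenRange {
  uint32_t Lo;
  uint32_t Hi; // exclusive upper bound, assumed to satisfy Lo < Hi
};

// Number of bits needed to represent V (0 when V == 0).
constexpr unsigned activeBits(uint32_t V) {
  return V == 0 ? 0u : 1u + activeBits(V >> 1);
}

// Count of leading bits that are zero for every 32-bit value in R.
constexpr unsigned knownZeroHighBits(HalfOpenRange R) {
  return 32u - activeBits(R.Hi - 1);
}

// The range this PR attaches to the workitem ID intrinsics.
constexpr HalfOpenRange WorkitemIdRange{0, 1024};

static_assert(activeBits(1023) == 10, "largest workitem ID fits in 10 bits");
static_assert(knownZeroHighBits(WorkitemIdRange) == 22,
              "upper 22 bits are known zero");

// Returns true when masking Id with 0x3ff leaves it unchanged, which holds
// for every Id inside [0, 1024).
constexpr bool maskIsRedundant(uint32_t Id) {
  return (Id & 0x3ffu) == Id;
}
```

For any ID inside the range, `maskIsRedundant` returns true, which is why an explicit `and` with `0x3ff` carries no extra information once the range is attached to the intrinsic itself.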

Contributor Author

arsenm commented Apr 17, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@arsenm arsenm requested review from nikic and shiltian April 17, 2025 20:57
@arsenm arsenm marked this pull request as ready for review April 17, 2025 20:57
@llvmbot
Member

llvmbot commented Apr 17, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

This avoids the need to have special handling at every use site.
Unfortunately this means we unnecessarily emit AssertZext in the DAG
(where we already directly understand the range of the intrinsic), and
we regress in undefined cases as we don't fold out asserts on undef.


Patch is 69.40 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136196.diff

19 Files Affected:

  • (modified) clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp (-25)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn.cl (+7-3)
  • (modified) clang/test/CodeGenOpenCL/builtins-r600.cl (+6-3)
  • (modified) clang/test/Headers/gpuintrin.c (+3-3)
  • (modified) clang/test/Headers/gpuintrin_lang.c (+6-6)
  • (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+31-16)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll (+14-19)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll (+63-52)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch.ll (+27-27)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.gfx11.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.gfx12.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.iterative.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.workitem.id-unsupported-calling-convention.ll (+16-5)
  • (modified) llvm/test/CodeGen/AMDGPU/memory_clause.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/v_add_u64_pseudo_sdwa.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-for-private.ll (+3-1)
diff --git a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
index 35c9f8ae48c80..ad012d98635ff 100644
--- a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
@@ -171,16 +171,6 @@ static Value *emitFPIntBuiltin(CodeGenFunction &CGF,
   return CGF.Builder.CreateCall(F, {Src0, Src1});
 }
 
-static Value *emitRangedBuiltin(CodeGenFunction &CGF, unsigned IntrinsicID,
-                                int low, int high) {
-  Function *F = CGF.CGM.getIntrinsic(IntrinsicID, {});
-  llvm::CallInst *Call = CGF.Builder.CreateCall(F);
-  llvm::ConstantRange CR(APInt(32, low), APInt(32, high));
-  Call->addRangeRetAttr(CR);
-  Call->addRetAttr(llvm::Attribute::AttrKind::NoUndef);
-  return Call;
-}
-
 // For processing memory ordering and memory scope arguments of various
 // amdgcn builtins.
 // \p Order takes a C++11 comptabile memory-ordering specifier and converts
@@ -934,15 +924,6 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
     Function *F = CGM.getIntrinsic(BuiltinWMMAOp, ArgTypes);
     return Builder.CreateCall(F, Args);
   }
-
-  // amdgcn workitem
-  case AMDGPU::BI__builtin_amdgcn_workitem_id_x:
-    return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_x, 0, 1024);
-  case AMDGPU::BI__builtin_amdgcn_workitem_id_y:
-    return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_y, 0, 1024);
-  case AMDGPU::BI__builtin_amdgcn_workitem_id_z:
-    return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_z, 0, 1024);
-
   // amdgcn workgroup size
   case AMDGPU::BI__builtin_amdgcn_workgroup_size_x:
     return EmitAMDGPUWorkGroupSize(*this, 0);
@@ -964,12 +945,6 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   case AMDGPU::BI__builtin_r600_recipsqrt_ieeef:
     return emitBuiltinWithOneOverloadedType<1>(*this, E,
                                                Intrinsic::r600_recipsqrt_ieee);
-  case AMDGPU::BI__builtin_r600_read_tidig_x:
-    return emitRangedBuiltin(*this, Intrinsic::r600_read_tidig_x, 0, 1024);
-  case AMDGPU::BI__builtin_r600_read_tidig_y:
-    return emitRangedBuiltin(*this, Intrinsic::r600_read_tidig_y, 0, 1024);
-  case AMDGPU::BI__builtin_r600_read_tidig_z:
-    return emitRangedBuiltin(*this, Intrinsic::r600_read_tidig_z, 0, 1024);
   case AMDGPU::BI__builtin_amdgcn_alignbit: {
     llvm::Value *Src0 = EmitScalarExpr(E->getArg(0));
     llvm::Value *Src1 = EmitScalarExpr(E->getArg(1));
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
index ded5f6b5ac4fd..bf022bc6eb446 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
@@ -605,9 +605,9 @@ void test_s_getreg(volatile global uint *out)
 }
 
 // CHECK-LABEL: @test_get_local_id(
-// CHECK: tail call noundef range(i32 0, 1024){{.*}} i32 @llvm.amdgcn.workitem.id.x()
-// CHECK: tail call noundef range(i32 0, 1024){{.*}} i32 @llvm.amdgcn.workitem.id.y()
-// CHECK: tail call noundef range(i32 0, 1024){{.*}} i32 @llvm.amdgcn.workitem.id.z()
+// CHECK: tail call{{.*}} i32 @llvm.amdgcn.workitem.id.x()
+// CHECK: tail call{{.*}} i32 @llvm.amdgcn.workitem.id.y()
+// CHECK: tail call{{.*}} i32 @llvm.amdgcn.workitem.id.z()
 void test_get_local_id(int d, global int *out)
 {
 	switch (d) {
@@ -618,6 +618,10 @@ void test_get_local_id(int d, global int *out)
 	}
 }
 
+// CHECK: declare noundef range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.x()
+// CHECK: declare noundef range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.y()
+// CHECK: declare noundef range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.z()
+
 // CHECK-LABEL: @test_get_workgroup_size(
 // CHECK: {{.*}}call align 8 dereferenceable(256){{.*}} ptr addrspace(4) @llvm.amdgcn.implicitarg.ptr()
 // CHECK: getelementptr inbounds nuw i8, ptr addrspace(4) %{{.*}}, i64 12
diff --git a/clang/test/CodeGenOpenCL/builtins-r600.cl b/clang/test/CodeGenOpenCL/builtins-r600.cl
index a82c4fb90ec50..5fe130f585688 100644
--- a/clang/test/CodeGenOpenCL/builtins-r600.cl
+++ b/clang/test/CodeGenOpenCL/builtins-r600.cl
@@ -39,9 +39,9 @@ void test_get_group_id(int d, global int *out)
 }
 
 // CHECK-LABEL: @test_get_local_id(
-// CHECK: tail call noundef range(i32 0, 1024) i32 @llvm.r600.read.tidig.x()
-// CHECK: tail call noundef range(i32 0, 1024) i32 @llvm.r600.read.tidig.y()
-// CHECK: tail call noundef range(i32 0, 1024) i32 @llvm.r600.read.tidig.z()
+// CHECK: tail call i32 @llvm.r600.read.tidig.x()
+// CHECK: tail call i32 @llvm.r600.read.tidig.y()
+// CHECK: tail call i32 @llvm.r600.read.tidig.z()
 void test_get_local_id(int d, global int *out)
 {
 	switch (d) {
@@ -52,3 +52,6 @@ void test_get_local_id(int d, global int *out)
 	}
 }
 
+// CHECK: declare noundef range(i32 0, 1024) i32 @llvm.r600.read.tidig.x()
+// CHECK: declare noundef range(i32 0, 1024) i32 @llvm.r600.read.tidig.y()
+// CHECK: declare noundef range(i32 0, 1024) i32 @llvm.r600.read.tidig.z()
diff --git a/clang/test/Headers/gpuintrin.c b/clang/test/Headers/gpuintrin.c
index 9a15ce277ba87..f7dfb86ac4652 100644
--- a/clang/test/Headers/gpuintrin.c
+++ b/clang/test/Headers/gpuintrin.c
@@ -291,7 +291,7 @@ __gpu_kernel void foo() {
 // AMDGPU-NEXT:  [[ENTRY:.*:]]
 // AMDGPU-NEXT:    [[RETVAL:%.*]] = alloca i32, align 4, addrspace(5)
 // AMDGPU-NEXT:    [[RETVAL_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL]] to ptr
-// AMDGPU-NEXT:    [[TMP0:%.*]] = call noundef range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.x()
+// AMDGPU-NEXT:    [[TMP0:%.*]] = call i32 @llvm.amdgcn.workitem.id.x()
 // AMDGPU-NEXT:    ret i32 [[TMP0]]
 //
 //
@@ -300,7 +300,7 @@ __gpu_kernel void foo() {
 // AMDGPU-NEXT:  [[ENTRY:.*:]]
 // AMDGPU-NEXT:    [[RETVAL:%.*]] = alloca i32, align 4, addrspace(5)
 // AMDGPU-NEXT:    [[RETVAL_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL]] to ptr
-// AMDGPU-NEXT:    [[TMP0:%.*]] = call noundef range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.y()
+// AMDGPU-NEXT:    [[TMP0:%.*]] = call i32 @llvm.amdgcn.workitem.id.y()
 // AMDGPU-NEXT:    ret i32 [[TMP0]]
 //
 //
@@ -309,7 +309,7 @@ __gpu_kernel void foo() {
 // AMDGPU-NEXT:  [[ENTRY:.*:]]
 // AMDGPU-NEXT:    [[RETVAL:%.*]] = alloca i32, align 4, addrspace(5)
 // AMDGPU-NEXT:    [[RETVAL_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL]] to ptr
-// AMDGPU-NEXT:    [[TMP0:%.*]] = call noundef range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.z()
+// AMDGPU-NEXT:    [[TMP0:%.*]] = call i32 @llvm.amdgcn.workitem.id.z()
 // AMDGPU-NEXT:    ret i32 [[TMP0]]
 //
 //
diff --git a/clang/test/Headers/gpuintrin_lang.c b/clang/test/Headers/gpuintrin_lang.c
index fa04849f8094d..ab660ac5c8a49 100644
--- a/clang/test/Headers/gpuintrin_lang.c
+++ b/clang/test/Headers/gpuintrin_lang.c
@@ -36,7 +36,7 @@ __device__ int foo() { return __gpu_thread_id_x(); }
 // CUDA-LABEL: define dso_local i32 @foo(
 // CUDA-SAME: ) #[[ATTR0:[0-9]+]] {
 // CUDA-NEXT:  [[ENTRY:.*:]]
-// CUDA-NEXT:    [[TMP0:%.*]] = call {{.*}}i32 @llvm.nvvm.read.ptx.sreg.tid.x()
+// CUDA-NEXT:    [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 // CUDA-NEXT:    ret i32 [[TMP0]]
 //
 // HIP-LABEL: define dso_local i32 @foo(
@@ -46,29 +46,29 @@ __device__ int foo() { return __gpu_thread_id_x(); }
 // HIP-NEXT:    [[RETVAL:%.*]] = alloca i32, align 4, addrspace(5)
 // HIP-NEXT:    [[RETVAL_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL]] to ptr
 // HIP-NEXT:    [[RETVAL_ASCAST_I:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL_I]] to ptr
-// HIP-NEXT:    [[TMP0:%.*]] = call noundef {{.*}}i32 @llvm.amdgcn.workitem.id.x()
+// HIP-NEXT:    [[TMP0:%.*]] = call i32 @llvm.amdgcn.workitem.id.x()
 // HIP-NEXT:    ret i32 [[TMP0]]
 //
 // OPENCL-LABEL: define dso_local i32 @foo(
 // OPENCL-SAME: ) #[[ATTR0:[0-9]+]] {
 // OPENCL-NEXT:  [[ENTRY:.*:]]
-// OPENCL-NEXT:    [[TMP0:%.*]] = call noundef {{.*}}i32 @llvm.amdgcn.workitem.id.x()
+// OPENCL-NEXT:    [[TMP0:%.*]] = call i32 @llvm.amdgcn.workitem.id.x()
 // OPENCL-NEXT:    ret i32 [[TMP0]]
 //
 // OPENMP-LABEL: define hidden i32 @foo(
 // OPENMP-SAME: ) #[[ATTR0:[0-9]+]] {
 // OPENMP-NEXT:  [[ENTRY:.*:]]
-// OPENMP-NEXT:    [[TMP0:%.*]] = call noundef {{.*}}i32 @llvm.amdgcn.workitem.id.x()
+// OPENMP-NEXT:    [[TMP0:%.*]] = call i32 @llvm.amdgcn.workitem.id.x()
 // OPENMP-NEXT:    ret i32 [[TMP0]]
 //
 // C89-LABEL: define dso_local i32 @foo(
-// C89-SAME: ) #[[ATTR2:[0-9]+]] {
+// C89-SAME: ) #[[ATTR0:[0-9]+]] {
 // C89-NEXT:  [[ENTRY:.*:]]
 // C89-NEXT:    [[RETVAL_I:%.*]] = alloca i32, align 4, addrspace(5)
 // C89-NEXT:    [[RETVAL:%.*]] = alloca i32, align 4, addrspace(5)
 // C89-NEXT:    [[RETVAL_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL]] to ptr
 // C89-NEXT:    [[RETVAL_ASCAST_I:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL_I]] to ptr
-// C89-NEXT:    [[TMP0:%.*]] = call noundef {{.*}}i32 @llvm.amdgcn.workitem.id.x()
+// C89-NEXT:    [[TMP0:%.*]] = call i32 @llvm.amdgcn.workitem.id.x()
 // C89-NEXT:    ret i32 [[TMP0]]
 //
 int foo() { return __gpu_thread_id_x(); }
diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index 217e43fcce4fd..75068717d9a5f 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -20,11 +20,16 @@ def local_ptr_ty : LLVMQualPointerType<3>;
 // some preloaded register from a function that is known to not need it is a violation
 // of the calling convention and also program-level UB. Outside of such IR-level UB,
 // these preloaded registers are always set to a well-defined value and are thus `noundef`.
-class AMDGPUReadPreloadRegisterIntrinsic
-  : DefaultAttrsIntrinsic<[llvm_i32_ty], [], [NoUndef<RetIndex>, IntrNoMem, IntrSpeculatable]>;
+class AMDGPUReadPreloadRegisterIntrinsic<
+    list<IntrinsicProperty> ExtraAttrs = []>
+    : DefaultAttrsIntrinsic<[llvm_i32_ty], [],
+                            !listconcat([NoUndef<RetIndex>, IntrNoMem,
+                                         IntrSpeculatable],
+                                        ExtraAttrs)>;
 
-class AMDGPUReadPreloadRegisterIntrinsicNamed<string name>
-  : DefaultAttrsIntrinsic<[llvm_i32_ty], [], [NoUndef<RetIndex>, IntrNoMem, IntrSpeculatable]>, ClangBuiltin<name>;
+class AMDGPUReadPreloadRegisterIntrinsicNamed<
+    string name, list<IntrinsicProperty> ExtraAttrs = []>
+    : AMDGPUReadPreloadRegisterIntrinsic<ExtraAttrs>, ClangBuiltin<name>;
 
 // Used to tag image and resource intrinsics with information used to generate
 // mem operands.
@@ -35,17 +40,22 @@ class AMDGPURsrcIntrinsic<int rsrcarg, bit isimage = false> {
 
 let TargetPrefix = "r600" in {
 
-multiclass AMDGPUReadPreloadRegisterIntrinsic_xyz {
-  def _x : AMDGPUReadPreloadRegisterIntrinsic;
-  def _y : AMDGPUReadPreloadRegisterIntrinsic;
-  def _z : AMDGPUReadPreloadRegisterIntrinsic;
-}
+  multiclass AMDGPUReadPreloadRegisterIntrinsic_xyz<
+      list<IntrinsicProperty> ExtraAttrs = []> {
+    def _x : AMDGPUReadPreloadRegisterIntrinsic<ExtraAttrs>;
+    def _y : AMDGPUReadPreloadRegisterIntrinsic<ExtraAttrs>;
+    def _z : AMDGPUReadPreloadRegisterIntrinsic<ExtraAttrs>;
+  }
 
-multiclass AMDGPUReadPreloadRegisterIntrinsic_xyz_named<string prefix> {
-  def _x : AMDGPUReadPreloadRegisterIntrinsicNamed<!strconcat(prefix, "_x")>;
-  def _y : AMDGPUReadPreloadRegisterIntrinsicNamed<!strconcat(prefix, "_y")>;
-  def _z : AMDGPUReadPreloadRegisterIntrinsicNamed<!strconcat(prefix, "_z")>;
-}
+  multiclass AMDGPUReadPreloadRegisterIntrinsic_xyz_named<
+      string prefix, list<IntrinsicProperty> ExtraAttrs = []> {
+    def _x : AMDGPUReadPreloadRegisterIntrinsicNamed<!strconcat(prefix, "_x"),
+                                                     ExtraAttrs>;
+    def _y : AMDGPUReadPreloadRegisterIntrinsicNamed<!strconcat(prefix, "_y"),
+                                                     ExtraAttrs>;
+    def _z : AMDGPUReadPreloadRegisterIntrinsicNamed<!strconcat(prefix, "_z"),
+                                                     ExtraAttrs>;
+  }
 
 defm int_r600_read_global_size : AMDGPUReadPreloadRegisterIntrinsic_xyz_named
                                  <"__builtin_r600_read_global_size">;
@@ -55,7 +65,9 @@ defm int_r600_read_tgid : AMDGPUReadPreloadRegisterIntrinsic_xyz_named
                           <"__builtin_r600_read_tgid">;
 
 defm int_r600_read_local_size : AMDGPUReadPreloadRegisterIntrinsic_xyz;
-defm int_r600_read_tidig : AMDGPUReadPreloadRegisterIntrinsic_xyz;
+defm int_r600_read_tidig
+    : AMDGPUReadPreloadRegisterIntrinsic_xyz_named<
+          "__builtin_r600_read_tidig", [Range<RetIndex, 0, 1024>]>;
 
 def int_r600_group_barrier : ClangBuiltin<"__builtin_r600_group_barrier">,
   Intrinsic<[], [], [IntrConvergent, IntrWillReturn]>;
@@ -146,7 +158,10 @@ let TargetPrefix = "amdgcn" in {
 // ABI Special Intrinsics
 //===----------------------------------------------------------------------===//
 
-defm int_amdgcn_workitem_id : AMDGPUReadPreloadRegisterIntrinsic_xyz;
+defm int_amdgcn_workitem_id
+    : AMDGPUReadPreloadRegisterIntrinsic_xyz_named<
+          "__builtin_amdgcn_workitem_id", [Range<RetIndex, 0, 1024>]>;
+
 defm int_amdgcn_workgroup_id : AMDGPUReadPreloadRegisterIntrinsic_xyz_named
                                <"__builtin_amdgcn_workgroup_id">;
 
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll
index 2ac5c78d8cdb5..b563e03b6080f 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll
@@ -61,7 +61,7 @@ entry:
 
 ; GCN: define amdgpu_kernel void @test_inliner(
 ; GCN-INL1:     %c1 = tail call coldcc float @foo(
-; GCN-INLDEF:   %cmp.i = fcmp ogt float %tmp2, 0.000000e+00
+; GCN-INLDEF:   %cmp.i = fcmp ogt float %{{.+}}, 0.000000e+00
 ; GCN-MAXBBDEF: %div.i{{[0-9]*}} = fdiv float 1.000000e+00, %c
 ; GCN-MAXBBDEF: %div.i{{[0-9]*}} = fdiv float 2.000000e+00, %tmp1.i
 ; GCN-MAXBB1:   call coldcc void @foo_private_ptr
diff --git a/llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll b/llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll
index 7819da8b97e55..9cf9d81773037 100644
--- a/llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll
+++ b/llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll
@@ -258,46 +258,41 @@ define amdgpu_kernel void @add_x_shl_max_offset() #1 {
 define amdgpu_kernel void @add_x_shl_neg_to_sub_max_offset_alt() #1 {
 ; CI-LABEL: add_x_shl_neg_to_sub_max_offset_alt:
 ; CI:       ; %bb.0:
-; CI-NEXT:    v_lshlrev_b32_e32 v0, 2, v0
-; CI-NEXT:    v_xor_b32_e32 v0, 0xffff, v0
+; CI-NEXT:    v_mul_i32_i24_e32 v0, -4, v0
 ; CI-NEXT:    v_mov_b32_e32 v1, 13
 ; CI-NEXT:    s_mov_b32 m0, -1
-; CI-NEXT:    ds_write_b8 v0, v1
+; CI-NEXT:    ds_write_b8 v0, v1 offset:65535
 ; CI-NEXT:    s_endpgm
 ;
 ; GFX9-LABEL: add_x_shl_neg_to_sub_max_offset_alt:
 ; GFX9:       ; %bb.0:
-; GFX9-NEXT:    v_lshlrev_b32_e32 v0, 2, v0
-; GFX9-NEXT:    v_xor_b32_e32 v0, 0xffff, v0
+; GFX9-NEXT:    v_mul_i32_i24_e32 v0, -4, v0
 ; GFX9-NEXT:    v_mov_b32_e32 v1, 13
-; GFX9-NEXT:    ds_write_b8 v0, v1
+; GFX9-NEXT:    ds_write_b8 v0, v1 offset:65535
 ; GFX9-NEXT:    s_endpgm
 ;
 ; GFX10-LABEL: add_x_shl_neg_to_sub_max_offset_alt:
 ; GFX10:       ; %bb.0:
-; GFX10-NEXT:    v_lshlrev_b32_e32 v0, 2, v0
+; GFX10-NEXT:    v_mul_i32_i24_e32 v0, -4, v0
 ; GFX10-NEXT:    v_mov_b32_e32 v1, 13
-; GFX10-NEXT:    v_xor_b32_e32 v0, 0xffff, v0
-; GFX10-NEXT:    ds_write_b8 v0, v1
+; GFX10-NEXT:    ds_write_b8 v0, v1 offset:65535
 ; GFX10-NEXT:    s_endpgm
 ;
 ; GFX11-TRUE16-LABEL: add_x_shl_neg_to_sub_max_offset_alt:
 ; GFX11-TRUE16:       ; %bb.0:
 ; GFX11-TRUE16-NEXT:    v_and_b32_e32 v0, 0x3ff, v0
-; GFX11-TRUE16-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
-; GFX11-TRUE16-NEXT:    v_lshlrev_b32_e32 v0, 2, v0
-; GFX11-TRUE16-NEXT:    v_xor_b32_e32 v1, 0xffff, v0
+; GFX11-TRUE16-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-TRUE16-NEXT:    v_mul_i32_i24_e32 v1, -4, v0
 ; GFX11-TRUE16-NEXT:    v_mov_b16_e32 v0.l, 13
-; GFX11-TRUE16-NEXT:    ds_store_b8 v1, v0
+; GFX11-TRUE16-NEXT:    ds_store_b8 v1, v0 offset:65535
 ; GFX11-TRUE16-NEXT:    s_endpgm
 ;
 ; GFX11-FAKE16-LABEL: add_x_shl_neg_to_sub_max_offset_alt:
 ; GFX11-FAKE16:       ; %bb.0:
 ; GFX11-FAKE16-NEXT:    v_dual_mov_b32 v1, 13 :: v_dual_and_b32 v0, 0x3ff, v0
-; GFX11-FAKE16-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
-; GFX11-FAKE16-NEXT:    v_lshlrev_b32_e32 v0, 2, v0
-; GFX11-FAKE16-NEXT:    v_xor_b32_e32 v0, 0xffff, v0
-; GFX11-FAKE16-NEXT:    ds_store_b8 v0, v1
+; GFX11-FAKE16-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-FAKE16-NEXT:    v_mul_i32_i24_e32 v0, -4, v0
+; GFX11-FAKE16-NEXT:    ds_store_b8 v0, v1 offset:65535
 ; GFX11-FAKE16-NEXT:    s_endpgm
   %x.i = tail call i32 @llvm.amdgcn.workitem.id.x()
   %.neg = mul i32 %x.i, -4
@@ -447,9 +442,9 @@ define amdgpu_kernel void @add_x_shl_neg_to_sub_multi_use() #1 {
 ;
 ; GFX11-LABEL: add_x_shl_neg_to_sub_multi_use:
 ; GFX11:       ; %bb.0:
-; GFX11-NEXT:    v_dual_mov_b32 v1, 13 :: v_dual_lshlrev_b32 v0, 2, v0
+; GFX11-NEXT:    v_dual_mov_b32 v1, 13 :: v_dual_and_b32 v0, 0x3ff, v0
 ; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
-; GFX11-NEXT:    v_and_b32_e32 v0, 0xffc, v0
+; GFX11-NEXT:    v_lshlrev_b32_e32 v0, 2, v0
 ; GFX11-NEXT:    v_sub_nc_u32_e32 v0, 0, v0
 ; GFX11-NEXT:    ds_store_b32 v0, v1 offset:123
 ; GFX11-NEXT:    ds_store_b32 v0, v1 offset:456
diff --git a/llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll b/llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll
index 96d0e383761d1..690e5cc68747f 100644
--- a/llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll
+++ b/llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll
@@ -142,7 +142,7 @@ define amdgpu_kernel void @soff1_voff2(i32 %soff) {
 ; GFX942-SDAG-NEXT:    v_mov_b32_e32 v1, 1
 ; GFX942-SDAG-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX942-SDAG-NEXT:    v_mov_b32_e32 v2, s0
-; GFX942-SDAG-NEXT:    v_lshl_add_u32 v0, v0, 1, v2
+; GFX942-SDAG-NEXT:    v_mad_u32_u24 v0, v0, 2, v2
 ; GFX942-SDAG-NEXT:    v_add_u32_e32 v2, 1, v0
 ; GFX942-SDAG-NEXT:    v_add_u32_e32 v3, 2, v0
 ; GFX942-SDAG-NEXT:    scratch_store_byte v2, v1, off sc0 sc1
@@ -160,7 +160,7 @@ define amdgpu_kernel void @soff1_voff2(i32 %soff) {
 ; GFX942-GISEL:       ; %bb.0: ; %bb
 ; GFX942-GISEL-NEXT:    s_load_dword s0, s[4:5], 0x24
 ; GFX942-GISEL-NEXT:    v_and_b32_e32 v0, 0x3ff, v0
-; GFX942-GISEL-NEXT:    v_lshlrev_b32_e32 v0, 1, v0
+; GFX942-GISEL-NEXT:    v_mul_u32_u24_e32 v0, 2, v0
 ; GFX942-GISEL-NEXT:    v_mov_b32_e32 v1, 1
 ; GFX942-GISEL-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX942-GISEL-NEXT:    v_add_u32_e32 v0, s0, v0
@@ -181,13 +181,14 @@ define amdgpu_kernel void @soff1_voff2(i32 %soff) {
 ; GFX11-SDAG:       ; %bb.0: ; %bb
 ; GFX11-SDAG-NEXT:    s_load_b32 s0, s[4:5], 0x24
 ; GFX11-SDAG-NEXT:    v_dual_mov_b32 v1, 1 :: v_dual_and_b32 v0, 0x3ff, v0
-; GFX11-SDAG-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
-; GFX11-SDAG-NEXT:    v_dual_mov_b32 v3, 4 :: v_dual_lshlrev_b32 v0, 1, v0
+; GFX11-SDAG-NEXT:    v_dual_mov_b32 v2, 2 :: v_dual_mov_b32 v3, 4
+; GFX11-SDAG-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(SKIP_1) | instid1(VALU_DEP_1)
+; GFX11-SDAG-NEXT:    v_mul_u32_u24_e32 v0, 2, v0
 ; GFX11-SDAG-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX11-SDAG-NEXT:    v_add3_u32 v0, 0, s0, v0
 ; GFX11-SDAG-NEXT:    s_delay_alu instid0(VALU_DEP_1)
-; GFX11-SDAG-NEXT:    v_dual_mov_b32 v2, 2 :: v_dual_add_nc_u32 v5, 2, v0
 ; GFX11-SDAG-NEXT:    v_add_nc_u32_e32 v4, 1, v0
+; GFX11-SDAG-NEXT:    v_add_nc_u32_e32 v5, 2, v0
 ; GFX11-SDAG-NEXT:    v_add_nc_u32_e32 v0, 4, v0
 ; GFX11-SDAG-NEXT:    scratch_store_b8 v4, v1, off dlc
 ; GFX11-SDAG-NEXT:    s_waitcnt_vscnt null, 0x0
@@ -201,8 +202,9 @@ define amdgpu_kernel void @soff1_voff2(i32 %soff) {
 ; GFX11-GISEL:       ; %bb.0: ; %bb
 ; GFX11-GISEL-NEXT:    s_load_b32 s0, s[4:5], 0x24
 ; GFX11-GISEL-NEXT:    v_dual_mov_b32 v1, 1 :: v_dual_and_b32 v0, 0x3ff, v0
-; GFX11-GISEL-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
-; GFX11-GISEL-NEXT:    v_dual_mov_b32 v3, 4 :: v_dual_lshlrev_b32 v0, 1, v0
+; GFX11-GISEL-NEXT:    v_mov_b32_e32 v3, 4
+; GFX11-GISEL-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(SKIP_1) | instid1(VALU_DEP_1)
+; GFX11-GISEL-NEXT:    v_mul_u32_u24_e32 v0, 2, v0
 ; GFX11-GISEL-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX11-GISEL-NEXT:    v_add_nc_u32_e32 v0, s0, v0
 ; GFX11-GISEL-NEXT:    s_...
[truncated]

@llvmbot llvmbot added the clang, clang:codegen, llvm:ir, and llvm:transforms labels Apr 17, 2025

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp clang/test/Headers/gpuintrin.c clang/test/Headers/gpuintrin_lang.c llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll llvm/test/CodeGen/AMDGPU/flat-scratch.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.gfx11.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.gfx12.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.iterative.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.workitem.id-unsupported-calling-convention.ll llvm/test/CodeGen/AMDGPU/memory_clause.ll llvm/test/CodeGen/AMDGPU/v_add_u64_pseudo_sdwa.ll llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-for-private.ll

The following files introduce new uses of undef:

  • llvm/test/CodeGen/AMDGPU/llvm.amdgcn.workitem.id-unsupported-calling-convention.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@nikic
Contributor

nikic commented Apr 17, 2025

This avoids the need to have special handling at every use site. Unfortunately this means we unnecessarily emit AssertZext in the DAG (where we already directly understand the range of the intrinsic), and we regress in undefined cases as we don't fold out asserts on undef.

This is not a "real" regression though, right? As in, we'd already get that behavior for the IR clang actually generates (with range attributes on the call-sites): https://llvm.godbolt.org/z/GWcs9837a It's just that the codegen tests weren't actually specifying the attribute that clang produces?

@arsenm
Contributor Author

arsenm commented Apr 18, 2025

This is not a "real" regression though, right? As in, we'd already get that behavior for the IR clang actually generates (with range attributes on the call-sites): https://llvm.godbolt.org/z/GWcs9837a It's just that the codegen tests weren't actually specifying the attribute that clang produces?

Yes, for the clang case. I don't think other frontends have been bothering to set this
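
The exchange above turns on where the range attribute lives. As a rough illustration (a toy model in plain C++, not LLVM's API), the sketch below assumes the effective range of a call is the intersection of an optional range on the declaration and an optional range on the call site. Under that assumption, a frontend that never annotates its call sites still gets [0, 1024) once the declaration carries the attribute. The names `Range`, `intersect`, and `effectiveRange` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical half-open range [Lo, Hi); non-wrapping, Lo < Hi assumed.
struct Range {
  uint32_t Lo;
  uint32_t Hi;
};

// Intersection of two ranges that are assumed to overlap.
inline Range intersect(Range A, Range B) {
  return Range{std::max(A.Lo, B.Lo), std::min(A.Hi, B.Hi)};
}

// Toy model: combine the range on the declaration with the range on the
// call site. If only one side has a range, that range is used as-is.
inline std::optional<Range> effectiveRange(std::optional<Range> Decl,
                                           std::optional<Range> CallSite) {
  if (Decl && CallSite)
    return intersect(*Decl, *CallSite);
  return Decl ? Decl : CallSite;
}
```

In this model, the situation before the PR for a frontend that sets nothing is `effectiveRange(std::nullopt, std::nullopt)`, which yields no range. With the attribute on the declaration, the same unannotated call site yields [0, 1024).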

Contributor

@nikic nikic left a comment

LGTM

Contributor Author

arsenm commented Apr 18, 2025

Merge activity

  • Apr 18, 6:25 AM EDT: A user started a stack merge that includes this pull request via Graphite.
  • Apr 18, 6:27 AM EDT: A user merged this pull request with Graphite.

@arsenm arsenm merged commit 9bdd9dc into main Apr 18, 2025
18 of 19 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/mark-workitem-id-intrinsics-range-attribute branch April 18, 2025 10:27
@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-ubuntu-fast running on sie-linux-worker while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/23079

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/clang -cc1 -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/lib/clang/21/include -nostdsysteminc -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/Inputs/include    -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -  | /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/clang -cc1 -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/lib/clang/21/include -nostdsysteminc -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/Inputs/include -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            1: ; ModuleID = '/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c'
            2: source_filename = "/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/Headers/gpuintrin_lang.c"
            3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
            4: target triple = "nvptx64"
            5:
            6: ; Function Attrs: convergent noinline nounwind optnone
            7: define dso_local i32 @foo() #0 {
label:36'0     ^~~~~~~~~~~~~~~~~~~~~~~~~~
label:36'1     ^~~~~~~~~~~~~~~~~~~~~~~~~~
same:37'0                                ^~~~~~
same:37'1                                   ^    captured var "ATTR0"
            8: entry:
next:38'0      ^~~~~~
next:38'1      ^~~~~~  captured var "ENTRY"
next:39'0            X error: no match found
            9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
next:39'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1       ?                                               possible intended match
           10:  ret i32 %0
next:39'0      ~~~~~~~~~~~~
           11: }
next:39'0      ~~
           12:
next:39'0      ~
           13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-sles-build-only running on rocm-worker-hw-04-sles while building clang,llvm at step 6 "Add check check-clang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/140/builds/21497

Here is the relevant piece of the build log for the reference
Step 6 (Add check check-clang) failure: test (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/clang -cc1 -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/lib/clang/21/include -nostdsysteminc -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/Inputs/include    -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c -o -  | /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/FileCheck /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/FileCheck /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
+ /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/clang -cc1 -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/lib/clang/21/include -nostdsysteminc -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/Inputs/include -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c -o -
/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building clang,llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/15490

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/clang -cc1 -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/Inputs/include    -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c -o -  | /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/clang -cc1 -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/Inputs/include -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c -o -
+ /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder arc-builder running on arc-worker while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/14713

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/buildbot/worker/arc-folder/build/bin/clang -cc1 -internal-isystem /buildbot/worker/arc-folder/build/lib/clang/21/include -nostdsysteminc -internal-isystem /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/Inputs/include    -internal-isystem /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -  | /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /buildbot/worker/arc-folder/build/bin/clang -cc1 -internal-isystem /buildbot/worker/arc-folder/build/lib/clang/21/include -nostdsysteminc -internal-isystem /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/Inputs/include -internal-isystem /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -
+ /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/buildbot/worker/arc-folder/llvm-project/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime-2 running on rocm-worker-hw-02 while building clang,llvm at step 7 "Add check check-clang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/3722

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-clang) failure: test (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/bin/clang -cc1 -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/lib/clang/21/include -nostdsysteminc -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/Inputs/include    -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c -o -  | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/bin/clang -cc1 -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/lib/clang/21/include -nostdsysteminc -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/Inputs/include -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c -o -
+ /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder clang-m68k-linux-cross running on suse-gary-m68k-cross while building clang,llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/27/builds/8828

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/bin/clang -cc1 -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/Inputs/include    -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c -o -  | /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/bin/FileCheck /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/bin/clang -cc1 -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/Inputs/include -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c -o -
+ /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/bin/FileCheck /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder clang-cmake-x86_64-avx512-linux running on avx512-intel64 while building clang,llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/133/builds/14743

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/clang -cc1 -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/Inputs/include    -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c -o -  | /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/FileCheck /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/clang -cc1 -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/Inputs/include -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c -o -
+ /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/FileCheck /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building clang,llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/16257

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteAttach.py (1201 of 2125)
UNSUPPORTED: lldb-api :: tools/lldb-server/TestGdbRemoteFork.py (1202 of 2125)
UNSUPPORTED: lldb-api :: tools/lldb-server/TestGdbRemoteForkNonStop.py (1203 of 2125)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteCompletion.py (1204 of 2125)
UNSUPPORTED: lldb-api :: tools/lldb-server/TestGdbRemoteForkResume.py (1205 of 2125)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteExitCode.py (1206 of 2125)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteHostInfo.py (1207 of 2125)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteModuleInfo.py (1208 of 2125)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteAuxvSupport.py (1209 of 2125)
UNRESOLVED: lldb-api :: tools/lldb-dap/variables/TestDAP_variables.py (1210 of 2125)
******************** TEST 'lldb-api :: tools/lldb-dap/variables/TestDAP_variables.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/variables -p TestDAP_variables.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 9bdd9dc895ade41ec24f1a9918f70b23271ac89b)
  clang revision 9bdd9dc895ade41ec24f1a9918f70b23271ac89b
  llvm revision 9bdd9dc895ade41ec24f1a9918f70b23271ac89b
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_darwin_dwarf_missing_obj (TestDAP_variables.TestDAP_variables) (requires one of macosx, darwin, ios, tvos, watchos, bridgeos, iphonesimulator, watchsimulator, appletvsimulator) 
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_darwin_dwarf_missing_obj_with_symbol_ondemand_enabled (TestDAP_variables.TestDAP_variables) (requires one of macosx, darwin, ios, tvos, watchos, bridgeos, iphonesimulator, watchsimulator, appletvsimulator) 
========= DEBUG ADAPTER PROTOCOL LOGS =========
1744972815.397353411 --> (stdin/stdout) {"command":"initialize","type":"request","arguments":{"adapterID":"lldb-native","clientID":"vscode","columnsStartAt1":true,"linesStartAt1":true,"locale":"en-us","pathFormat":"path","supportsRunInTerminalRequest":true,"supportsVariablePaging":true,"supportsVariableType":true,"supportsStartDebuggingRequest":true,"supportsProgressReporting":true,"$__lldb_sourceInitFile":false},"seq":1}
1744972815.399456978 <-- (stdin/stdout) {"body":{"$__lldb_version":"lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 9bdd9dc895ade41ec24f1a9918f70b23271ac89b)\n  clang revision 9bdd9dc895ade41ec24f1a9918f70b23271ac89b\n  llvm revision 9bdd9dc895ade41ec24f1a9918f70b23271ac89b","completionTriggerCharacters":["."," ","\t"],"exceptionBreakpointFilters":[{"default":false,"filter":"cpp_catch","label":"C++ Catch"},{"default":false,"filter":"cpp_throw","label":"C++ Throw"},{"default":false,"filter":"objc_catch","label":"Objective-C Catch"},{"default":false,"filter":"objc_throw","label":"Objective-C Throw"}],"supportTerminateDebuggee":true,"supportsBreakpointLocationsRequest":true,"supportsCancelRequest":true,"supportsCompletionsRequest":true,"supportsConditionalBreakpoints":true,"supportsConfigurationDoneRequest":true,"supportsDataBreakpoints":true,"supportsDelayedStackTraceLoading":true,"supportsDisassembleRequest":true,"supportsEvaluateForHovers":true,"supportsExceptionInfoRequest":true,"supportsExceptionOptions":true,"supportsFunctionBreakpoints":true,"supportsHitConditionalBreakpoints":true,"supportsInstructionBreakpoints":true,"supportsLogPoints":true,"supportsModulesRequest":true,"supportsReadMemoryRequest":true,"supportsRestartRequest":true,"supportsSetVariable":true,"supportsStepInTargetsRequest":true,"supportsSteppingGranularity":true,"supportsValueFormattingOptions":true},"command":"initialize","request_seq":1,"seq":0,"success":true,"type":"response"}
1744972815.399687529 --> (stdin/stdout) {"command":"launch","type":"request","arguments":{"program":"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/variables/TestDAP_variables.test_indexedVariables/a.out","initCommands":["settings clear -all","settings set symbols.enable-external-lookup false","settings set target.inherit-tcc true","settings set target.disable-aslr false","settings set target.detach-on-error false","settings set target.auto-apply-fixits false","settings set plugin.process.gdb-remote.packet-timeout 60","settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"","settings set use-color false","settings set show-statusline false"],"disableASLR":false,"enableAutoVariableSummaries":false,"enableSyntheticChildDebugging":false,"displayExtendedBacktrace":false,"commandEscapePrefix":null},"seq":2}
1744972815.399891138 <-- (stdin/stdout) {"body":{"category":"console","output":"Running initCommands:\n"},"event":"output","seq":0,"type":"event"}
1744972815.399910688 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings clear -all\n"},"event":"output","seq":0,"type":"event"}
1744972815.399921656 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.enable-external-lookup false\n"},"event":"output","seq":0,"type":"event"}
1744972815.399930716 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.inherit-tcc true\n"},"event":"output","seq":0,"type":"event"}
1744972815.399938822 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.disable-aslr false\n"},"event":"output","seq":0,"type":"event"}
1744972815.399946690 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.detach-on-error false\n"},"event":"output","seq":0,"type":"event"}
1744972815.399954796 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.auto-apply-fixits false\n"},"event":"output","seq":0,"type":"event"}
1744972815.399962664 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set plugin.process.gdb-remote.packet-timeout 60\n"},"event":"output","seq":0,"type":"event"}
1744972815.399983168 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"\n"},"event":"output","seq":0,"type":"event"}
1744972815.399993420 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set use-color false\n"},"event":"output","seq":0,"type":"event"}
1744972815.400001287 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set show-statusline false\n"},"event":"output","seq":0,"type":"event"}
1744972815.476041317 <-- (stdin/stdout) {"command":"launch","request_seq":2,"seq":0,"success":true,"type":"response"}
1744972815.476104975 <-- (stdin/stdout) {"body":{"isLocalProcess":true,"name":"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/variables/TestDAP_variables.test_indexedVariables/a.out","startMethod":"launch","systemProcessId":1918303},"event":"process","seq":0,"type":"event"}
1744972815.476114988 <-- (stdin/stdout) {"event":"initialized","seq":0,"type":"event"}
1744972815.476419210 --> (stdin/stdout) {"command":"setBreakpoints","type":"request","arguments":{"source":{"name":"main.cpp","path":"main.cpp"},"sourceModified":false,"lines":[40],"breakpoints":[{"line":40}]},"seq":3}
1744972815.477877617 <-- (stdin/stdout) {"body":{"breakpoints":[{"column":1,"id":1,"instructionReference":"0xAAAAB5910C54","line":41,"source":{"name":"main.cpp","path":"main.cpp"},"verified":true}]},"command":"setBreakpoints","request_seq":3,"seq":0,"success":true,"type":"response"}

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-win running on sie-win-worker while building clang,llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/46/builds/15407

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe -cc1 -internal-isystem Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\21\include -nostdsysteminc -internal-isystem Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers/Inputs/include    -internal-isystem Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers/../../lib/Headers/cuda_wrappers    -internal-isystem Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers\gpuintrin_lang.c -o -  | z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers\gpuintrin_lang.c --check-prefix=CUDA
# executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe' -cc1 -internal-isystem 'Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\21\include' -nostdsysteminc -internal-isystem 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers/Inputs/include' -internal-isystem 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers/../../lib/Headers/cuda_wrappers' -internal-isystem 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers/../../lib/Headers/' -fcuda-is-device -triple nvptx64 -emit-llvm 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers\gpuintrin_lang.c' -o -
# executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe' 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers\gpuintrin_lang.c' --check-prefix=CUDA
# .---command stderr------------
# | Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers\gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
# | // CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
# |               ^
# | <stdin>:8:7: note: scanning from here
# | entry:
# |       ^
# | <stdin>:9:2: note: possible intended match here
# |  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
# |  ^
# |
# | Input file: <stdin>
# | Check file: Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers\gpuintrin_lang.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: ; ModuleID = 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Headers\gpuintrin_lang.c'
# |             2: source_filename = "Z:\\b\\llvm-clang-x86_64-sie-win\\llvm-project\\clang\\test\\Headers\\gpuintrin_lang.c"
# |             3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
# |             4: target triple = "nvptx64"
# |             5:
# |             6: ; Function Attrs: convergent noinline nounwind optnone
# |             7: define dso_local i32 @foo() #0 {
# | label:36'0     ^~~~~~~~~~~~~~~~~~~~~~~~~~
# | label:36'1     ^~~~~~~~~~~~~~~~~~~~~~~~~~
# | same:37'0                                ^~~~~~
# | same:37'1                                   ^    captured var "ATTR0"
# |             8: entry:
# | next:38'0      ^~~~~~
# | next:38'1      ^~~~~~  captured var "ENTRY"
# | next:39'0            X error: no match found
# |             9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
# | next:39'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:39'1       ?                                               possible intended match
# |            10:  ret i32 %0
# | next:39'0      ~~~~~~~~~~~~
# |            11: }
# | next:39'0      ~~
# |            12:
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-5 while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/18560

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/21/include -nostdsysteminc -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/Inputs/include    -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -  | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/21/include -nostdsysteminc -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/Inputs/include -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            1: ; ModuleID = '/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c'
            2: source_filename = "/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c"
            3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
            4: target triple = "nvptx64"
            5:
            6: ; Function Attrs: convergent noinline nounwind optnone
            7: define dso_local i32 @foo() #0 {
label:36'0     ^~~~~~~~~~~~~~~~~~~~~~~~~~
label:36'1     ^~~~~~~~~~~~~~~~~~~~~~~~~~
same:37'0                                ^~~~~~
same:37'1                                   ^    captured var "ATTR0"
            8: entry:
next:38'0      ^~~~~~
next:38'1      ^~~~~~  captured var "ENTRY"
next:39'0            X error: no match found
            9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
next:39'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1       ?                                               possible intended match
           10:  ret i32 %0
next:39'0      ~~~~~~~~~~~~
           11: }
next:39'0      ~~
           12:
next:39'0      ~
           13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder clang-armv8-quick running on linaro-clang-armv8-quick while building clang,llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/154/builds/15010

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/tcwg-buildbot/worker/clang-armv8-quick/stage1/bin/clang -cc1 -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/Inputs/include    -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c -o -  | /home/tcwg-buildbot/worker/clang-armv8-quick/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /home/tcwg-buildbot/worker/clang-armv8-quick/stage1/bin/clang -cc1 -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/stage1/lib/clang/21/include -nostdsysteminc -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/Inputs/include -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c -o -
+ /home/tcwg-buildbot/worker/clang-armv8-quick/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/home/tcwg-buildbot/worker/clang-armv8-quick/llvm/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@llvm-ci
Collaborator

llvm-ci commented Apr 18, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-darwin running on doug-worker-3 while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/23/builds/9504

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: Headers/gpuintrin_lang.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/Volumes/RAMDisk/buildbot-root/x86_64-darwin/build/bin/clang -cc1 -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/build/lib/clang/21/include -nostdsysteminc -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/Inputs/include    -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers    -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/    -fcuda-is-device -triple nvptx64 -emit-llvm /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -  | /Volumes/RAMDisk/buildbot-root/x86_64-darwin/build/bin/FileCheck /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA # RUN: at line 2
+ /Volumes/RAMDisk/buildbot-root/x86_64-darwin/build/bin/clang -cc1 -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/build/lib/clang/21/include -nostdsysteminc -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/Inputs/include -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/cuda_wrappers -internal-isystem /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/../../lib/Headers/ -fcuda-is-device -triple nvptx64 -emit-llvm /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c -o -
+ /Volumes/RAMDisk/buildbot-root/x86_64-darwin/build/bin/FileCheck /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c --check-prefix=CUDA
/Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c:39:15: error: CUDA-NEXT: expected string not found in input
// CUDA-NEXT: [[TMP0:%.*]] = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:2: note: possible intended match here
 %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
 ^

Input file: <stdin>
Check file: /Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '/Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c' 
           2: source_filename = "/Volumes/RAMDisk/buildbot-root/x86_64-darwin/llvm-project/clang/test/Headers/gpuintrin_lang.c" 
           3: target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64" 
           4: target triple = "nvptx64" 
           5:  
           6: ; Function Attrs: convergent noinline nounwind optnone 
           7: define dso_local i32 @foo() #0 { 
           8: entry: 
next:39'0           X error: no match found
           9:  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:39'1      ?                                               possible intended match
          10:  ret i32 %0 
next:39'0     ~~~~~~~~~~~~
          11: } 
next:39'0     ~~
          12:  
next:39'0     ~
          13: ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          14: declare noundef i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1 
next:39'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
...

@jurahul
Contributor

jurahul commented Apr 18, 2025

I attempted to fix the failure in clang/test/Headers/gpuintrin_lang.c by removing the range() in the CHECK, but now some other build fails and complains that the range() is missing:

https://lab.llvm.org/buildbot/#/builders/174/builds/16421/steps/6/logs/FAIL__Clang__gpuintrin_lang_c

@arsenm can you PTAL?

@arsenm
Contributor Author

arsenm commented Apr 18, 2025

> I attempted to fix the failure in clang/test/Headers/gpuintrin_lang.c by removing the range() in the CHECK, but now some other build fails and complains that the range() is missing:
>
> https://lab.llvm.org/buildbot/#/builders/174/builds/16421/steps/6/logs/FAIL__Clang__gpuintrin_lang_c
>
> @arsenm can you PTAL?

Something is wrong with specific hosts, but I don't know why. The problem is that the nvptx builtins don't actually annotate the callsites with the range; the NVVMIntrRange pass adds it. For some reason these hosts don't run the pass. Is this just what happens if you run the test without the NVPTX backend built?

@jurahul
Contributor

jurahul commented Apr 18, 2025

I think your guess is right: this one, for example, complains when range() is present in the checks:

https://lab.llvm.org/buildbot/#/builders/65/builds/15490

and it only has the AArch64 backend enabled.

@jurahul
Contributor

jurahul commented Apr 18, 2025

So maybe we can make the check more permissive for now?

@arsenm
Contributor Author

arsenm commented Apr 18, 2025

> So maybe we can make the check more permissive for now?

Yes. Option 1 is to use a regex on the return attribute. Option 2 is to add REQUIRES: nvptx-registered-target. Option 3 is some other way of making sure NVVMIntrRange doesn't run; something is broken now, because it is running at -O0 when it should not be. Option 4 is to remove NVVMIntrRange and move the conservative maximum ranges to the intrinsic declarations, like this patch did.
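To illustrate option 1, here is a minimal Python sketch, not the actual fix, of a pattern that accepts the call line both with and without the range return attribute. The two input strings are copied from the buildbot logs above; the names `PERMISSIVE` and `matches` are hypothetical, and the FileCheck line in the comment is only an assumed analogue of what a permissive CHECK might look like.

```python
import re

# The two forms of the call line seen in the buildbot logs above.
WITH_RANGE = "%0 = call range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()"
WITHOUT_RANGE = "%0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()"

# Hypothetical permissive pattern: the range(...) return attribute is optional.
# An assumed FileCheck analogue would be something like:
#   // CUDA-NEXT: [[TMP0:%.*]] = call {{.*}}i32 @llvm.nvvm.read.ptx.sreg.tid.x()
PERMISSIVE = re.compile(
    r"^\s*%\S+ = call (?:range\(i32 \d+, \d+\) )?"
    r"i32 @llvm\.nvvm\.read\.ptx\.sreg\.tid\.x\(\)\s*$"
)


def matches(line: str) -> bool:
    """Return True if the IR line is the tid.x call, with or without range."""
    return PERMISSIVE.match(line) is not None


if __name__ == "__main__":
    for line in (WITH_RANGE, WITHOUT_RANGE):
        print(matches(line), line)
```

A check written this way passes whether or not NVVMIntrRange ran on the host, at the cost of no longer verifying that the range is present.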

@jurahul
Contributor

jurahul commented Apr 18, 2025

#136301 ?

Yeah, ultimately, we want the range on the intrinsic declaration.

IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This avoids the need to have special handling at every use site.
Unfortunately this means we unnecessarily emit AssertZext in the DAG
(where we already directly understand the range of the intrinsic), and
we regress in undefined cases as we don't fold out asserts on undef.
Labels
backend:AMDGPU clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category llvm:ir llvm:transforms