Skip to content

[OpenMP] Remove 'omp assumes' scopes now that we have no inline ASM #123611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 20, 2025

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Jan 20, 2025

Summary:
We used this globally scoped ext_no_call_asm as a sort of hack around
the compiler that allowed the attributor to optimize out inline assembly
calls to PTX instructions. Quite some time ago I got rid of every inline
assembly call and replaced it with a builitin, so this can just be
deleted.

Furthermore, I use the [[omp::assume]] attribute directly for the
aligned barrier usage. This prints an unknown assumption warning (even
though it isn't) so I'm just silencing that for now until I fix it
later.

Summary:
We used this globally scoped `ext_no_call_asm` as a sort of hack around
the compiler that allowed the attributor to optimize out inline assembly
calls to PTX instructions. Quite some time ago I got rid of every inline
assembly call and replaced it with a builitin, so this can just be
deleted.

Furthermore, I use the `[[omp::assume]]` attribute directly for the
aligned barrier usage. This prints an unknown assumption warning (even
though it isn't) so I'm just silencing that for now until I fix it
later.
@llvmbot
Copy link
Member

llvmbot commented Jan 20, 2025

@llvm/pr-subscribers-offload

Author: Joseph Huber (jhuber6)

Changes

Summary:
We used this globally scoped ext_no_call_asm as a sort of hack around
the compiler that allowed the attributor to optimize out inline assembly
calls to PTX instructions. Quite some time ago I got rid of every inline
assembly call and replaced it with a builitin, so this can just be
deleted.

Furthermore, I use the [[omp::assume]] attribute directly for the
aligned barrier usage. This prints an unknown assumption warning (even
though it isn't) so I'm just silencing that for now until I fix it
later.


Full diff: https://github.com/llvm/llvm-project/pull/123611.diff

3 Files Affected:

  • (modified) offload/DeviceRTL/CMakeLists.txt (+1)
  • (modified) offload/DeviceRTL/include/DeviceTypes.h (-9)
  • (modified) offload/DeviceRTL/include/Synchronization.h (+2-3)
diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt
index 099634e211e7a7..3f647304b06f85 100644
--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -100,6 +100,7 @@ set(bc_flags -c -foffload-lto -std=c++17 -fvisibility=hidden
              -nocudalib -nogpulib -nogpuinc -nostdlibinc
              -fopenmp -fopenmp-cuda-mode
              -Wno-unknown-cuda-version -Wno-openmp-target
+             -Wno-unknown-assumption
              -DOMPTARGET_DEVICE_RUNTIME
              -I${include_directory}
              -I${devicertl_base_directory}/../include
diff --git a/offload/DeviceRTL/include/DeviceTypes.h b/offload/DeviceRTL/include/DeviceTypes.h
index 259bc008f91d13..1cd044f432e569 100644
--- a/offload/DeviceRTL/include/DeviceTypes.h
+++ b/offload/DeviceRTL/include/DeviceTypes.h
@@ -15,15 +15,6 @@
 #include <stddef.h>
 #include <stdint.h>
 
-// Tell the compiler that we do not have any "call-like" inline assembly in the
-// device rutime. That means we cannot have inline assembly which will call
-// another function but only inline assembly that performs some operation or
-// side-effect and then continues execution with something on the existing call
-// stack.
-//
-// TODO: Find a good place for this
-#pragma omp assumes ext_no_call_asm
-
 enum omp_proc_bind_t {
   omp_proc_bind_false = 0,
   omp_proc_bind_true = 1,
diff --git a/offload/DeviceRTL/include/Synchronization.h b/offload/DeviceRTL/include/Synchronization.h
index a4d4fc08837b29..96a2f8654e92ab 100644
--- a/offload/DeviceRTL/include/Synchronization.h
+++ b/offload/DeviceRTL/include/Synchronization.h
@@ -192,15 +192,14 @@ void threads(atomic::OrderingTy Ordering);
 /// noinline is removed by the openmp-opt pass and helps to preserve the
 /// information till then.
 ///{
-#pragma omp begin assumes ext_aligned_barrier
 
 /// Synchronize all threads in a block, they are reaching the same instruction
 /// (hence all threads in the block are "aligned"). Also perform a fence before
 /// and after the barrier according to \p Ordering. Note that the
 /// fence might be part of the barrier if the target offers this.
-[[gnu::noinline]] void threadsAligned(atomic::OrderingTy Ordering);
+[[gnu::noinline, omp::assume("ext_aligned_barrier")]] void
+threadsAligned(atomic::OrderingTy Ordering);
 
-#pragma omp end assumes
 ///}
 
 } // namespace synchronize

Copy link
Member

@Meinersbur Meinersbur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what's going wrong with the unknown-assumption warning?

@jhuber6
Copy link
Contributor Author

jhuber6 commented Jan 20, 2025

Do you know what's going wrong with the unknown-assumption warning?

Yeah, it's just missing.

@jhuber6 jhuber6 merged commit 58af82b into llvm:main Jan 20, 2025
6 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 20, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building offload at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/14152

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: jit/empty_kernel_lvl1.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel_lvl1.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fopenmp-target-jit      -DTGT1_DIRECTIVE="target"
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel_lvl1.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fopenmp-target-jit -DTGT1_DIRECTIVE=target
# note: command had no output on stdout or stderr
# RUN: at line 4
env LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE=/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll          LIBOMPTARGET_JIT_SKIP_OPT=true                        /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp
# executed command: env LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE=/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll LIBOMPTARGET_JIT_SKIP_OPT=true /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp
# note: command had no output on stdout or stderr
# RUN: at line 7
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck --input-file /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc --check-prefix=FIRST
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck --input-file /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc --check-prefix=FIRST
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc:37:16: error: FIRST-NEXT: is not on the line after the previous match
# | // FIRST-NEXT: ret void
# |                ^
# | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll:29:2: note: 'next' match was here
# |  ret void
# |  ^
# | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll:26:80: note: previous match ended here
# | define weak_odr protected amdgpu_kernel void @__omp_offloading_802_b388264_main_l2(ptr noalias noundef %0) local_unnamed_addr #1 {
# |                                                                                ^
# | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll:27:1: note: non-matching line after previous match is here
# |  tail call void @llvm.amdgcn.s.barrier() #2
# | ^
# | 
# | Input file: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll
# | Check file: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |          .
# |          .
# |          .
# |         24:  
# |         25: ; Function Attrs: alwaysinline mustprogress norecurse nounwind 
# |         26: define weak_odr protected amdgpu_kernel void @__omp_offloading_802_b388264_main_l2(ptr noalias noundef %0) local_unnamed_addr #1 { 
# |         27:  tail call void @llvm.amdgcn.s.barrier() #2 
# |         28:  tail call void @llvm.amdgcn.s.barrier() #2 
# |         29:  ret void 
# | next:37      !~~~~~~~  error: match on wrong line
# |         30: } 
# |         31:  
# |         32: attributes #0 = { convergent nocallback nofree nounwind willreturn } 
...

@jhuber6 jhuber6 deleted the RemoveASMExt branch January 20, 2025 14:24
@jhuber6
Copy link
Contributor Author

jhuber6 commented Jan 20, 2025

Slight IR change now that we no longer have the attribute, will fix.

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 20, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building offload at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/12027

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: jit/empty_kernel_lvl1.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel_lvl1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fopenmp-target-jit      -DTGT1_DIRECTIVE="target"
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel_lvl1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fopenmp-target-jit -DTGT1_DIRECTIVE=target
# RUN: at line 4
env LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE=/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll          LIBOMPTARGET_JIT_SKIP_OPT=true                        /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp
# executed command: env LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE=/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll LIBOMPTARGET_JIT_SKIP_OPT=true /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp
# RUN: at line 7
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck --input-file /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc --check-prefix=FIRST
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck --input-file /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc --check-prefix=FIRST
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc:37:16: error: FIRST-NEXT: is not on the line after the previous match
# | // FIRST-NEXT: ret void
# |                ^
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll:29:2: note: 'next' match was here
# |  ret void
# |  ^
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll:26:80: note: previous match ended here
# | define weak_odr protected amdgpu_kernel void @__omp_offloading_802_d8282ad_main_l2(ptr noalias noundef %0) local_unnamed_addr #1 {
# |                                                                                ^
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll:27:1: note: non-matching line after previous match is here
# |  tail call void @llvm.amdgcn.s.barrier() #2
# | ^
# | 
# | Input file: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/jit/Output/empty_kernel_lvl1.c.tmp.pre.ll
# | Check file: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/jit/empty_kernel.inc
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |          .
# |          .
# |          .
# |         24:  
# |         25: ; Function Attrs: alwaysinline mustprogress norecurse nounwind 
# |         26: define weak_odr protected amdgpu_kernel void @__omp_offloading_802_d8282ad_main_l2(ptr noalias noundef %0) local_unnamed_addr #1 { 
# |         27:  tail call void @llvm.amdgcn.s.barrier() #2 
# |         28:  tail call void @llvm.amdgcn.s.barrier() #2 
# |         29:  ret void 
# | next:37      !~~~~~~~  error: match on wrong line
# |         30: } 
# |         31:  
# |         32: attributes #0 = { convergent nocallback nofree nounwind willreturn } 
# |         33: attributes #1 = { alwaysinline mustprogress norecurse nounwind "amdgpu-flat-work-group-size"="1,256" "amdgpu-no-agpr" "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-flat-scratch-init" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "kernel" "no-trapping-math"="true" "omp_target_thread_limit"="256" "stack-protector-buffer-size"="8" "target-cpu"="gfx906" "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot10-insts,+dot2-insts,+dot7-insts,+dpp,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst,+wavefrontsize64" "uniform-work-group-size"="true" } 
# |         34: attributes #2 = { "llvm.assume"="ext_aligned_barrier" } 
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants