Commit 58af82b

jhuber6 and Meinersbur authored
[OpenMP] Remove 'omp assumes' scopes now that we have no inline ASM (#123611)
Summary: We used the globally scoped `ext_no_call_asm` assumption as a sort of hack around the compiler that allowed the attributor to optimize out inline assembly calls to PTX instructions. Quite some time ago I got rid of every inline assembly call and replaced it with a builtin, so the assumption can just be deleted. Furthermore, I now use the `[[omp::assume]]` attribute directly for the aligned barrier usage. This prints an unknown-assumption warning (even though the assumption is not actually unknown), so I'm silencing that warning for now until I fix it later.

Co-authored-by: Michael Kruse <[email protected]>
1 parent b5c9cba commit 58af82b

3 files changed, +3 -12 lines changed

offload/DeviceRTL/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -100,6 +100,7 @@ set(bc_flags -c -foffload-lto -std=c++17 -fvisibility=hidden
   -nocudalib -nogpulib -nogpuinc -nostdlibinc
   -fopenmp -fopenmp-cuda-mode
   -Wno-unknown-cuda-version -Wno-openmp-target
+  -Wno-unknown-assumption # TODO: Fix false-positive warning for ext_aligned_barrier
   -DOMPTARGET_DEVICE_RUNTIME
   -I${include_directory}
   -I${devicertl_base_directory}/../include

offload/DeviceRTL/include/DeviceTypes.h

Lines changed: 0 additions & 9 deletions
@@ -15,15 +15,6 @@
 #include <stddef.h>
 #include <stdint.h>

-// Tell the compiler that we do not have any "call-like" inline assembly in the
-// device rutime. That means we cannot have inline assembly which will call
-// another function but only inline assembly that performs some operation or
-// side-effect and then continues execution with something on the existing call
-// stack.
-//
-// TODO: Find a good place for this
-#pragma omp assumes ext_no_call_asm
-
 enum omp_proc_bind_t {
   omp_proc_bind_false = 0,
   omp_proc_bind_true = 1,

offload/DeviceRTL/include/Synchronization.h

Lines changed: 2 additions & 3 deletions
@@ -192,15 +192,14 @@ void threads(atomic::OrderingTy Ordering);
 /// noinline is removed by the openmp-opt pass and helps to preserve the
 /// information till then.
 ///{
-#pragma omp begin assumes ext_aligned_barrier

 /// Synchronize all threads in a block, they are reaching the same instruction
 /// (hence all threads in the block are "aligned"). Also perform a fence before
 /// and after the barrier according to \p Ordering. Note that the
 /// fence might be part of the barrier if the target offers this.
-[[gnu::noinline]] void threadsAligned(atomic::OrderingTy Ordering);
+[[gnu::noinline, omp::assume("ext_aligned_barrier")]] void
+threadsAligned(atomic::OrderingTy Ordering);

-#pragma omp end assumes
 ///}

 } // namespace synchronize
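
For readers unfamiliar with the two spellings, the following is a minimal host-side sketch of the change in Synchronization.h, not the DeviceRTL code itself. `alignedBarrierStub`, its `int` signature, and `selfCheck` are hypothetical stand-ins invented for illustration. The sketch assumes a compiler that either recognizes `[[omp::assume]]` (recent Clang) or ignores unknown scoped attributes with a warning.

```cpp
#include <cassert>

// Old form (removed by this commit): a pragma scope attached the assumption
// to every declaration between the begin/end markers.
//
//   #pragma omp begin assumes ext_aligned_barrier
//   [[gnu::noinline]] void threadsAligned(atomic::OrderingTy Ordering);
//   #pragma omp end assumes

// New form: the assumption is attached to one declaration only.
// Hypothetical stand-in for threadsAligned; the real function returns void
// and takes an atomic::OrderingTy.
[[gnu::noinline, omp::assume("ext_aligned_barrier")]] int
alignedBarrierStub(int Ordering);

// The definition does not need to repeat the attributes.
int alignedBarrierStub(int Ordering) { return Ordering + 1; }

// Hypothetical helper showing that the annotated function is called normally.
inline void selfCheck() { assert(alignedBarrierStub(41) == 42); }
```

Clang may report the assumption string as unknown when compiling this sketch; that is the false-positive warning the commit silences with `-Wno-unknown-assumption`.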
