Commit 17b5445

[Libomptarget] Add a wavefront sync builtin for the AMDGPU implementation (#70228)
Summary: While this is technically a no-op for AMDGPU hardware, in cases where the user would see fit to add an explicit wavefront sync on Nvidia hardware, we should also inform the LLVM optimizer that this control flow is convergent so we do not reorder blocks.
1 parent e3d2a7d commit 17b5445

1 file changed: 3 additions, 1 deletion
openmp/libomptarget/DeviceRTL/src/Synchronization.cpp

Lines changed: 3 additions & 1 deletion
@@ -272,7 +272,9 @@ void fenceSystem(atomic::OrderingTy Ordering) {
 }
 
 void syncWarp(__kmpc_impl_lanemask_t) {
-  // AMDGCN doesn't need to sync threads in a warp
+  // This is a no-op on current AMDGPU hardware but it is used by the optimizer
+  // to enforce convergent behaviour between control flow graphs.
+  __builtin_amdgcn_wave_barrier();
 }
 
 void syncThreads(atomic::OrderingTy Ordering) {
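For illustration only, here is a minimal sketch of the pattern this change introduces, written so that it also compiles on a host. It assumes the compiler defines `__AMDGCN__` when targeting AMDGPU. `LaneMaskTy`, `syncWarpSketch`, and `reconverge` are hypothetical names that do not appear in the DeviceRTL, and the real runtime may select its per-target implementation differently.

```cpp
#include <cstdint>

// Hypothetical stand-in for the runtime's lane mask type (assumption).
using LaneMaskTy = std::uint64_t;

// Sketch of a wavefront sync: on AMDGPU the builtin generates no hardware
// instruction, but it marks the point as convergent for the optimizer.
inline void syncWarpSketch(LaneMaskTy /*Mask*/) {
#if defined(__AMDGCN__)
  __builtin_amdgcn_wave_barrier();
#else
  // Host fallback for illustration: nothing to synchronize.
#endif
}

// Hypothetical caller: lanes may diverge on the branch, then meet again at
// the sync point before continuing.
inline int reconverge(bool TakeBranch) {
  int Value = 0;
  if (TakeBranch)
    Value = 1;
  syncWarpSketch(~LaneMaskTy{0}); // all lanes participate
  return Value + 1;
}
```

On a host build the sync compiles to nothing, so `reconverge` only shows where the barrier would sit relative to the divergent branch.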
