[llvm][OpenMP] Handle complex types in atomic read #111377

NimishMishra · 2024-10-07T13:26:19Z

This patch adds functionality for atomically reading llvm.struct types.

Fixes: #93441

llvmbot · 2024-10-07T13:26:55Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-flang-openmp

Author: None (NimishMishra)

Changes

This patch adds functionality for atomically reading llvm.struct types.

Fixes: #93441

Full diff: https://github.com/llvm/llvm-project/pull/111377.diff

2 Files Affected:

(modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+13-1)
(modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+22)

diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 47cc6ff7655caf..b2c1ea6b805fd6 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -7822,7 +7822,7 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc,
          "OMP Atomic expects a pointer to target memory");
   Type *XElemTy = X.ElemTy;
   assert((XElemTy->isFloatingPointTy() || XElemTy->isIntegerTy() ||
-          XElemTy->isPointerTy()) &&
+          XElemTy->isPointerTy() || XElemTy->isStructTy()) &&
          "OMP atomic read expected a scalar type");
 
   Value *XRead = nullptr;
@@ -7832,6 +7832,18 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc,
         Builder.CreateLoad(XElemTy, X.Var, X.IsVolatile, "omp.atomic.read");
     XLD->setAtomic(AO);
     XRead = cast<Value>(XLD);
+  } else if (XElemTy->isStructTy()) {
+    LoadInst *OldVal = Builder.CreateLoad(XElemTy, X.Var, "omp.atomic.read");
+    OldVal->setAtomic(AO);
+    const DataLayout &LoadDL = OldVal->getModule()->getDataLayout();
+    unsigned LoadSize =
+        LoadDL.getTypeStoreSize(OldVal->getPointerOperand()->getType());
+    OpenMPIRBuilder::AtomicInfo atomicInfo(
+        &Builder, XElemTy, LoadSize * 8, LoadSize * 8, OldVal->getAlign(),
+        OldVal->getAlign(), true /* UseLibcall */, X.Var);
+    auto AtomicLoadRes = atomicInfo.EmitAtomicLoadLibcall(AO);
+    XRead = AtomicLoadRes.first;
+    OldVal->eraseFromParent();
   } else {
     // We need to perform atomic op as integer
     IntegerType *IntCastTy =
diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
index 5d76e87472dfe4..47a7e0a7634cdc 100644
--- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
@@ -1564,6 +1564,28 @@ llvm.func @_QPomp_atomic_capture_complex() {
 
 // -----
 
+// CHECK-LABEL: define void @omp_atomic_read_complex() {
+llvm.func @omp_atomic_read_complex(){
+
+// CHECK: %[[a:.*]] = alloca { float, float }, i64 1, align 8
+// CHECK: %[[b:.*]] = alloca { float, float }, i64 1, align 8
+// CHECK: %[[ATOMIC_TEMP_LOAD:.*]] = alloca { float, float }, align 8
+// CHECK: call void @__atomic_load(i64 8, ptr %[[b]], ptr %[[ATOMIC_TEMP_LOAD]], i32 0)
+// CHECK: %[[LOADED_VAL:.*]] = load { float, float }, ptr %[[ATOMIC_TEMP_LOAD]], align 8
+// CHECK: store { float, float } %[[LOADED_VAL]], ptr %[[a]], align 4
+// CHECK: ret void
+// CHECK: }
+
+    %0 = llvm.mlir.constant(1 : i64) : i64
+    %1 = llvm.alloca %0 x !llvm.struct<(f32, f32)> {bindc_name = "ib"} : (i64) -> !llvm.ptr
+    %2 = llvm.mlir.constant(1 : i64) : i64
+    %3 = llvm.alloca %2 x !llvm.struct<(f32, f32)> {bindc_name = "ia"} : (i64) -> !llvm.ptr
+    omp.atomic.read %1 = %3 : !llvm.ptr, !llvm.struct<(f32, f32)>
+    llvm.return
+}
+
+// -----
+
 // Checking an order-dependent operation when the order is `expr binop x`
 // CHECK-LABEL: @omp_atomic_update_ordering
 // CHECK-SAME: (ptr %[[x:.*]], i32 %[[expr:.*]])

tblah · 2024-10-08T12:44:52Z

Thanks for your work on this.

One of our tests segfaults at runtime. Here is a minimal reproducer (that should loop forever)

program complex_read
  complex :: x = 0
  complex v
  !$omp parallel
  do
    !$omp atomic read
      v = x
    if (v /= x) exit
  end do
  !$omp end parallel
end program

I am testing on aarch64 (aws graviton 3) and using the atomic implementation from compiler-rt.

Meinersbur

@tblah Is the segfault with this patch or without it?

Meinersbur · 2024-10-08T14:15:55Z

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

+    OpenMPIRBuilder::AtomicInfo atomicInfo(
+        &Builder, XElemTy, LoadSize * 8, LoadSize * 8, OldVal->getAlign(),
+        OldVal->getAlign(), true /* UseLibcall */, X.Var);
+    auto AtomicLoadRes = atomicInfo.EmitAtomicLoadLibcall(AO);


Libcalls should only be used if the device does not support atomicrmw of that size. E.g. when the struct is only 4 bytes long, emit atomicrmw instead of call __atomic_load.

I think this is part of the larger issue of handling atomics properly. For the immediate problem, this should work, but I suggest to add a TODO/FIXME comment here.

@Meinersbur I have added a FIXME for this point. Thanks. If everything else looks okay, can we merge this?

tblah · 2024-10-08T16:09:20Z

@tblah Is the segfault with this patch or without it?

The segfault is with this patch. Without this patch flang-new hits an assertion failure.

NimishMishra · 2024-10-09T05:25:54Z

Thanks @tblah for testing this PR. Without this patch, I believe the assertion would be related to emitting an atomic read on non-integer types.

I'll investigate on this and get back.

NimishMishra · 2024-10-18T09:00:16Z

Hi @tblah,

Thanks for the test. I could reproduce the issue with my patch.

GDB shows the segfault is in atomic_load. I have a couple of questions regarding the test that can help in triaging this better:

I was not able to reproduce the segfault on the following test, which is strange, because I did not expect the segfault to depend on do

program complex_read
  complex :: x = 0
  complex v
  !$omp parallel
    !$omp atomic read
      v = x
  !$omp end parallel
end program

Moreover, the following test (where I remove the if), I observe a crash.

program complex_read
  complex :: x = 0
  complex v
  !$omp parallel
  do
    !$omp atomic read
      v = x
  end do
  !$omp end parallel
end program

Taking a step further, if I remove the atomic read (i.e. the following test), the crash still happens.

program complex_read
  complex :: x = 0
  complex v
  !$omp parallel
  do
      v = x
  end do
  !$omp end parallel
end program

tblah · 2024-10-18T11:07:44Z

Thanks for taking a look.

GDB shows the segfault is in atomic_load. I have a couple of questions regarding the test that can help in triaging this better:

I have tried this using both gnu libatomic and compiler-rt so I expect the bug is in what we are passing to atomic_load rather than atomic_load itself.

I get the same results on your three examples:

Compiles and runs correctly
compiler crash in llvm::CodeExtractor::findAllocas
compiler crash in llvm::CodeExtractor::findAllocas

It isn't clear to me if the runtime crash in my test is the same bug or not. It might be. I have seen those compiler crashes before while making changes to mlir/lib/Target/LLVMIR/OpenMPToLLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp - if we generate malformed IR in the callbacks passed into OpenMPIRBuilder the code extractor can crash while trying to outline the openmp region.

I don't have a machine available to confirm, but I would guess (2) and (3) are reproducible on x86?

tblah · 2024-10-18T11:30:50Z

The LLVM IR for your final example right before the code extractor runs is as follows:

define void @_QQmain() #0 {
  %tid.addr = alloca i32, align 4
  %zero.addr = alloca i32, align 4
  %1 = alloca { float, float }, i64 1, align 8
  br label %entry

entry:                                            ; preds = %0
  %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @1)
  br label %omp.par.entry

omp.par.entry:                                    ; preds = %entry
  %tid.addr.local = alloca i32, align 4
  %tid = load i32, ptr %tid.addr.local, align 4
  %tid.addr.use = load i32, ptr %tid.addr, align 4
  %zero.addr.use = load i32, ptr %zero.addr, align 4
  br label %omp.reduction.init

omp.reduction.init:                               ; preds = %omp.par.entry
  br label %omp.par.region

omp.par.region:                                   ; preds = %omp.reduction.init
  br label %omp.par.region1

omp.par.region2:                                  ; preds = %omp.par.region2, %omp.par.region1
  %2 = load { float, float }, ptr @_QFEx, align 4
  store { float, float } %2, ptr %1, align 4
  br label %omp.par.region2

omp.par.region1:                                  ; preds = %omp.par.region
  br label %omp.par.region2

omp.region.cont:                                  ; No predecessors!
  br label %omp.par.pre_finalize

omp.par.pre_finalize:                             ; preds = %omp.region.cont
  br label %omp.par.outlined.exit

omp.par.outlined.exit:                            ; preds = %omp.par.pre_finalize
  br label %omp.par.exit.split

omp.par.exit.split:                               ; preds = %omp.par.outlined.exit
  unreachable
}

See the controlflow graph with omp.par.region2 unconditionally branching to itself for the infinite do loop.

My guess is that this is a different bug that the code extractor can't handle infinite loops in the control flow graph - it is looking for live out values from that CFG region but there's no way out of the region.

A much simpler example also crashes the compiler:

program infinite_loop
  integer :: i
  !$omp parallel
  do
    i = 0
  end do
  !$omp end parallel
end program

Surprisingly, equivalent C code does not crash the compiler.

In conclusion I think the compiler crash and the executable crash are different bugs. I will create a ticket for the compiler crash

NimishMishra · 2024-10-19T13:48:13Z

Hi @tblah,

I have tested out this PR with working on #112908, and I do not see segfault (once we remove do). Is atomic read still segfaulting on aarch64?

tblah

Thanks for all the updates.

I still have a few failing tests on my end that I need to minimize and figure out if they are really bugs with the atomic implementation. But I think in the meantime it is better to merge this because it does work for most cases.

NimishMishra requested review from ergawy, Meinersbur, tblah, kiranchandramohan, harishch4, luporl and kiranktp October 7, 2024 13:26

llvmbot added mlir:llvm mlir flang:openmp clang:openmp OpenMP related changes to Clang labels Oct 7, 2024

Meinersbur reviewed Oct 8, 2024

View reviewed changes

NimishMishra mentioned this pull request Oct 19, 2024

[flang][OpenMP] support arrays of atomic numbers #112908

Closed

tblah approved these changes Oct 21, 2024

View reviewed changes

[llvm][OpenMP] Handle complex types in atomic read

82df319

NimishMishra force-pushed the atomic-read-complex branch from 1caff48 to 82df319 Compare October 21, 2024 14:09

Add FIXME

db1731b

NimishMishra merged commit 645e6f1 into llvm:main Oct 23, 2024
8 checks passed

This was referenced Oct 23, 2024

[flang][OpenMP] atomic write on complex #113479

Closed

[Flang][OpenMP] Compilation error when the same type (complex) is used in capture-statement in atomic read construct #94926

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[llvm][OpenMP] Handle complex types in atomic read #111377

[llvm][OpenMP] Handle complex types in atomic read #111377

Uh oh!

NimishMishra commented Oct 7, 2024

Uh oh!

llvmbot commented Oct 7, 2024 •

edited

Loading

Uh oh!

tblah commented Oct 8, 2024

Uh oh!

Meinersbur left a comment

Uh oh!

Meinersbur Oct 8, 2024

Uh oh!

NimishMishra Oct 21, 2024

Uh oh!

tblah commented Oct 8, 2024

Uh oh!

NimishMishra commented Oct 9, 2024

Uh oh!

NimishMishra commented Oct 18, 2024 •

edited

Loading

Uh oh!

tblah commented Oct 18, 2024

Uh oh!

tblah commented Oct 18, 2024 •

edited

Loading

Uh oh!

NimishMishra commented Oct 19, 2024

Uh oh!

tblah left a comment

Uh oh!

Uh oh!

Uh oh!

[llvm][OpenMP] Handle complex types in atomic read #111377

[llvm][OpenMP] Handle complex types in atomic read #111377

Uh oh!

Conversation

NimishMishra commented Oct 7, 2024

Uh oh!

llvmbot commented Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tblah commented Oct 8, 2024

Uh oh!

Meinersbur left a comment

Choose a reason for hiding this comment

Uh oh!

Meinersbur Oct 8, 2024

Choose a reason for hiding this comment

Uh oh!

NimishMishra Oct 21, 2024

Choose a reason for hiding this comment

Uh oh!

tblah commented Oct 8, 2024

Uh oh!

NimishMishra commented Oct 9, 2024

Uh oh!

NimishMishra commented Oct 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tblah commented Oct 18, 2024

Uh oh!

tblah commented Oct 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NimishMishra commented Oct 19, 2024

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Oct 7, 2024 •

edited

Loading

NimishMishra commented Oct 18, 2024 •

edited

Loading

tblah commented Oct 18, 2024 •

edited

Loading