-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[llvm][OpenMP] Handle complex types in atomic read #111377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-flang-openmp Author: None (NimishMishra) ChangesThis patch adds functionality for atomically reading Fixes: #93441 Full diff: https://github.com/llvm/llvm-project/pull/111377.diff 2 Files Affected:
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 47cc6ff7655caf..b2c1ea6b805fd6 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -7822,7 +7822,7 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc,
"OMP Atomic expects a pointer to target memory");
Type *XElemTy = X.ElemTy;
assert((XElemTy->isFloatingPointTy() || XElemTy->isIntegerTy() ||
- XElemTy->isPointerTy()) &&
+ XElemTy->isPointerTy() || XElemTy->isStructTy()) &&
"OMP atomic read expected a scalar type");
Value *XRead = nullptr;
@@ -7832,6 +7832,18 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc,
Builder.CreateLoad(XElemTy, X.Var, X.IsVolatile, "omp.atomic.read");
XLD->setAtomic(AO);
XRead = cast<Value>(XLD);
+ } else if (XElemTy->isStructTy()) {
+ LoadInst *OldVal = Builder.CreateLoad(XElemTy, X.Var, "omp.atomic.read");
+ OldVal->setAtomic(AO);
+ const DataLayout &LoadDL = OldVal->getModule()->getDataLayout();
+ unsigned LoadSize =
+ LoadDL.getTypeStoreSize(OldVal->getPointerOperand()->getType());
+ OpenMPIRBuilder::AtomicInfo atomicInfo(
+ &Builder, XElemTy, LoadSize * 8, LoadSize * 8, OldVal->getAlign(),
+ OldVal->getAlign(), true /* UseLibcall */, X.Var);
+ auto AtomicLoadRes = atomicInfo.EmitAtomicLoadLibcall(AO);
+ XRead = AtomicLoadRes.first;
+ OldVal->eraseFromParent();
} else {
// We need to perform atomic op as integer
IntegerType *IntCastTy =
diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
index 5d76e87472dfe4..47a7e0a7634cdc 100644
--- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
@@ -1564,6 +1564,28 @@ llvm.func @_QPomp_atomic_capture_complex() {
// -----
+// CHECK-LABEL: define void @omp_atomic_read_complex() {
+llvm.func @omp_atomic_read_complex(){
+
+// CHECK: %[[a:.*]] = alloca { float, float }, i64 1, align 8
+// CHECK: %[[b:.*]] = alloca { float, float }, i64 1, align 8
+// CHECK: %[[ATOMIC_TEMP_LOAD:.*]] = alloca { float, float }, align 8
+// CHECK: call void @__atomic_load(i64 8, ptr %[[b]], ptr %[[ATOMIC_TEMP_LOAD]], i32 0)
+// CHECK: %[[LOADED_VAL:.*]] = load { float, float }, ptr %[[ATOMIC_TEMP_LOAD]], align 8
+// CHECK: store { float, float } %[[LOADED_VAL]], ptr %[[a]], align 4
+// CHECK: ret void
+// CHECK: }
+
+ %0 = llvm.mlir.constant(1 : i64) : i64
+ %1 = llvm.alloca %0 x !llvm.struct<(f32, f32)> {bindc_name = "ib"} : (i64) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1 : i64) : i64
+ %3 = llvm.alloca %2 x !llvm.struct<(f32, f32)> {bindc_name = "ia"} : (i64) -> !llvm.ptr
+ omp.atomic.read %1 = %3 : !llvm.ptr, !llvm.struct<(f32, f32)>
+ llvm.return
+}
+
+// -----
+
// Checking an order-dependent operation when the order is `expr binop x`
// CHECK-LABEL: @omp_atomic_update_ordering
// CHECK-SAME: (ptr %[[x:.*]], i32 %[[expr:.*]])
|
Thanks for your work on this. One of our tests segfaults at runtime. Here is a minimal reproducer (that should loop forever)
I am testing on aarch64 (aws graviton 3) and using the atomic implementation from compiler-rt. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tblah Is the segfault with this patch or without it?
OpenMPIRBuilder::AtomicInfo atomicInfo( | ||
&Builder, XElemTy, LoadSize * 8, LoadSize * 8, OldVal->getAlign(), | ||
OldVal->getAlign(), true /* UseLibcall */, X.Var); | ||
auto AtomicLoadRes = atomicInfo.EmitAtomicLoadLibcall(AO); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Libcalls should only be used if the device does not support atomicrmw
of that size. E.g. when the struct is only 4 bytes long, emit atomicrmw
instead of call __atomic_load
.
I think this is part of the larger issue of handling atomics properly. For the immediate problem, this should work, but I suggest to add a TODO/FIXME comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur I have added a FIXME for this point. Thanks. If everything else looks okay, can we merge this?
The segfault is with this patch. Without this patch flang-new hits an assertion failure. |
Thanks @tblah for testing this PR. Without this patch, I believe the assertion would be related to emitting an atomic read on non-integer types. I'll investigate on this and get back. |
Hi @tblah, Thanks for the test. I could reproduce the issue with my patch. GDB shows the segfault is in atomic_load. I have a couple of questions regarding the test that can help in triaging this better: I was not able to reproduce the segfault on the following test, which is strange, because I did not expect the segfault to depend on
Moreover, the following test (where I remove the
Taking a step further, if I remove the
|
Thanks for taking a look.
I have tried this using both gnu libatomic and compiler-rt so I expect the bug is in what we are passing to I get the same results on your three examples:
It isn't clear to me if the runtime crash in my test is the same bug or not. It might be. I have seen those compiler crashes before while making changes to I don't have a machine available to confirm, but I would guess (2) and (3) are reproducible on x86? |
The LLVM IR for your final example right before the code extractor runs is as follows:
See the controlflow graph with My guess is that this is a different bug that the code extractor can't handle infinite loops in the control flow graph - it is looking for live out values from that CFG region but there's no way out of the region. A much simpler example also crashes the compiler:
Surprisingly, equivalent C code does not crash the compiler. In conclusion I think the compiler crash and the executable crash are different bugs. I will create a ticket for the compiler crash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the updates.
I still have a few failing tests on my end that I need to minimize and figure out if they are really bugs with the atomic implementation. But I think in the meantime it is better to merge this because it does work for most cases.
1caff48
to
82df319
Compare
This patch adds functionality for atomically reading
llvm.struct
types.Fixes: #93441