[llvm][OpenMP] Add implicit cast to omp.atomic.read #114659

NimishMishra · 2024-11-02T10:14:18Z

Should the operands of omp.atomic.read differ, emit an implicit cast. In case of struct arguments, extract the 0-th index, emit an implicit cast if required, and store at the destination.

Fixes #112908

llvmbot · 2024-11-02T10:14:51Z

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-flang-openmp

Author: None (NimishMishra)

Changes

Should the operands of omp.atomic.read differ, emit an implicit cast. In case of struct arguments, extract the 0-th index, emit an implicit cast if required, and store at the destination.

Fixes #112908

Full diff: https://github.com/llvm/llvm-project/pull/114659.diff

2 Files Affected:

(modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+30)
(modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+71)

diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index d2e4dc1c85dfd2..9e69b730884ff6 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -265,6 +265,32 @@ computeOpenMPScheduleType(ScheduleKind ClauseKind, bool HasChunks,
   return Result;
 }
 
+/// Emit an implicit cast to convert \p XRead to type of variable \p V
+static llvm::Value *emitImplicitCast(IRBuilder<> &Builder, llvm::Value *XRead,
+                                     llvm::Value *V) {
+  llvm::Type *XReadType = XRead->getType();
+  llvm::Type *VType = V->getType();
+  if (llvm::AllocaInst *vAlloca = dyn_cast<llvm::AllocaInst>(V))
+    VType = vAlloca->getAllocatedType();
+
+  if (XReadType->isStructTy() && VType->isStructTy())
+    // No need to extract or convert. A direct
+    // `store` will suffice.
+    return XRead;
+
+  if (XRead->getType()->isStructTy())
+    XRead = Builder.CreateExtractValue(XRead, /*Idxs=*/0);
+  if (VType->isIntegerTy() && XReadType->isFloatingPointTy())
+    XRead = Builder.CreateFPToSI(XRead, VType);
+  else if (VType->isFloatingPointTy() && XReadType->isIntegerTy())
+    XRead = Builder.CreateSIToFP(XRead, VType);
+  else if (VType->isIntegerTy() && XReadType->isIntegerTy())
+    XRead = Builder.CreateIntCast(XRead, VType, true);
+  else if (VType->isFloatingPointTy() && XReadType->isFloatingPointTy())
+    XRead = Builder.CreateFPCast(XRead, VType);
+  return XRead;
+}
+
 /// Make \p Source branch to \p Target.
 ///
 /// Handles two situations:
@@ -8076,6 +8102,8 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc,
     }
   }
   checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Read);
+  if (XRead->getType() != V.Var->getType())
+    XRead = emitImplicitCast(Builder, XRead, V.Var);
   Builder.CreateStore(XRead, V.Var, V.IsVolatile);
   return Builder.saveIP();
 }
@@ -8360,6 +8388,8 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicCapture(
     return AtomicResult.takeError();
   Value *CapturedVal =
       (IsPostfixUpdate ? AtomicResult->first : AtomicResult->second);
+  if (CapturedVal->getType() != V.Var->getType())
+    CapturedVal = emitImplicitCast(Builder, CapturedVal, V.Var);
   Builder.CreateStore(CapturedVal, V.Var, V.IsVolatile);
 
   checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Capture);
diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
index 49f9f3562c78b5..68dd9a66652b1c 100644
--- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
@@ -1368,6 +1368,77 @@ llvm.func @omp_atomic_read(%arg0 : !llvm.ptr, %arg1 : !llvm.ptr) -> () {
 
 // -----
 
+// CHECK-LABEL: @omp_atomic_read_implicit_cast
+llvm.func @omp_atomic_read_implicit_cast () {
+//CHECK: %[[Z:.*]] = alloca float, i64 1, align 4
+//CHECK: %[[Y:.*]] = alloca double, i64 1, align 8
+//CHECK: %[[X:.*]] = alloca [2 x { float, float }], i64 1, align 8
+//CHECK: %[[W:.*]] = alloca i32, i64 1, align 4
+//CHECK: %[[X_ELEMENT:.*]] = getelementptr { float, float }, ptr %3, i64 0
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %1 = llvm.alloca %0 x f32 {bindc_name = "z"} : (i64) -> !llvm.ptr
+  %2 = llvm.mlir.constant(1 : i64) : i64
+  %3 = llvm.alloca %2 x f64 {bindc_name = "y"} : (i64) -> !llvm.ptr
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  %5 = llvm.alloca %4 x !llvm.array<2 x struct<(f32, f32)>> {bindc_name = "x"} : (i64) -> !llvm.ptr
+  %6 = llvm.mlir.constant(1 : i64) : i64
+  %7 = llvm.alloca %6 x i32 {bindc_name = "w"} : (i64) -> !llvm.ptr
+  %8 = llvm.mlir.constant(1 : index) : i64
+  %9 = llvm.mlir.constant(2 : index) : i64
+  %10 = llvm.mlir.constant(1 : i64) : i64
+  %11 = llvm.mlir.constant(0 : i64) : i64
+  %12 = llvm.sub %8, %10 overflow<nsw> : i64
+  %13 = llvm.mul %12, %10 overflow<nsw> : i64
+  %14 = llvm.mul %13, %10 overflow<nsw> : i64
+  %15 = llvm.add %14, %11 overflow<nsw> : i64
+  %16 = llvm.mul %10, %9 overflow<nsw> : i64
+  %17 = llvm.getelementptr %5[%15] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.struct<(f32, f32)>
+
+//CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = alloca { float, float }, align 8
+//CHECK: call void @__atomic_load(i64 8, ptr %[[X_ELEMENT]], ptr %[[ATOMIC_LOAD_TEMP]], i32 0)
+//CHECK: %[[LOAD:.*]] = load { float, float }, ptr %[[ATOMIC_LOAD_TEMP]], align 8
+//CHECK: %[[EXT:.*]] = extractvalue { float, float } %[[LOAD]], 0
+//CHECK: store float %[[EXT]], ptr %[[Y]], align 4
+  omp.atomic.read %3 = %17 : !llvm.ptr, !llvm.struct<(f32, f32)>
+
+//CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4
+//CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float
+//CHECK: %[[LOAD:.*]] = fpext float %[[CAST]] to double
+//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8
+  omp.atomic.read %3 = %1 : !llvm.ptr, f32
+
+//CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4
+//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to double
+//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8
+  omp.atomic.read %3 = %7 : !llvm.ptr, i32
+
+//CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4
+//CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double
+//CHECK: %[[LOAD:.*]] = fptrunc double %[[CAST]] to float
+//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4
+  omp.atomic.read %1 = %3 : !llvm.ptr, f64
+
+//CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4
+//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to float
+//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4
+  omp.atomic.read %1 = %7 : !llvm.ptr, i32
+
+//CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4
+//CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double
+//CHECK: %[[LOAD:.*]] = fptosi double %[[CAST]] to i32
+//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4
+  omp.atomic.read %7 = %3 : !llvm.ptr, f64
+
+//CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4
+//CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float
+//CHECK: %[[LOAD:.*]] = fptosi float %[[CAST]] to i32
+//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4
+  omp.atomic.read %7 = %1 : !llvm.ptr, f32
+  llvm.return
+}
+
+// -----
+
 // CHECK-LABEL: @omp_atomic_write
 // CHECK-SAME: (ptr %[[x:.*]], i32 %[[expr:.*]])
 llvm.func @omp_atomic_write(%x: !llvm.ptr, %expr: i32) -> () {

tblah

Thanks for circling back and fixing this. I have a few minor comments

tblah · 2024-11-04T10:35:41Z

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

+  llvm::Type *XReadType = XRead->getType();
+  llvm::Type *VType = V->getType();
+  if (llvm::AllocaInst *vAlloca = dyn_cast<llvm::AllocaInst>(V))
+    VType = vAlloca->getAllocatedType();


Why would AllocaInst::getAllocatedType be different to the value's type?

Because the type of atomic pointer by default is ptr. We would like to know the real type (i.e. AllocatedType) in order to know whether we need a conversion or an extraction (in case of complex type).

tblah · 2024-11-04T10:37:08Z

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

+    // `store` will suffice.
+    return XRead;
+
+  if (XRead->getType()->isStructTy())


Why not use XReadType here?

tblah · 2024-11-04T10:41:37Z

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

+
+  if (XRead->getType()->isStructTy())
+    XRead = Builder.CreateExtractValue(XRead, /*Idxs=*/0);
+  if (VType->isIntegerTy() && XReadType->isFloatingPointTy())


nit: I think this can be else if

Actually if you consider the test case:

program atomic_array complex :: x(2) integer :: y x(1) = (1.0, 1.0) x(2) = (2.0, 2.0) !$omp parallel !$omp atomic read y = x(1) !$omp end parallel print *, y end program

Then we need to extract the real component of complex into a float, and store it to integer. We thus need this check in place to ensure the conversion takes place.

Meinersbur

I wonder we should use #101966 instead which handles type punning consistently by itself (https://github.com/llvm/llvm-project/blob/ee84c6bb3146d25f21c14d6d8e0abb794683f5ec/llvm/lib/Transforms/Utils/BuildBuiltins.cpp#L123-L131)

Meinersbur · 2024-11-04T13:38:16Z

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

+    XRead = Builder.CreateFPToSI(XRead, VType);
+  else if (VType->isFloatingPointTy() && XReadType->isIntegerTy())
+    XRead = Builder.CreateSIToFP(XRead, VType);


An atomic read should not have implicit semantics of converting between int and fp. That could cause some trouble, e.g. rounding.

Meinersbur · 2024-11-04T13:39:20Z

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

+  if (XRead->getType() != V.Var->getType())
+    XRead = emitImplicitCast(Builder, XRead, V.Var);


The code above seems to already handle casting?

NimishMishra · 2024-11-04T14:53:54Z

I wonder we should use #101966 instead which handles type punning consistently by itself

I think we can. But how do go ahead with changes in Clang codegen? Will there be opposition from community if we try to merge a solution like #101966?

Meinersbur · 2024-11-05T13:36:42Z

I would remove the changes to Clang and AtomicsExpandPass (in the first iteration). Those were for checking whether the API is sufficiently general.

Meinersbur · 2024-11-07T18:12:53Z

I added emitAtomicLoadBuiltin and emitAtomicStoreBuiltin into #101966, and use it within OpenMPIRBuilder. The patch is not ready yet, tests are not updated, tits probably not yet completely correct, and code duplication needs to be reduced.

However, I hope the benfits become clear:

Just one case instead of multiple depending on data type.
All data types (incl. complete) able to use the cmpxchg,load,store,atomicrmw etc instead of the libcall fallback. Only the data size matters, not the type.

NimishMishra · 2024-11-18T10:13:50Z

I added emitAtomicLoadBuiltin and emitAtomicStoreBuiltin into #101966, and use it within OpenMPIRBuilder. The patch is not ready yet, tests are not updated, tits probably not yet completely correct, and code duplication needs to be reduced.

Thanks for your work on this. I went through the patch, and it does seem this approach is better. I see this PR can also help with sized libcalls.

I wonder whether work on this needs a separate PR, or whether we can merge with the current PR? Since, it is more generic and solves multiple issues at once. I am okay with picking this work and opening a new PR, where we can discuss improvements and iterate over it. Let me know what you think.

NimishMishra · 2024-11-29T09:25:17Z

I wonder whether work on this needs a separate PR, or whether we can merge with the current PR? Since, it is more generic and solves multiple issues at once. I am okay with picking this work and opening a new PR, where we can discuss improvements and iterate over it. Let me know what you think.

I discussed this with @kiranktp. We will work on #101966 as a separate PR, since this approach can help with multiple issues (like sized libcalls in atomic).

If everyone is okay, we can merge this current PR for the time being. This would unblock progress towards removal of the experimental message currently. I'll revise this PR with @tblah's comments.

tblah · 2025-01-14T10:21:44Z

@NimishMishra is this patch still needed?

NimishMishra · 2025-01-14T11:51:14Z

@NimishMishra is this patch still needed?

I'm sorry; this fell off my radar. Yes, without this, the test-cases in #112908 will segfault. I have addressed your comments, and have added a TODO based on what Michael suggested (and added that to our internal tracking system too). I'll work on integrating Michael's solution. In the meantime, we can probably go with this.

tblah

LGTM, thanks!

NimishMishra requested review from ergawy, Meinersbur, tblah, kiranchandramohan and kiranktp November 2, 2024 10:14

llvmbot added mlir:llvm mlir flang:openmp clang:openmp OpenMP related changes to Clang labels Nov 2, 2024

NimishMishra changed the title ~~[llvm] Add implicit cast to omp.atomic.read~~ [llvm][OpenMP] Add implicit cast to omp.atomic.read Nov 2, 2024

tblah reviewed Nov 4, 2024

View reviewed changes

Meinersbur reviewed Nov 4, 2024

View reviewed changes

[llvm][OpenMP] Add implicit cast to omp.atomic.read

dcdc2b7

NimishMishra force-pushed the atomic_read_implicit_cast branch from 08c34da to dcdc2b7 Compare January 14, 2025 11:48

tblah approved these changes Jan 15, 2025

View reviewed changes

NimishMishra merged commit d7e48fb into llvm:main Jan 17, 2025
8 checks passed

NimishMishra deleted the atomic_read_implicit_cast branch January 17, 2025 09:40

		if (XRead->getType() != V.Var->getType())
		XRead = emitImplicitCast(Builder, XRead, V.Var);

[llvm][OpenMP] Add implicit cast to omp.atomic.read #114659

[llvm][OpenMP] Add implicit cast to omp.atomic.read #114659

Uh oh!

Conversation

NimishMishra commented Nov 2, 2024

Uh oh!

llvmbot commented Nov 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

tblah Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

NimishMishra Jan 14, 2025

Choose a reason for hiding this comment

Uh oh!

tblah Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

tblah Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

NimishMishra Jan 14, 2025

Choose a reason for hiding this comment

Uh oh!

Meinersbur left a comment

Choose a reason for hiding this comment

Uh oh!

Meinersbur Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

Meinersbur Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

NimishMishra commented Nov 4, 2024

Uh oh!

Meinersbur commented Nov 5, 2024

Uh oh!

Meinersbur commented Nov 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NimishMishra commented Nov 18, 2024

Uh oh!

NimishMishra commented Nov 29, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tblah commented Jan 14, 2025

Uh oh!

NimishMishra commented Jan 14, 2025

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Nov 2, 2024 •

edited

Loading

Meinersbur commented Nov 7, 2024 •

edited

Loading

NimishMishra commented Nov 29, 2024 •

edited

Loading