Skip to content

[mlir][OpenMP] Convert omp.cancel sections to LLVMIR #137193

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 29, 2025

Conversation

tblah
Copy link
Contributor

@tblah tblah commented Apr 24, 2025

This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.

@llvmbot
Copy link
Member

llvmbot commented Apr 24, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-openmp

@llvm/pr-subscribers-flang-openmp

Author: Tom Eccles (tblah)

Changes

This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.


Full diff: https://github.com/llvm/llvm-project/pull/137193.diff

4 Files Affected:

  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+17-10)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+4-2)
  • (modified) mlir/test/Target/LLVMIR/openmp-cancel.mlir (+76)
  • (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-16)
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index be05f01c94603..3f19088e6c73d 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -2172,6 +2172,9 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
   if (!updateToLocation(Loc))
     return Loc.IP;
 
+  // FiniCBWrapper needs to create a branch to the loop finalization block, but
+  // this has not been created yet at some times when this callback runs.
+  SmallVector<BranchInst *> CancellationBranches;
   auto FiniCBWrapper = [&](InsertPointTy IP) {
     if (IP.getBlock()->end() != IP.getPoint())
       return FiniCB(IP);
@@ -2179,16 +2182,9 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
     // will fail because that function requires the Finalization Basic Block to
     // have a terminator, which is already removed by EmitOMPRegionBody.
     // IP is currently at cancelation block.
-    // We need to backtrack to the condition block to fetch
-    // the exit block and create a branch from cancelation
-    // to exit block.
-    IRBuilder<>::InsertPointGuard IPG(Builder);
-    Builder.restoreIP(IP);
-    auto *CaseBB = IP.getBlock()->getSinglePredecessor();
-    auto *CondBB = CaseBB->getSinglePredecessor()->getSinglePredecessor();
-    auto *ExitBB = CondBB->getTerminator()->getSuccessor(1);
-    Instruction *I = Builder.CreateBr(ExitBB);
-    IP = InsertPointTy(I->getParent(), I->getIterator());
+    BranchInst *DummyBranch = Builder.CreateBr(IP.getBlock());
+    IP = InsertPointTy(DummyBranch->getParent(), DummyBranch->getIterator());
+    CancellationBranches.push_back(DummyBranch);
     return FiniCB(IP);
   };
 
@@ -2251,6 +2247,9 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
     return WsloopIP.takeError();
   InsertPointTy AfterIP = *WsloopIP;
 
+  BasicBlock *LoopFini = AfterIP.getBlock()->getSinglePredecessor();
+  assert(LoopFini && "Bad structure of static workshare loop finalization");
+
   // Apply the finalization callback in LoopAfterBB
   auto FiniInfo = FinalizationStack.pop_back_val();
   assert(FiniInfo.DK == OMPD_sections &&
@@ -2264,6 +2263,14 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
     AfterIP = {FiniBB, FiniBB->begin()};
   }
 
+  // Now we can fix the dummy branch to point to the right place
+  if (!CancellationBranches.empty()) {
+    for (BranchInst *DummyBranch : CancellationBranches) {
+      assert(DummyBranch->getNumSuccessors() == 1);
+      DummyBranch->setSuccessor(0, LoopFini);
+    }
+  }
+
   return AfterIP;
 }
 
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 6185a433a8199..d1885641f389d 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -161,7 +161,8 @@ static LogicalResult checkImplementationStatus(Operation &op) {
   auto checkCancelDirective = [&todo](auto op, LogicalResult &result) {
     omp::ClauseCancellationConstructType cancelledDirective =
         op.getCancelDirective();
-    if (cancelledDirective != omp::ClauseCancellationConstructType::Parallel)
+    if (cancelledDirective != omp::ClauseCancellationConstructType::Parallel &&
+        cancelledDirective != omp::ClauseCancellationConstructType::Sections)
       result = todo("cancel directive");
   };
   auto checkDepend = [&todo](auto op, LogicalResult &result) {
@@ -1690,10 +1691,11 @@ convertOmpSections(Operation &opInst, llvm::IRBuilderBase &builder,
   auto finiCB = [&](InsertPointTy codeGenIP) { return llvm::Error::success(); };
 
   allocaIP = findAllocaInsertPoint(builder, moduleTranslation);
+  bool isCancellable = constructIsCancellable(sectionsOp);
   llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder);
   llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterIP =
       moduleTranslation.getOpenMPBuilder()->createSections(
-          ompLoc, allocaIP, sectionCBs, privCB, finiCB, false,
+          ompLoc, allocaIP, sectionCBs, privCB, finiCB, isCancellable,
           sectionsOp.getNowait());
 
   if (failed(handleError(afterIP, opInst)))
diff --git a/mlir/test/Target/LLVMIR/openmp-cancel.mlir b/mlir/test/Target/LLVMIR/openmp-cancel.mlir
index 1f67d6ceb34af..fca16b936fc85 100644
--- a/mlir/test/Target/LLVMIR/openmp-cancel.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-cancel.mlir
@@ -80,3 +80,79 @@ llvm.func @cancel_parallel_if(%arg0 : i1) {
 // CHECK:         br label %[[VAL_23]]
 // CHECK:       omp.par.exit.exitStub:                            ; preds = %[[VAL_31]], %[[VAL_26]]
 // CHECK:         ret void
+
+llvm.func @cancel_sections_if(%cond : i1) {
+  omp.sections {
+    omp.section {
+      omp.cancel cancellation_construct_type(sections) if(%cond)
+      omp.terminator
+    }
+    omp.terminator
+  }
+  llvm.return
+}
+// CHECK-LABEL: define void @cancel_sections_if
+// CHECK:         %[[VAL_0:.*]] = alloca i32, align 4
+// CHECK:         %[[VAL_1:.*]] = alloca i32, align 4
+// CHECK:         %[[VAL_2:.*]] = alloca i32, align 4
+// CHECK:         %[[VAL_3:.*]] = alloca i32, align 4
+// CHECK:         br label %[[VAL_4:.*]]
+// CHECK:       entry:                                            ; preds = %[[VAL_5:.*]]
+// CHECK:         br label %[[VAL_6:.*]]
+// CHECK:       omp_section_loop.preheader:                       ; preds = %[[VAL_4]]
+// CHECK:         store i32 0, ptr %[[VAL_1]], align 4
+// CHECK:         store i32 0, ptr %[[VAL_2]], align 4
+// CHECK:         store i32 1, ptr %[[VAL_3]], align 4
+// CHECK:         %[[VAL_7:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK:         call void @__kmpc_for_static_init_4u(ptr @1, i32 %[[VAL_7]], i32 34, ptr %[[VAL_0]], ptr %[[VAL_1]], ptr %[[VAL_2]], ptr %[[VAL_3]], i32 1, i32 0)
+// CHECK:         %[[VAL_8:.*]] = load i32, ptr %[[VAL_1]], align 4
+// CHECK:         %[[VAL_9:.*]] = load i32, ptr %[[VAL_2]], align 4
+// CHECK:         %[[VAL_10:.*]] = sub i32 %[[VAL_9]], %[[VAL_8]]
+// CHECK:         %[[VAL_11:.*]] = add i32 %[[VAL_10]], 1
+// CHECK:         br label %[[VAL_12:.*]]
+// CHECK:       omp_section_loop.header:                          ; preds = %[[VAL_13:.*]], %[[VAL_6]]
+// CHECK:         %[[VAL_14:.*]] = phi i32 [ 0, %[[VAL_6]] ], [ %[[VAL_15:.*]], %[[VAL_13]] ]
+// CHECK:         br label %[[VAL_16:.*]]
+// CHECK:       omp_section_loop.cond:                            ; preds = %[[VAL_12]]
+// CHECK:         %[[VAL_17:.*]] = icmp ult i32 %[[VAL_14]], %[[VAL_11]]
+// CHECK:         br i1 %[[VAL_17]], label %[[VAL_18:.*]], label %[[VAL_19:.*]]
+// CHECK:       omp_section_loop.body:                            ; preds = %[[VAL_16]]
+// CHECK:         %[[VAL_20:.*]] = add i32 %[[VAL_14]], %[[VAL_8]]
+// CHECK:         %[[VAL_21:.*]] = mul i32 %[[VAL_20]], 1
+// CHECK:         %[[VAL_22:.*]] = add i32 %[[VAL_21]], 0
+// CHECK:         switch i32 %[[VAL_22]], label %[[VAL_23:.*]] [
+// CHECK:           i32 0, label %[[VAL_24:.*]]
+// CHECK:         ]
+// CHECK:       omp_section_loop.body.case:                       ; preds = %[[VAL_18]]
+// CHECK:         br label %[[VAL_25:.*]]
+// CHECK:       omp.section.region:                               ; preds = %[[VAL_24]]
+// CHECK:         br i1 %[[VAL_26:.*]], label %[[VAL_27:.*]], label %[[VAL_28:.*]]
+// CHECK:       9:                                                ; preds = %[[VAL_25]]
+// CHECK:         %[[VAL_29:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK:         %[[VAL_30:.*]] = call i32 @__kmpc_cancel(ptr @1, i32 %[[VAL_29]], i32 3)
+// CHECK:         %[[VAL_31:.*]] = icmp eq i32 %[[VAL_30]], 0
+// CHECK:         br i1 %[[VAL_31]], label %[[VAL_32:.*]], label %[[VAL_33:.*]]
+// CHECK:       .split:                                           ; preds = %[[VAL_27]]
+// CHECK:         br label %[[VAL_34:.*]]
+// CHECK:       12:                                               ; preds = %[[VAL_25]]
+// CHECK:         br label %[[VAL_34]]
+// CHECK:       13:                                               ; preds = %[[VAL_28]], %[[VAL_32]]
+// CHECK:         br label %[[VAL_35:.*]]
+// CHECK:       omp.region.cont:                                  ; preds = %[[VAL_34]]
+// CHECK:         br label %[[VAL_23]]
+// CHECK:       omp_section_loop.body.sections.after:             ; preds = %[[VAL_35]], %[[VAL_18]]
+// CHECK:         br label %[[VAL_13]]
+// CHECK:       omp_section_loop.inc:                             ; preds = %[[VAL_23]]
+// CHECK:         %[[VAL_15]] = add nuw i32 %[[VAL_14]], 1
+// CHECK:         br label %[[VAL_12]]
+// CHECK:       omp_section_loop.exit:                            ; preds = %[[VAL_33]], %[[VAL_16]]
+// CHECK:         call void @__kmpc_for_static_fini(ptr @1, i32 %[[VAL_7]])
+// CHECK:         %[[VAL_36:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK:         call void @__kmpc_barrier(ptr @2, i32 %[[VAL_36]])
+// CHECK:         br label %[[VAL_37:.*]]
+// CHECK:       omp_section_loop.after:                           ; preds = %[[VAL_19]]
+// CHECK:         br label %[[VAL_38:.*]]
+// CHECK:       omp_section_loop.aftersections.fini:              ; preds = %[[VAL_37]]
+// CHECK:         ret void
+// CHECK:       .cncl:                                            ; preds = %[[VAL_27]]
+// CHECK:         br label %[[VAL_19]]
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index bf251ac2b7d0a..6f904d0647285 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -42,22 +42,6 @@ llvm.func @cancel_wsloop(%lb : i32, %ub : i32, %step: i32) {
 
 // -----
 
-llvm.func @cancel_sections() {
-  // expected-error@below {{LLVM Translation failed for operation: omp.sections}}
-  omp.sections {
-    omp.section {
-      // expected-error@below {{not yet implemented: Unhandled clause cancel directive in omp.cancel operation}}
-      // expected-error@below {{LLVM Translation failed for operation: omp.cancel}}
-      omp.cancel cancellation_construct_type(sections)
-      omp.terminator
-    }
-    omp.terminator
-  }
-  llvm.return
-}
-
-// -----
-
 llvm.func @cancel_taskgroup() {
   // expected-error@below {{LLVM Translation failed for operation: omp.taskgroup}}
   omp.taskgroup {

@tblah
Copy link
Contributor Author

tblah commented Apr 24, 2025

@tblah tblah force-pushed the users/tblah/omp-cancel-codegen-1 branch from aa2445b to 4d28bd8 Compare April 25, 2025 14:17
@tblah tblah force-pushed the users/tblah/omp-cancel-codegen-0 branch from 3a13116 to 4fb5d42 Compare April 26, 2025 11:47
@tblah tblah force-pushed the users/tblah/omp-cancel-codegen-1 branch from 4d28bd8 to f3b8eeb Compare April 26, 2025 11:48
Base automatically changed from users/tblah/omp-cancel-codegen-0 to main April 28, 2025 09:33
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Apr 28, 2025
tblah added 2 commits April 28, 2025 15:09
This is quite ugly but it is the best I could think of. The old
FiniCBWrapper was way too brittle depending upon the exact block
structure inside of the section, and could be confused by any control
flow in the section (e.g. an if clause on cancel). The wording in the
comment and variable names didn't seem to match where it was actually
branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections
branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to
run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy
block, which is then fixed later once all of the information is
available.
This conflicted with my hack doing the same thing: leading to a
use-after-free of the old terminator. I think it is more helpful to do
this in OpenMPIRBuilder than replicating the hack in callbacks in both
clang and mlir.
@tblah tblah force-pushed the users/tblah/omp-cancel-codegen-1 branch from 60592cf to 6c678b7 Compare April 28, 2025 15:09
@tblah
Copy link
Contributor Author

tblah commented Apr 28, 2025

It turned out that the Windows failures were due to some UB in my patch. Clang (with the OMPIRBuilder backend) was actually doing almost exactly the same hack as I implement in this patch, which then interacted badly by my attempts to do the same thing in OMPIRBuilder.

I have removed the clang code because I think it is better to have the hack in one place in OMPIRBuilder instead of in two different callbacks.

This should be ready for review again.

@tblah
Copy link
Contributor Author

tblah commented Apr 28, 2025

The test failures look unrelated to my patch.

Copy link
Member

@Meinersbur Meinersbur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -2264,6 +2263,14 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
AfterIP = {FiniBB, FiniBB->begin()};
}

// Now we can fix the dummy branch to point to the right place
if (!CancellationBranches.empty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Condition is unnecessary: If CancellationBranches is empty, the loop will not execute anyway

@tblah tblah merged commit 7b70fc7 into main Apr 29, 2025
6 of 9 checks passed
@tblah tblah deleted the users/tblah/omp-cancel-codegen-1 branch April 29, 2025 16:19
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This is quite ugly but it is the best I could think of. The old
FiniCBWrapper was way too brittle depending upon the exact block
structure inside of the section, and could be confused by any control
flow in the section (e.g. an if clause on cancel). The wording in the
comment and variable names didn't seem to match where it was actually
branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections
branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to
run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy
block, which is then fixed later once all of the information is
available.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This is quite ugly but it is the best I could think of. The old
FiniCBWrapper was way too brittle depending upon the exact block
structure inside of the section, and could be confused by any control
flow in the section (e.g. an if clause on cancel). The wording in the
comment and variable names didn't seem to match where it was actually
branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections
branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to
run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy
block, which is then fixed later once all of the information is
available.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This is quite ugly but it is the best I could think of. The old
FiniCBWrapper was way too brittle depending upon the exact block
structure inside of the section, and could be confused by any control
flow in the section (e.g. an if clause on cancel). The wording in the
comment and variable names didn't seem to match where it was actually
branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections
branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to
run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy
block, which is then fixed later once all of the information is
available.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
This is quite ugly but it is the best I could think of. The old
FiniCBWrapper was way too brittle depending upon the exact block
structure inside of the section, and could be confused by any control
flow in the section (e.g. an if clause on cancel). The wording in the
comment and variable names didn't seem to match where it was actually
branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections
branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to
run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy
block, which is then fixed later once all of the information is
available.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
This is quite ugly but it is the best I could think of. The old
FiniCBWrapper was way too brittle depending upon the exact block
structure inside of the section, and could be confused by any control
flow in the section (e.g. an if clause on cancel). The wording in the
comment and variable names didn't seem to match where it was actually
branching too as well.

Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections
branches to a block containing __kmpc_for_static_fini.

This was hard to achieve here because sometimes the FiniCBWrapper has to
run before the worksharing loop finalization has been crated.

To get around this ordering issue I created a dummy branch to a dummy
block, which is then fixed later once all of the information is
available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category flang:openmp mlir:llvm mlir:openmp mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants