-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[mlir][OpenMP] Convert omp.cancel sections to LLVMIR #137193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-flang-openmp Author: Tom Eccles (tblah) ChangesThis is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available. Full diff: https://github.com/llvm/llvm-project/pull/137193.diff 4 Files Affected:
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index be05f01c94603..3f19088e6c73d 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -2172,6 +2172,9 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
if (!updateToLocation(Loc))
return Loc.IP;
+ // FiniCBWrapper needs to create a branch to the loop finalization block, but
+ // this has not been created yet at some times when this callback runs.
+ SmallVector<BranchInst *> CancellationBranches;
auto FiniCBWrapper = [&](InsertPointTy IP) {
if (IP.getBlock()->end() != IP.getPoint())
return FiniCB(IP);
@@ -2179,16 +2182,9 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
// will fail because that function requires the Finalization Basic Block to
// have a terminator, which is already removed by EmitOMPRegionBody.
// IP is currently at cancelation block.
- // We need to backtrack to the condition block to fetch
- // the exit block and create a branch from cancelation
- // to exit block.
- IRBuilder<>::InsertPointGuard IPG(Builder);
- Builder.restoreIP(IP);
- auto *CaseBB = IP.getBlock()->getSinglePredecessor();
- auto *CondBB = CaseBB->getSinglePredecessor()->getSinglePredecessor();
- auto *ExitBB = CondBB->getTerminator()->getSuccessor(1);
- Instruction *I = Builder.CreateBr(ExitBB);
- IP = InsertPointTy(I->getParent(), I->getIterator());
+ BranchInst *DummyBranch = Builder.CreateBr(IP.getBlock());
+ IP = InsertPointTy(DummyBranch->getParent(), DummyBranch->getIterator());
+ CancellationBranches.push_back(DummyBranch);
return FiniCB(IP);
};
@@ -2251,6 +2247,9 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
return WsloopIP.takeError();
InsertPointTy AfterIP = *WsloopIP;
+ BasicBlock *LoopFini = AfterIP.getBlock()->getSinglePredecessor();
+ assert(LoopFini && "Bad structure of static workshare loop finalization");
+
// Apply the finalization callback in LoopAfterBB
auto FiniInfo = FinalizationStack.pop_back_val();
assert(FiniInfo.DK == OMPD_sections &&
@@ -2264,6 +2263,14 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections(
AfterIP = {FiniBB, FiniBB->begin()};
}
+ // Now we can fix the dummy branch to point to the right place
+ if (!CancellationBranches.empty()) {
+ for (BranchInst *DummyBranch : CancellationBranches) {
+ assert(DummyBranch->getNumSuccessors() == 1);
+ DummyBranch->setSuccessor(0, LoopFini);
+ }
+ }
+
return AfterIP;
}
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 6185a433a8199..d1885641f389d 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -161,7 +161,8 @@ static LogicalResult checkImplementationStatus(Operation &op) {
auto checkCancelDirective = [&todo](auto op, LogicalResult &result) {
omp::ClauseCancellationConstructType cancelledDirective =
op.getCancelDirective();
- if (cancelledDirective != omp::ClauseCancellationConstructType::Parallel)
+ if (cancelledDirective != omp::ClauseCancellationConstructType::Parallel &&
+ cancelledDirective != omp::ClauseCancellationConstructType::Sections)
result = todo("cancel directive");
};
auto checkDepend = [&todo](auto op, LogicalResult &result) {
@@ -1690,10 +1691,11 @@ convertOmpSections(Operation &opInst, llvm::IRBuilderBase &builder,
auto finiCB = [&](InsertPointTy codeGenIP) { return llvm::Error::success(); };
allocaIP = findAllocaInsertPoint(builder, moduleTranslation);
+ bool isCancellable = constructIsCancellable(sectionsOp);
llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder);
llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterIP =
moduleTranslation.getOpenMPBuilder()->createSections(
- ompLoc, allocaIP, sectionCBs, privCB, finiCB, false,
+ ompLoc, allocaIP, sectionCBs, privCB, finiCB, isCancellable,
sectionsOp.getNowait());
if (failed(handleError(afterIP, opInst)))
diff --git a/mlir/test/Target/LLVMIR/openmp-cancel.mlir b/mlir/test/Target/LLVMIR/openmp-cancel.mlir
index 1f67d6ceb34af..fca16b936fc85 100644
--- a/mlir/test/Target/LLVMIR/openmp-cancel.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-cancel.mlir
@@ -80,3 +80,79 @@ llvm.func @cancel_parallel_if(%arg0 : i1) {
// CHECK: br label %[[VAL_23]]
// CHECK: omp.par.exit.exitStub: ; preds = %[[VAL_31]], %[[VAL_26]]
// CHECK: ret void
+
+llvm.func @cancel_sections_if(%cond : i1) {
+ omp.sections {
+ omp.section {
+ omp.cancel cancellation_construct_type(sections) if(%cond)
+ omp.terminator
+ }
+ omp.terminator
+ }
+ llvm.return
+}
+// CHECK-LABEL: define void @cancel_sections_if
+// CHECK: %[[VAL_0:.*]] = alloca i32, align 4
+// CHECK: %[[VAL_1:.*]] = alloca i32, align 4
+// CHECK: %[[VAL_2:.*]] = alloca i32, align 4
+// CHECK: %[[VAL_3:.*]] = alloca i32, align 4
+// CHECK: br label %[[VAL_4:.*]]
+// CHECK: entry: ; preds = %[[VAL_5:.*]]
+// CHECK: br label %[[VAL_6:.*]]
+// CHECK: omp_section_loop.preheader: ; preds = %[[VAL_4]]
+// CHECK: store i32 0, ptr %[[VAL_1]], align 4
+// CHECK: store i32 0, ptr %[[VAL_2]], align 4
+// CHECK: store i32 1, ptr %[[VAL_3]], align 4
+// CHECK: %[[VAL_7:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK: call void @__kmpc_for_static_init_4u(ptr @1, i32 %[[VAL_7]], i32 34, ptr %[[VAL_0]], ptr %[[VAL_1]], ptr %[[VAL_2]], ptr %[[VAL_3]], i32 1, i32 0)
+// CHECK: %[[VAL_8:.*]] = load i32, ptr %[[VAL_1]], align 4
+// CHECK: %[[VAL_9:.*]] = load i32, ptr %[[VAL_2]], align 4
+// CHECK: %[[VAL_10:.*]] = sub i32 %[[VAL_9]], %[[VAL_8]]
+// CHECK: %[[VAL_11:.*]] = add i32 %[[VAL_10]], 1
+// CHECK: br label %[[VAL_12:.*]]
+// CHECK: omp_section_loop.header: ; preds = %[[VAL_13:.*]], %[[VAL_6]]
+// CHECK: %[[VAL_14:.*]] = phi i32 [ 0, %[[VAL_6]] ], [ %[[VAL_15:.*]], %[[VAL_13]] ]
+// CHECK: br label %[[VAL_16:.*]]
+// CHECK: omp_section_loop.cond: ; preds = %[[VAL_12]]
+// CHECK: %[[VAL_17:.*]] = icmp ult i32 %[[VAL_14]], %[[VAL_11]]
+// CHECK: br i1 %[[VAL_17]], label %[[VAL_18:.*]], label %[[VAL_19:.*]]
+// CHECK: omp_section_loop.body: ; preds = %[[VAL_16]]
+// CHECK: %[[VAL_20:.*]] = add i32 %[[VAL_14]], %[[VAL_8]]
+// CHECK: %[[VAL_21:.*]] = mul i32 %[[VAL_20]], 1
+// CHECK: %[[VAL_22:.*]] = add i32 %[[VAL_21]], 0
+// CHECK: switch i32 %[[VAL_22]], label %[[VAL_23:.*]] [
+// CHECK: i32 0, label %[[VAL_24:.*]]
+// CHECK: ]
+// CHECK: omp_section_loop.body.case: ; preds = %[[VAL_18]]
+// CHECK: br label %[[VAL_25:.*]]
+// CHECK: omp.section.region: ; preds = %[[VAL_24]]
+// CHECK: br i1 %[[VAL_26:.*]], label %[[VAL_27:.*]], label %[[VAL_28:.*]]
+// CHECK: 9: ; preds = %[[VAL_25]]
+// CHECK: %[[VAL_29:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK: %[[VAL_30:.*]] = call i32 @__kmpc_cancel(ptr @1, i32 %[[VAL_29]], i32 3)
+// CHECK: %[[VAL_31:.*]] = icmp eq i32 %[[VAL_30]], 0
+// CHECK: br i1 %[[VAL_31]], label %[[VAL_32:.*]], label %[[VAL_33:.*]]
+// CHECK: .split: ; preds = %[[VAL_27]]
+// CHECK: br label %[[VAL_34:.*]]
+// CHECK: 12: ; preds = %[[VAL_25]]
+// CHECK: br label %[[VAL_34]]
+// CHECK: 13: ; preds = %[[VAL_28]], %[[VAL_32]]
+// CHECK: br label %[[VAL_35:.*]]
+// CHECK: omp.region.cont: ; preds = %[[VAL_34]]
+// CHECK: br label %[[VAL_23]]
+// CHECK: omp_section_loop.body.sections.after: ; preds = %[[VAL_35]], %[[VAL_18]]
+// CHECK: br label %[[VAL_13]]
+// CHECK: omp_section_loop.inc: ; preds = %[[VAL_23]]
+// CHECK: %[[VAL_15]] = add nuw i32 %[[VAL_14]], 1
+// CHECK: br label %[[VAL_12]]
+// CHECK: omp_section_loop.exit: ; preds = %[[VAL_33]], %[[VAL_16]]
+// CHECK: call void @__kmpc_for_static_fini(ptr @1, i32 %[[VAL_7]])
+// CHECK: %[[VAL_36:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK: call void @__kmpc_barrier(ptr @2, i32 %[[VAL_36]])
+// CHECK: br label %[[VAL_37:.*]]
+// CHECK: omp_section_loop.after: ; preds = %[[VAL_19]]
+// CHECK: br label %[[VAL_38:.*]]
+// CHECK: omp_section_loop.aftersections.fini: ; preds = %[[VAL_37]]
+// CHECK: ret void
+// CHECK: .cncl: ; preds = %[[VAL_27]]
+// CHECK: br label %[[VAL_19]]
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index bf251ac2b7d0a..6f904d0647285 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -42,22 +42,6 @@ llvm.func @cancel_wsloop(%lb : i32, %ub : i32, %step: i32) {
// -----
-llvm.func @cancel_sections() {
- // expected-error@below {{LLVM Translation failed for operation: omp.sections}}
- omp.sections {
- omp.section {
- // expected-error@below {{not yet implemented: Unhandled clause cancel directive in omp.cancel operation}}
- // expected-error@below {{LLVM Translation failed for operation: omp.cancel}}
- omp.cancel cancellation_construct_type(sections)
- omp.terminator
- }
- omp.terminator
- }
- llvm.return
-}
-
-// -----
-
llvm.func @cancel_taskgroup() {
// expected-error@below {{LLVM Translation failed for operation: omp.taskgroup}}
omp.taskgroup {
|
PR Stack:
|
aa2445b
to
4d28bd8
Compare
3a13116
to
4fb5d42
Compare
4d28bd8
to
f3b8eeb
Compare
This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.
This conflicted with my hack doing the same thing: leading to a use-after-free of the old terminator. I think it is more helpful to do this in OpenMPIRBuilder than replicating the hack in callbacks in both clang and mlir.
60592cf
to
6c678b7
Compare
It turned out that the Windows failures were due to some UB in my patch. Clang (with the OMPIRBuilder backend) was actually doing almost exactly the same hack as I implement in this patch, which then interacted badly by my attempts to do the same thing in OMPIRBuilder. I have removed the clang code because I think it is better to have the hack in one place in OMPIRBuilder instead of in two different callbacks. This should be ready for review again. |
The test failures look unrelated to my patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -2264,6 +2263,14 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections( | |||
AfterIP = {FiniBB, FiniBB->begin()}; | |||
} | |||
|
|||
// Now we can fix the dummy branch to point to the right place | |||
if (!CancellationBranches.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Condition is unnecessary: If CancellationBranches
is empty, the loop will not execute anyway
This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.
This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.
This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.
This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.
This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.
This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well.
Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini.
This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated.
To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.