-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[MLIR] Improve KernelOutlining to avoid introducing an extra block #90359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-gpu Author: Mehdi Amini (joker-eph) ChangesThis fixes a TODO in the code. Full diff: https://github.com/llvm/llvm-project/pull/90359.diff 2 Files Affected:
diff --git a/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp b/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
index 2436113dc4239c..03c59baad0f805 100644
--- a/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
+++ b/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
@@ -241,24 +241,19 @@ static gpu::GPUFuncOp outlineKernelFuncImpl(gpu::LaunchOp launchOp,
map.map(operand.value(), entryBlock.getArgument(operand.index()));
// Clone the region of the gpu.launch operation into the gpu.func operation.
- // TODO: If cloneInto can be modified such that if a mapping for
- // a block exists, that block will be used to clone operations into (at the
- // end of the block), instead of creating a new block, this would be much
- // cleaner.
launchOpBody.cloneInto(&outlinedFuncBody, map);
- // Branch from entry of the gpu.func operation to the block that is cloned
- // from the entry block of the gpu.launch operation.
- Block &launchOpEntry = launchOpBody.front();
- Block *clonedLaunchOpEntry = map.lookup(&launchOpEntry);
- builder.setInsertionPointToEnd(&entryBlock);
- builder.create<cf::BranchOp>(loc, clonedLaunchOpEntry);
-
- outlinedFunc.walk([](gpu::TerminatorOp op) {
- OpBuilder replacer(op);
- replacer.create<gpu::ReturnOp>(op.getLoc());
- op.erase();
- });
+ // Splice now the entry block of the gpu.launch operation at the end of the
+ // gpu.func entry block and erase the redundant block.
+ Block *clonedLaunchOpEntry = map.lookup(&launchOpBody.front());
+ Operation *terminator = clonedLaunchOpEntry->getTerminator();
+ OpBuilder replacer(terminator);
+ replacer.create<gpu::ReturnOp>(terminator->getLoc());
+ terminator->erase();
+ entryBlock.getOperations().splice(entryBlock.getOperations().end(),
+ clonedLaunchOpEntry->getOperations());
+ clonedLaunchOpEntry->erase();
+
return outlinedFunc;
}
diff --git a/mlir/test/Dialect/GPU/outlining.mlir b/mlir/test/Dialect/GPU/outlining.mlir
index 601add9a9f91c0..4c7de4f1982123 100644
--- a/mlir/test/Dialect/GPU/outlining.mlir
+++ b/mlir/test/Dialect/GPU/outlining.mlir
@@ -54,8 +54,6 @@ func.func @launch() {
// CHECK-NEXT: %[[BDIM:.*]] = gpu.block_dim x
// CHECK-NEXT: = gpu.block_dim y
// CHECK-NEXT: = gpu.block_dim z
-// CHECK-NEXT: cf.br ^[[BLOCK:.*]]
-// CHECK-NEXT: ^[[BLOCK]]:
// CHECK-NEXT: "use"(%[[KERNEL_ARG0]]) : (f32) -> ()
// CHECK-NEXT: "some_op"(%[[BID]], %[[BDIM]]) : (index, index) -> ()
// CHECK-NEXT: = memref.load %[[KERNEL_ARG1]][%[[TID]]] : memref<?xf32, 1>
@@ -475,8 +473,6 @@ func.func @launch_cluster() {
// CHECK-NEXT: %[[CDIM:.*]] = gpu.cluster_dim x
// CHECK-NEXT: = gpu.cluster_dim y
// CHECK-NEXT: = gpu.cluster_dim z
-// CHECK-NEXT: cf.br ^[[BLOCK:.*]]
-// CHECK-NEXT: ^[[BLOCK]]:
// CHECK-NEXT: "use"(%[[KERNEL_ARG0]]) : (f32) -> ()
// CHECK-NEXT: "some_op"(%[[CID]], %[[BID]], %[[BDIM]]) : (index, index, index) -> ()
// CHECK-NEXT: = memref.load %[[KERNEL_ARG1]][%[[TID]]] : memref<?xf32, 1>
|
2b0bf95
to
b59a889
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
b59a889
to
2d03b72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
fyi, a test is failing.
This fixes a TODO in the code. Also add a test with a CFG in the region
2d03b72
to
48cc91b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks for improving that
This fixes a TODO in the code.