[MLIR] Improve KernelOutlining to avoid introducing an extra block #90359

joker-eph · 2024-04-27T21:18:49Z

This fixes a TODO in the code.

llvmbot · 2024-04-27T21:19:20Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-gpu

Author: Mehdi Amini (joker-eph)

Changes

This fixes a TODO in the code.

Full diff: https://github.com/llvm/llvm-project/pull/90359.diff

2 Files Affected:

(modified) mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp (+11-16)
(modified) mlir/test/Dialect/GPU/outlining.mlir (-4)

diff --git a/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp b/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
index 2436113dc4239c..03c59baad0f805 100644
--- a/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
+++ b/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
@@ -241,24 +241,19 @@ static gpu::GPUFuncOp outlineKernelFuncImpl(gpu::LaunchOp launchOp,
     map.map(operand.value(), entryBlock.getArgument(operand.index()));
 
   // Clone the region of the gpu.launch operation into the gpu.func operation.
-  // TODO: If cloneInto can be modified such that if a mapping for
-  // a block exists, that block will be used to clone operations into (at the
-  // end of the block), instead of creating a new block, this would be much
-  // cleaner.
   launchOpBody.cloneInto(&outlinedFuncBody, map);
 
-  // Branch from entry of the gpu.func operation to the block that is cloned
-  // from the entry block of the gpu.launch operation.
-  Block &launchOpEntry = launchOpBody.front();
-  Block *clonedLaunchOpEntry = map.lookup(&launchOpEntry);
-  builder.setInsertionPointToEnd(&entryBlock);
-  builder.create<cf::BranchOp>(loc, clonedLaunchOpEntry);
-
-  outlinedFunc.walk([](gpu::TerminatorOp op) {
-    OpBuilder replacer(op);
-    replacer.create<gpu::ReturnOp>(op.getLoc());
-    op.erase();
-  });
+  // Splice now the entry block of the gpu.launch operation at the end of the
+  // gpu.func entry block and erase the redundant block.
+  Block *clonedLaunchOpEntry = map.lookup(&launchOpBody.front());
+  Operation *terminator = clonedLaunchOpEntry->getTerminator();
+  OpBuilder replacer(terminator);
+  replacer.create<gpu::ReturnOp>(terminator->getLoc());
+  terminator->erase();
+  entryBlock.getOperations().splice(entryBlock.getOperations().end(),
+                                    clonedLaunchOpEntry->getOperations());
+  clonedLaunchOpEntry->erase();
+
   return outlinedFunc;
 }
 
diff --git a/mlir/test/Dialect/GPU/outlining.mlir b/mlir/test/Dialect/GPU/outlining.mlir
index 601add9a9f91c0..4c7de4f1982123 100644
--- a/mlir/test/Dialect/GPU/outlining.mlir
+++ b/mlir/test/Dialect/GPU/outlining.mlir
@@ -54,8 +54,6 @@ func.func @launch() {
 // CHECK-NEXT: %[[BDIM:.*]] = gpu.block_dim x
 // CHECK-NEXT: = gpu.block_dim y
 // CHECK-NEXT: = gpu.block_dim z
-// CHECK-NEXT: cf.br ^[[BLOCK:.*]]
-// CHECK-NEXT: ^[[BLOCK]]:
 // CHECK-NEXT: "use"(%[[KERNEL_ARG0]]) : (f32) -> ()
 // CHECK-NEXT: "some_op"(%[[BID]], %[[BDIM]]) : (index, index) -> ()
 // CHECK-NEXT: = memref.load %[[KERNEL_ARG1]][%[[TID]]] : memref<?xf32, 1>
@@ -475,8 +473,6 @@ func.func @launch_cluster() {
 // CHECK-NEXT: %[[CDIM:.*]] = gpu.cluster_dim x
 // CHECK-NEXT: = gpu.cluster_dim y
 // CHECK-NEXT: = gpu.cluster_dim z
-// CHECK-NEXT: cf.br ^[[BLOCK:.*]]
-// CHECK-NEXT: ^[[BLOCK]]:
 // CHECK-NEXT: "use"(%[[KERNEL_ARG0]]) : (f32) -> ()
 // CHECK-NEXT: "some_op"(%[[CID]], %[[BID]], %[[BDIM]]) : (index, index, index) -> ()
 // CHECK-NEXT: = memref.load %[[KERNEL_ARG1]][%[[TID]]] : memref<?xf32, 1>
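
The splice step in the `KernelOutlining.cpp` hunk above can be illustrated outside of MLIR. The following is a minimal sketch, assuming `std::list` as a stand-in for MLIR's operation list. `ToyBlock`, `ToyRegion`, `mergeIntoEntry`, and `makeExample` are hypothetical names invented for this illustration and are not part of the MLIR API. It mirrors the three steps of the patch: rewrite the terminator, splice the operations onto the end of the entry block, and erase the emptied block.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Toy stand-in for an MLIR block: an ordered list of operation names.
// (Hypothetical type for illustration; not the real mlir::Block.)
struct ToyBlock {
  std::list<std::string> ops;
};

// Toy stand-in for a region: an ordered list of blocks, entry block first.
using ToyRegion = std::list<ToyBlock>;

// Merge the block at `cloned` into the entry block of `region`:
//   1. rewrite its trailing "gpu.terminator" into "gpu.return",
//   2. splice all of its operations onto the end of the entry block,
//   3. erase the now-empty block from the region.
void mergeIntoEntry(ToyRegion &region, ToyRegion::iterator cloned) {
  ToyBlock &entry = region.front();
  assert(!cloned->ops.empty() && cloned->ops.back() == "gpu.terminator");
  cloned->ops.back() = "gpu.return";
  // std::list::splice relinks the nodes; no operation is copied.
  entry.ops.splice(entry.ops.end(), cloned->ops);
  region.erase(cloned);
}

// Build a two-block region shaped like the state right after cloning:
// an entry block holding the index computations, followed by the cloned
// body of the launch, which still ends in "gpu.terminator".
ToyRegion makeExample() {
  ToyRegion region;
  region.push_back(ToyBlock{{"gpu.thread_id x", "gpu.block_dim x"}});
  region.push_back(ToyBlock{{"use", "some_op", "gpu.terminator"}});
  return region;
}
```

Calling `mergeIntoEntry(region, std::next(region.begin()))` on the result of `makeExample()` leaves a single block whose operations are the entry-block prologue followed by the cloned body, ending in `gpu.return`. No branch between blocks is needed, which corresponds to the `cf.br` lines dropped from `outlining.mlir`.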

github-actions · 2024-04-28T13:19:22Z

✅ With the latest revision this PR passed the C/C++ code formatter.

fabianmcg

LGTM.

fyi, a test is failing.

grypp

Nice, thanks for improving that

joker-eph requested review from grypp and fabianmcg April 27, 2024 21:18

llvmbot added mlir:gpu mlir labels Apr 27, 2024

fabianmcg reviewed Apr 27, 2024

joker-eph force-pushed the kernel-outlining branch from 2b0bf95 to b59a889 Compare April 28, 2024 13:16

joker-eph force-pushed the kernel-outlining branch from b59a889 to 2d03b72 Compare April 28, 2024 13:32

fabianmcg self-requested a review April 28, 2024 13:46

fabianmcg approved these changes Apr 28, 2024

[MLIR] Improve KernelOutlining to avoid introducing an extra block

48cc91b

This fixes a TODO in the code. Also add a test with a CFG in the region

joker-eph force-pushed the kernel-outlining branch from 2d03b72 to 48cc91b Compare April 28, 2024 15:22

grypp approved these changes Apr 29, 2024

joker-eph merged commit d566a5c into llvm:main Apr 29, 2024
4 checks passed
