[mlir][gpu] Add an offloading handler attribute to gpu.module #78047

Merged
merged 1 commit into from
Jan 15, 2024

Conversation

fabianmcg
Contributor

This patch adds an optional offloading handler attribute to the `gpu.module` op.
This attribute will be used during the `gpu-module-to-binary` pass to override the
offloading handler used in the `gpu.binary` op.
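
To make the selection rule concrete: in the `ModuleToBinary.cpp` hunk of this PR, the module's attribute is consulted only when `handler` is null (the `!handler && moduleHandler` condition). The following is a minimal standalone C++ sketch of that rule, not the MLIR code itself. `Handler` and `resolveHandler` are hypothetical names, and a string stands in for the attribute.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for an offloading handler attribute. A real handler
// is an MLIR attribute, not a string; an empty optional models "null".
using Handler = std::optional<std::string>;

// Mirrors the condition `!handler && moduleHandler` from the diff: the
// module-level attribute is used only when no handler was already supplied.
Handler resolveHandler(const Handler &passHandler,
                       const Handler &moduleHandler) {
  if (!passHandler && moduleHandler)
    return moduleHandler;
  return passHandler;
}
```

Under this sketch, a handler that is already set wins over the module's attribute, and a module without the attribute leaves the existing value (possibly null) unchanged.
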
@fabianmcg fabianmcg requested review from grypp, qcolombet and antiagainst and removed request for qcolombet January 13, 2024 16:28
@fabianmcg fabianmcg marked this pull request as ready for review January 13, 2024 16:28
@llvmbot
Member

llvmbot commented Jan 13, 2024

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-gpu

Author: Fabian Mora (fabianmcg)

Changes

This patch adds an optional offloading handler attribute to the gpu.module op. This attribute will be used during the gpu-module-to-binary pass to override the offloading handler used in the gpu.binary op.


Full diff: https://github.com/llvm/llvm-project/pull/78047.diff

6 Files Affected:

  • (modified) mlir/include/mlir/Dialect/GPU/IR/GPUOps.td (+17-4)
  • (modified) mlir/lib/Dialect/GPU/IR/GPUDialect.cpp (+30-8)
  • (modified) mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp (+5)
  • (modified) mlir/test/Dialect/GPU/invalid.mlir (+7)
  • (modified) mlir/test/Dialect/GPU/module-to-binary-nvvm.mlir (+12)
  • (modified) mlir/test/Dialect/GPU/ops.mlir (+3)
diff --git a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
index 8d4a110ee801f0..228aad3d84629c 100644
--- a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
+++ b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
@@ -1191,7 +1191,9 @@ def GPU_BarrierOp : GPU_Op<"barrier"> {
 def GPU_GPUModuleOp : GPU_Op<"module", [
       DataLayoutOpInterface, HasDefaultDLTIDataLayout, IsolatedFromAbove,
       SymbolTable, Symbol, SingleBlockImplicitTerminator<"ModuleEndOp">
-    ]>, Arguments<(ins OptionalAttr<GPUNonEmptyTargetArrayAttr>:$targets)> {
+    ]>, Arguments<(ins
+          OptionalAttr<GPUNonEmptyTargetArrayAttr>:$targets,
+          OptionalAttr<OffloadingTranslationAttr>:$offloadingHandler)> {
   let summary = "A top level compilation unit containing code to be run on a GPU.";
   let description = [{
     GPU module contains code that is intended to be run on a GPU. A host device
@@ -1212,13 +1214,20 @@ def GPU_GPUModuleOp : GPU_Op<"module", [
     how to transform modules into binary strings and are used by the
     `gpu-module-to-binary` pass to transform modules into GPU binaries.
 
+    Modules can contain an optional `OffloadingTranslationAttr` attribute. This
+    attribute will be used during the `gpu-module-to-binary` pass to specify the
+    `OffloadingTranslationAttr` used when creating the `gpu.binary` operation.
+
     ```
     gpu.module @symbol_name {
       gpu.func {}
         ...
       gpu.module_end
     }
-    gpu.module @symbol_name2 [#nvvm.target, #rocdl.target<chip = "gfx90a">] {
+    // Module with offloading handler and target attributes.
+    gpu.module @symbol_name2 <#gpu.select_object<1>> [
+        #nvvm.target,
+        #rocdl.target<chip = "gfx90a">] {
       gpu.func {}
         ...
       gpu.module_end
@@ -1226,8 +1235,12 @@ def GPU_GPUModuleOp : GPU_Op<"module", [
     ```
   }];
   let builders = [
-    OpBuilder<(ins "StringRef":$name, CArg<"ArrayAttr", "{}">:$targets)>,
-    OpBuilder<(ins "StringRef":$name, "ArrayRef<Attribute>":$targets)>
+    OpBuilder<(ins "StringRef":$name,
+                   CArg<"ArrayAttr", "{}">:$targets,
+                   CArg<"Attribute", "{}">:$handler)>,
+    OpBuilder<(ins "StringRef":$name,
+                   "ArrayRef<Attribute>":$targets,
+                   CArg<"Attribute", "{}">:$handler)>
   ];
   let regions = (region SizedRegion<1>:$bodyRegion);
   let hasCustomAssemblyFormat = 1;
diff --git a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
index 020900934c9f72..514b3e9a6e8a56 100644
--- a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+++ b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
@@ -1724,19 +1724,24 @@ LogicalResult gpu::ReturnOp::verify() {
 //===----------------------------------------------------------------------===//
 
 void GPUModuleOp::build(OpBuilder &builder, OperationState &result,
-                        StringRef name, ArrayAttr targets) {
+                        StringRef name, ArrayAttr targets,
+                        Attribute offloadingHandler) {
   ensureTerminator(*result.addRegion(), builder, result.location);
   result.attributes.push_back(builder.getNamedAttr(
       ::mlir::SymbolTable::getSymbolAttrName(), builder.getStringAttr(name)));
 
+  Properties &props = result.getOrAddProperties<Properties>();
   if (targets)
-    result.getOrAddProperties<Properties>().targets = targets;
+    props.targets = targets;
+  props.offloadingHandler = offloadingHandler;
 }
 
 void GPUModuleOp::build(OpBuilder &builder, OperationState &result,
-                        StringRef name, ArrayRef<Attribute> targets) {
+                        StringRef name, ArrayRef<Attribute> targets,
+                        Attribute offloadingHandler) {
   build(builder, result, name,
-        targets.empty() ? ArrayAttr() : builder.getArrayAttr(targets));
+        targets.empty() ? ArrayAttr() : builder.getArrayAttr(targets),
+        offloadingHandler);
 }
 
 ParseResult GPUModuleOp::parse(OpAsmParser &parser, OperationState &result) {
@@ -1747,6 +1752,16 @@ ParseResult GPUModuleOp::parse(OpAsmParser &parser, OperationState &result) {
                              result.attributes))
     return failure();
 
+  Properties &props = result.getOrAddProperties<Properties>();
+
+  // Parse the optional offloadingHandler
+  if (succeeded(parser.parseOptionalLess())) {
+    if (parser.parseAttribute(props.offloadingHandler))
+      return failure();
+    if (parser.parseGreater())
+      return failure();
+  }
+
   // Parse the optional array of target attributes.
   OptionalParseResult targetsAttrResult =
       parser.parseOptionalAttribute(targetsAttr, Type{});
@@ -1754,7 +1769,7 @@ ParseResult GPUModuleOp::parse(OpAsmParser &parser, OperationState &result) {
     if (failed(*targetsAttrResult)) {
       return failure();
     }
-    result.getOrAddProperties<Properties>().targets = targetsAttr;
+    props.targets = targetsAttr;
   }
 
   // If module attributes are present, parse them.
@@ -1775,15 +1790,22 @@ void GPUModuleOp::print(OpAsmPrinter &p) {
   p << ' ';
   p.printSymbolName(getName());
 
+  if (Attribute attr = getOffloadingHandlerAttr()) {
+    p << " <";
+    p.printAttribute(attr);
+    p << ">";
+  }
+
   if (Attribute attr = getTargetsAttr()) {
     p << ' ';
     p.printAttribute(attr);
     p << ' ';
   }
 
-  p.printOptionalAttrDictWithKeyword(
-      (*this)->getAttrs(),
-      {mlir::SymbolTable::getSymbolAttrName(), getTargetsAttrName()});
+  p.printOptionalAttrDictWithKeyword((*this)->getAttrs(),
+                                     {mlir::SymbolTable::getSymbolAttrName(),
+                                      getTargetsAttrName(),
+                                      getOffloadingHandlerAttrName()});
   p << ' ';
   p.printRegion(getRegion(), /*printEntryBlockArgs=*/false,
                 /*printBlockTerminators=*/false);
diff --git a/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp b/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
index 70d36297e103f3..0527073da85b69 100644
--- a/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
+++ b/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
@@ -124,6 +124,11 @@ LogicalResult moduleSerializer(GPUModuleOp op,
     }
     objects.push_back(object);
   }
+  if (auto moduleHandler =
+          dyn_cast_or_null<OffloadingLLVMTranslationAttrInterface>(
+              op.getOffloadingHandlerAttr());
+      !handler && moduleHandler)
+    handler = moduleHandler;
   builder.setInsertionPointAfter(op);
   builder.create<gpu::BinaryOp>(op.getLoc(), op.getName(), handler,
                                 builder.getArrayAttr(objects));
diff --git a/mlir/test/Dialect/GPU/invalid.mlir b/mlir/test/Dialect/GPU/invalid.mlir
index 4d3a898fdd1565..273bc282b0b3b0 100644
--- a/mlir/test/Dialect/GPU/invalid.mlir
+++ b/mlir/test/Dialect/GPU/invalid.mlir
@@ -818,3 +818,10 @@ func.func @main(%arg0 : index) {
   return
 }
 
+// -----
+
+module attributes {gpu.container_module} {
+  // expected-error@+1 {{expected attribute value}}
+  gpu.module @kernel <> {
+  }
+}
diff --git a/mlir/test/Dialect/GPU/module-to-binary-nvvm.mlir b/mlir/test/Dialect/GPU/module-to-binary-nvvm.mlir
index 05e368f7a642e6..c286c8bc9042ff 100644
--- a/mlir/test/Dialect/GPU/module-to-binary-nvvm.mlir
+++ b/mlir/test/Dialect/GPU/module-to-binary-nvvm.mlir
@@ -22,4 +22,16 @@ module attributes {gpu.container_module} {
       llvm.return
     }
   }
+
+  // CHECK-LABEL:gpu.binary @kernel_module3 <#gpu.select_object<1 : i64>>
+  // CHECK:[#gpu.object<#nvvm.target<chip = "sm_70">, offload = "{{.*}}">, #gpu.object<#nvvm.target<chip = "sm_80">, offload = "{{.*}}">]
+  gpu.module @kernel_module3 <#gpu.select_object<1>> [
+      #nvvm.target<chip = "sm_70">,
+      #nvvm.target<chip = "sm_80">] {
+    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr,
+        %arg2: !llvm.ptr, %arg3: i64, %arg4: i64,
+        %arg5: i64) attributes {gpu.kernel} {
+      llvm.return
+    }
+  }
 }
diff --git a/mlir/test/Dialect/GPU/ops.mlir b/mlir/test/Dialect/GPU/ops.mlir
index 60512424383052..488fa7aaf6adca 100644
--- a/mlir/test/Dialect/GPU/ops.mlir
+++ b/mlir/test/Dialect/GPU/ops.mlir
@@ -412,3 +412,6 @@ gpu.module @module_with_two_target [#nvvm.target, #rocdl.target<chip = "gfx90a">
     gpu.return
   }
 }
+
+gpu.module @module_with_offload_handler <#gpu.select_object<0>> [#nvvm.target] {
+}
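
The custom assembly syntax added in `GPUModuleOp::parse` (an optional `<handler>` group between the symbol name and the target array) can be illustrated with a toy parser. This is only a sketch in plain C++ that balances angle brackets; it does not use or reproduce the MLIR `OpAsmParser` API, and `ParsedHandler` and `parseOptionalHandler` are hypothetical names.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

struct ParsedHandler {
  bool ok = true;                     // false on a malformed `<...>` group
  std::optional<std::string> handler; // text between the outer angle brackets
  std::size_t next = 0;               // index of the first unconsumed character
};

// Parses an optional `<attr>` group starting at `pos` in `text`.
// Toy model only: it balances angle brackets and does not parse attributes.
ParsedHandler parseOptionalHandler(std::string_view text, std::size_t pos) {
  ParsedHandler result;
  while (pos < text.size() && text[pos] == ' ')
    ++pos;
  result.next = pos;
  // No `<` here: the handler is simply absent, which is not an error.
  if (pos >= text.size() || text[pos] != '<')
    return result;
  std::size_t depth = 0;
  for (std::size_t i = pos; i < text.size(); ++i) {
    if (text[i] == '<') {
      ++depth;
    } else if (text[i] == '>' && --depth == 0) {
      std::string_view body = text.substr(pos + 1, i - pos - 1);
      // An empty group `<>` is rejected, in the spirit of the invalid.mlir case.
      if (body.empty()) {
        result.ok = false;
        return result;
      }
      result.handler = std::string(body);
      result.next = i + 1;
      return result;
    }
  }
  result.ok = false; // the group was never closed
  return result;
}
```

For the header text `<#gpu.select_object<1>> [#nvvm.target]`, the sketch yields the handler text `#gpu.select_object<1>` and leaves the target array unconsumed. An empty `<>` is reported as malformed, and a header that starts directly with the target array parses with no handler.
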

@grypp
Member

grypp commented Jan 15, 2024

Code looks well organized. Just to confirm, does `gpu-module-to-binary` serialize only the selected objects in your examples?

We don't have any example other than `#gpu.select_object`, right?

@fabianmcg
Contributor Author

> We don't have any example other than `#gpu.select_object`, right?

You're correct, upstream we don't have more examples.

However, I'm working on adding another one in #78117. The new one allows using the CUDA runtime to launch kernels, and it starts adding support for OMP target offload compilation with GPU.

Member

@grypp grypp left a comment

Looks good, thanks for the explanation.

@fabianmcg fabianmcg merged commit 5b4f2b9 into llvm:main Jan 15, 2024
@fabianmcg fabianmcg deleted the gpu-module-handler branch January 15, 2024 22:06
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…#78047)

This patch adds an optional offloading handler attribute to
the `gpu.module` op. This attribute will be used during the
`gpu-module-to-binary` pass to override the offloading handler used in
the `gpu.binary` op.