[MLIR][Conversion] Add new option "use-opencl" for "convert-gpu-to-spirv" conversion pass #66445


Closed

Conversation

silee2 (Contributor) commented Sep 14, 2023

The current convert-gpu-to-spirv pass hardcodes the memory-space-to-storage-class map as spirv::mapMemorySpaceToVulkanStorageClass. This patch adds a new option, use-opencl, that switches the map to spirv::mapMemorySpaceToOpenCLStorageClass.

use-opencl defaults to false, so spirv::mapMemorySpaceToVulkanStorageClass remains the default memory-space-to-storage-class map. When use-opencl is set to true, spirv::mapMemorySpaceToOpenCLStorageClass is used instead.
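
For example, the new option can be enabled alongside the existing 64-bit-index option on the mlir-opt command line, as in the test added by this patch:

// RUN: mlir-opt -convert-gpu-to-spirv='use-64bit-index=true use-opencl=true' %s -o - | FileCheck %s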

"convert-gpu-to-spirv" conversion pass

Current convert-gpu-to-spirv pass has memory space to storage class map
hardcoded as spirv::mapMemorySpaceToVulkanStorageClass
A new option use-opencl is added to control changing that value to
spirv::mapMemorySpaceToOpenCLStorageClass

use-opencl's default value is false and
spirv::mapMemorySpaceToVulkanStorageClass is set as the default
memory space to storage class map

when use-opencl is set to true,
spirv::mapMemorySpaceToOpenCLStorageClass is used instead
silee2 (Contributor, Author) commented Sep 14, 2023

This PR is one part of breaking #65539 down into smaller ones.

Comment on lines +59 to +65
// CHECK: %[[OFFSET1_0:.*]] = spirv.Constant 0 : i64
// CHECK: %[[STRIDE1_1:.*]] = spirv.Constant 4 : i64
// CHECK: %[[UPDATE1_1:.*]] = spirv.IMul %[[STRIDE1_1]], %[[INDEX1]] : i64
// CHECK: %[[OFFSET1_1:.*]] = spirv.IAdd %[[OFFSET1_0]], %[[UPDATE1_1]] : i64
// CHECK: %[[STRIDE1_2:.*]] = spirv.Constant 1 : i64
// CHECK: %[[UPDATE1_2:.*]] = spirv.IMul %[[STRIDE1_2]], %[[INDEX2]] : i64
// CHECK: %[[OFFSET1_2:.*]] = spirv.IAdd %[[OFFSET1_1]], %[[UPDATE1_2]] : i64
Member

Do we care about the exact offset? This check sequence seems verbose and maybe even fragile (it assumes an exact order of constants and may be broken by unrelated changes in the lowering patterns).

Contributor Author

@kuhar This is the same example as load-store.mlir in the same folder, which tests the Vulkan lowering. I agree that the check sequence is fragile. @antiagainst What do you think? Do you think a separate test for checking the code sequence generated for OpenCL kernels is needed? If not, I can replace this test with a smaller one that just checks CrossWorkgroup attribute handling.

Member

Yeah, it would be nice to avoid the excessive checks here. The goal here is more about testing the storage class mapping; checks for indexing and such should be in MemRefToSPIRV. (The Vulkan side was added early on, before things were properly structured, so it doesn't exactly follow that.)
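
For reference, a minimal sketch of what a storage-class-only check might look like, built from CHECK lines already present in this test (a hypothetical reduction, not part of this patch):

// Only verify that the OpenCL mapping produces CrossWorkgroup pointers and
// CrossWorkgroup memory accesses; leave the index arithmetic to MemRefToSPIRV tests.
// CHECK-LABEL: spirv.func @load_store_kernel
// CHECK-SAME:  !spirv.ptr<!spirv.array<48 x f32>, CrossWorkgroup>
// CHECK:       spirv.Load "CrossWorkgroup" %{{.*}} : f32
// CHECK:       spirv.Store "CrossWorkgroup" %{{.*}}, %{{.*}} : f32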

Comment on lines +71 to +76
%15 = memref.load %arg1[%12, %13] : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>
// CHECK: %[[VAL3:.*]] = spirv.FAdd %[[VAL1]], %[[VAL2]] : f32
%16 = arith.addf %14, %15 : f32
// CHECK: %[[PTR3:.*]] = spirv.AccessChain %[[ARG2]]{{\[}}{{%.*}}{{\]}}
// CHECK: spirv.Store "CrossWorkgroup" %[[PTR3]], %[[VAL3]] : f32
memref.store %16, %arg2[%12, %13] : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>
Member

Could we split it up into multiple smaller test cases, similar to those in https://github.com/llvm/llvm-project/blob/main/mlir/test/Conversion/GPUToSPIRV/wmma-ops-to-spirv-nv-coop-matrix.mlir?

Member

+1 to small dedicated tests.

"Use 64-bit integers to convert index types">,
Option<"useOpenCL", "use-opencl",
"bool", /*default=*/"false",
"Use OpenCL instead of Vulkan">
Member

Maybe make it clearer what this means? Like "Use memory space mappings for OpenCL instead of the default Vulkan".

spirv.target_env = #spirv.target_env<
#spirv.vce<v1.0, [Kernel, Int64, Addresses], []>, api=OpenCL, #spirv.resource_limits<>>
} {
func.func @load_store(%arg0: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg1: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg2: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>) {
Member

Shouldn't this use GPU memory space attributes or numeric memory spaces as input, so we can check that the conversion really goes down the OpenCL mapping?
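
For illustration, a hypothetical kernel along those lines (not part of this patch) would start from the GPU address space attribute and let the pass resolve the storage class; with use-opencl=true, the global address space is expected to map to CrossWorkgroup:

// Hypothetical input using #gpu.address_space<global> instead of a
// pre-resolved #spirv.storage_class attribute; assumes the same module
// attributes and RUN line as the test above.
gpu.func @load_global(%arg0: memref<12x4xf32, #gpu.address_space<global>>) kernel
  attributes {spirv.entry_point_abi = #spirv.entry_point_abi<>} {
  %c0 = arith.constant 0 : index
  // Expected lowering of the argument type under the OpenCL mapping:
  //   !spirv.ptr<!spirv.array<48 x f32>, CrossWorkgroup>
  %0 = memref.load %arg0[%c0, %c0] : memref<12x4xf32, #gpu.address_space<global>>
  gpu.return
}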

llvmbot (Member) commented Sep 26, 2023

@llvm/pr-subscribers-mlir-gpu
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-spirv

Changes

Full diff: https://github.com/llvm/llvm-project/pull/66445.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Conversion/Passes.td (+4-1)
  • (modified) mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRVPass.cpp (+2-1)
  • (added) mlir/test/Conversion/GPUToSPIRV/load-store-opencl.mlir (+81)
diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td
index 3218760931b8cb0..03cf905aaee210c 100644
--- a/mlir/include/mlir/Conversion/Passes.td
+++ b/mlir/include/mlir/Conversion/Passes.td
@@ -567,7 +567,10 @@ def ConvertGPUToSPIRV : Pass<"convert-gpu-to-spirv", "ModuleOp"> {
   let options = [
     Option<"use64bitIndex", "use-64bit-index",
            "bool", /*default=*/"false",
-           "Use 64-bit integers to convert index types">
+           "Use 64-bit integers to convert index types">,
+    Option<"useOpenCL", "use-opencl",
+           "bool", /*default=*/"false",
+           "Use OpenCL instead of Vulkan">
   ];
 }
 
diff --git a/mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRVPass.cpp b/mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRVPass.cpp
index d0ce58597f980d4..cce8bc4f5f97480 100644
--- a/mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRVPass.cpp
+++ b/mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRVPass.cpp
@@ -69,7 +69,8 @@ void GPUToSPIRVPass::runOnOperation() {
       std::unique_ptr<ConversionTarget> target =
           spirv::getMemorySpaceToStorageClassTarget(*context);
       spirv::MemorySpaceToStorageClassMap memorySpaceMap =
-          spirv::mapMemorySpaceToVulkanStorageClass;
+          this->useOpenCL ? spirv::mapMemorySpaceToOpenCLStorageClass :
+              spirv::mapMemorySpaceToVulkanStorageClass;
       spirv::MemorySpaceToStorageClassConverter converter(memorySpaceMap);
 
       RewritePatternSet patterns(context);
diff --git a/mlir/test/Conversion/GPUToSPIRV/load-store-opencl.mlir b/mlir/test/Conversion/GPUToSPIRV/load-store-opencl.mlir
new file mode 100644
index 000000000000000..8cbb47d397e89a1
--- /dev/null
+++ b/mlir/test/Conversion/GPUToSPIRV/load-store-opencl.mlir
@@ -0,0 +1,81 @@
+// RUN: mlir-opt -convert-gpu-to-spirv='use-64bit-index=true use-opencl=true' %s -o - | FileCheck %s
+
+module attributes {
+  gpu.container_module,
+  spirv.target_env = #spirv.target_env<
+    #spirv.vce<v1.0, [Kernel, Int64, Addresses], []>, api=OpenCL, #spirv.resource_limits<>>
+} {
+  func.func @load_store(%arg0: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg1: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg2: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>) {
+    %c0 = arith.constant 0 : index
+    %c12 = arith.constant 12 : index
+    %0 = arith.subi %c12, %c0 : index
+    %c1 = arith.constant 1 : index
+    %c0_0 = arith.constant 0 : index
+    %c4 = arith.constant 4 : index
+    %1 = arith.subi %c4, %c0_0 : index
+    %c1_1 = arith.constant 1 : index
+    %c1_2 = arith.constant 1 : index
+    gpu.launch_func @kernels::@load_store_kernel
+        blocks in (%0, %c1_2, %c1_2) threads in (%1, %c1_2, %c1_2)
+        args(%arg0 : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg1 : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg2 : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>,
+             %c0 : index, %c0_0 : index, %c1 : index, %c1_1 : index)
+    return
+  }
+
+  // CHECK-LABEL: spirv.module @{{.*}} Physical64 OpenCL
+  gpu.module @kernels {
+    // CHECK-DAG: spirv.GlobalVariable @[[WORKGROUPSIZEVAR:.*]] built_in("WorkgroupSize") : !spirv.ptr<vector<3xi64>, Input>
+    // CHECK-DAG: spirv.GlobalVariable @[[NUMWORKGROUPSVAR:.*]] built_in("NumWorkgroups") : !spirv.ptr<vector<3xi64>, Input>
+    // CHECK-DAG: spirv.GlobalVariable @[[$LOCALINVOCATIONIDVAR:.*]] built_in("LocalInvocationId") : !spirv.ptr<vector<3xi64>, Input>
+    // CHECK-DAG: spirv.GlobalVariable @[[$WORKGROUPIDVAR:.*]] built_in("WorkgroupId") : !spirv.ptr<vector<3xi64>, Input>
+    // CHECK-LABEL:    spirv.func @load_store_kernel
+    // CHECK-SAME: %[[ARG0:[^\s]+]]: !spirv.ptr<!spirv.array<48 x f32>, CrossWorkgroup>
+    // CHECK-SAME: %[[ARG1:[^\s]+]]: !spirv.ptr<!spirv.array<48 x f32>, CrossWorkgroup>
+    // CHECK-SAME: %[[ARG2:[^\s]+]]: !spirv.ptr<!spirv.array<48 x f32>, CrossWorkgroup>
+    gpu.func @load_store_kernel(%arg0: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg1: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg2: memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>, %arg3: index, %arg4: index, %arg5: index, %arg6: index) kernel
+      attributes {gpu.known_block_size = array<i32: 1, 1, 1>, gpu.known_grid_size = array<i32: 16, 1, 1>, spirv.entry_point_abi = #spirv.entry_point_abi<>} {
+      // CHECK: %[[ADDRESSWORKGROUPID:.*]] = spirv.mlir.addressof @[[$WORKGROUPIDVAR]]
+      // CHECK: %[[WORKGROUPID:.*]] = spirv.Load "Input" %[[ADDRESSWORKGROUPID]]
+      // CHECK: %[[WORKGROUPIDX:.*]] = spirv.CompositeExtract %[[WORKGROUPID]]{{\[}}0 : i32{{\]}}
+      // CHECK: %[[ADDRESSLOCALINVOCATIONID:.*]] = spirv.mlir.addressof @[[$LOCALINVOCATIONIDVAR]]
+      // CHECK: %[[LOCALINVOCATIONID:.*]] = spirv.Load "Input" %[[ADDRESSLOCALINVOCATIONID]]
+      // CHECK: %[[LOCALINVOCATIONIDX:.*]] = spirv.CompositeExtract %[[LOCALINVOCATIONID]]{{\[}}0 : i32{{\]}}
+      %0 = gpu.block_id x
+      %1 = gpu.block_id y
+      %2 = gpu.block_id z
+      %3 = gpu.thread_id x
+      %4 = gpu.thread_id y
+      %5 = gpu.thread_id z
+      %6 = gpu.grid_dim x
+      %7 = gpu.grid_dim y
+      %8 = gpu.grid_dim z
+      %9 = gpu.block_dim x
+      %10 = gpu.block_dim y
+      %11 = gpu.block_dim z
+      // CHECK: %[[INDEX1:.*]] = spirv.IAdd
+      %12 = arith.addi %arg3, %0 : index
+      // CHECK: %[[INDEX2:.*]] = spirv.IAdd
+      %13 = arith.addi %arg4, %3 : index
+      // CHECK: %[[OFFSET1_0:.*]] = spirv.Constant 0 : i64
+      // CHECK: %[[STRIDE1_1:.*]] = spirv.Constant 4 : i64
+      // CHECK: %[[UPDATE1_1:.*]] = spirv.IMul %[[STRIDE1_1]], %[[INDEX1]] : i64
+      // CHECK: %[[OFFSET1_1:.*]] = spirv.IAdd %[[OFFSET1_0]], %[[UPDATE1_1]] : i64
+      // CHECK: %[[STRIDE1_2:.*]] = spirv.Constant 1 : i64
+      // CHECK: %[[UPDATE1_2:.*]] = spirv.IMul %[[STRIDE1_2]], %[[INDEX2]] : i64
+      // CHECK: %[[OFFSET1_2:.*]] = spirv.IAdd %[[OFFSET1_1]], %[[UPDATE1_2]] : i64
+      // CHECK: %[[PTR1:.*]] = spirv.AccessChain %[[ARG0]]{{\[}}%[[OFFSET1_2]]{{\]}} : !spirv.ptr<!spirv.array<48 x f32>, CrossWorkgroup>, i64
+      // CHECK-NEXT: %[[VAL1:.*]] = spirv.Load "CrossWorkgroup" %[[PTR1]] : f32
+      %14 = memref.load %arg0[%12, %13] : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>
+      // CHECK: %[[PTR2:.*]] = spirv.AccessChain %[[ARG1]]{{\[}}{{%.*}}{{\]}}
+      // CHECK-NEXT: %[[VAL2:.*]] = spirv.Load "CrossWorkgroup" %[[PTR2]]
+      %15 = memref.load %arg1[%12, %13] : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>
+      // CHECK: %[[VAL3:.*]] = spirv.FAdd %[[VAL1]], %[[VAL2]] : f32
+      %16 = arith.addf %14, %15 : f32
+      // CHECK: %[[PTR3:.*]] = spirv.AccessChain %[[ARG2]]{{\[}}{{%.*}}{{\]}}
+      // CHECK: spirv.Store "CrossWorkgroup" %[[PTR3]], %[[VAL3]] : f32
+      memref.store %16, %arg2[%12, %13] : memref<12x4xf32, #spirv.storage_class<CrossWorkgroup>>
+      // CHECK: spirv.Return
+      gpu.return
+    }
+  }
+}

silee2 (Contributor, Author) commented Oct 23, 2023

@kuhar @antiagainst Replacing this PR with #69941, since a new pass flag was not required for OpenCL kernel preparation in the GPU compilation pipeline.

kuhar closed this Oct 24, 2023