[MLIR] Enabling Intel GPU Integration. #65539
Conversation
Add CMake option MLIR_ENABLE_SYCL_RUNNER
@rengolin @joker-eph @Hardcode84 FYI: this PR enables the current GPU dialect on Intel GPUs as is, without introducing a stream/queue into the current GPU dialect.
I suggest extracting …
Agree. And the changes are originally from different authors, so they should be split.
LGTM overall. This should likely get reviewed by @antiagainst / @kuhar; and it'd be great if you could split the independent changes and send them ahead of the e2e integration.
  return spvModName->consume_front("__spv__") && spvModName == name;
};
auto spvMods = mod.getOps<spirv::ModuleOp>();
auto it = llvm::find_if(spvMods, isSameMod);
This is really costly, can you build a symbol table once at the beginning of the function and use it for the queries instead?
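A minimal sketch of what this suggests, assuming the SPIR-V module is named "__spv__" + the gpu.module name, as the predicate above implies (a sketch against MLIR's SymbolTable API, not the patch's code):

```cpp
#include "mlir/IR/SymbolTable.h"

// Build the symbol table once per top-level module (O(n)); each lookup is
// then a hash-map query instead of a linear scan over all spirv::ModuleOp
// ops for every gpu.module.
mlir::SymbolTable symbolTable(mod);
// For each gpu.module with name `name`:
auto spvMod = symbolTable.lookup<mlir::spirv::ModuleOp>(
    (llvm::Twine("__spv__") + name).str());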
      spvMod->erase();
    }
  }
};
@fabianmcg : how would that fit in the new serialization flow?
GpuSerializeToSPIRVPass can definitely be integrated as a SPIR-V target attribute. Basically, GpuSerializeToSPIRVPass::run would be the body of SPIRVTargetAttr::serializeToObject. However, we might need to modify the signature of serializeToObject to include a symbol table to make the SPIR-V serialization efficient; then serializeToObject could query the SPIR-V module name directly. With that, gpu-module-to-binary would be able to take care of the serialization, as it does for NVIDIA & AMD.
Then we also need to modify SelectObjectAttr. I think the only modification needed is adding the extra parameters in mgpuModuleLoad, etc. Another option is creating a dedicated SPIRVObjectAttr, but so far I don't think it's needed.
Overall, I think those are the only two changes we would need to integrate this patch into the new workflow.
Awesome! @silee2 : can you adjust in this direction? I'm sure Fabian can support you if you have more questions!
@silee2 here are the steps:
- Implement a target attribute, see for example: NVVMTargetAttr. The idea of this attribute is to hold properties intrinsic to the target, like triple, chip, flags, etc.
- Add a pass to attach the target to a module, see: GpuNVVMAttachTarget and Dialect/GPU/Transforms/NVVMAttachTarget.cpp. The idea of this pass is to attach the SPIRV target to GPU modules, so it must know how to create them.
- We're currently implementing TargetAttrs as external models to keep libraries separated (see NVVM/Target.cpp), so GpuSerializeToSPIRVPass::run would live there; a rough sketch follows this list.
- Modify getModuleLoadFn & createKernelLaunch appropriately in SelectObjectAttr.cpp#L125-L15 instead of adding the changes in GPUToLLVMCommon.
- Then the compilation workflow should look something similar to this: GPU: Compilation Overview.

I'll take care of adding a pointer to the top module symbol table so it can be used by the SPIRVTarget.

If you have any questions just ping me on Discord or Discourse, @fabianmc.
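The external-model registration mentioned in the third step would look roughly like this, mirroring NVVM/Target.cpp. Every SPIR-V-specific name below (SPIRVTargetAttr, SPIRVTargetAttrImpl, the registration function) is an assumption about code that does not exist yet:

```cpp
namespace {
// Fallback model so the GPU library doesn't link against the SPIR-V target.
struct SPIRVTargetAttrImpl
    : public mlir::gpu::TargetAttrInterface::FallbackModel<SPIRVTargetAttrImpl> {
  // The body would essentially be today's GpuSerializeToSPIRVPass::run.
  std::optional<llvm::SmallVector<char, 0>>
  serializeToObject(mlir::Attribute attr, mlir::Operation *module,
                    const mlir::gpu::TargetOptions &options) const;
};
} // namespace

// Attach the interface lazily, only when the SPIR-V dialect is loaded.
void registerSPIRVTargetInterfaceExternalModels(mlir::DialectRegistry &registry) {
  registry.addExtension(+[](mlir::MLIRContext *ctx,
                            mlir::spirv::SPIRVDialect *dialect) {
    SPIRVTargetAttr::attachInterface<SPIRVTargetAttrImpl>(*ctx);
  });
}
```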
@@ -71,7 +71,8 @@ void GPUToSPIRVPass::runOnOperation() {
   std::unique_ptr<ConversionTarget> target =
       spirv::getMemorySpaceToStorageClassTarget(*context);
   spirv::MemorySpaceToStorageClassMap memorySpaceMap =
-      spirv::mapMemorySpaceToVulkanStorageClass;
+      this->useOpenCL ? spirv::mapMemorySpaceToOpenCLStorageClass :
+                        spirv::mapMemorySpaceToVulkanStorageClass;
Can you point out which unit-test is covering this? Can you send this in a separate PR?
Created #66445, which includes the code change and a unit test covering it.
@@ -1158,7 +1178,7 @@ LogicalResult ConvertLaunchFuncOpToGpuRuntimeCallPattern::matchAndRewrite(
       {function.getResult(), adaptor.getGridSizeX(), adaptor.getGridSizeY(),
        adaptor.getGridSizeZ(), adaptor.getBlockSizeX(), adaptor.getBlockSizeY(),
        adaptor.getBlockSizeZ(), dynamicSharedMemorySize, stream, kernelParams,
-       /*extra=*/nullpointer});
+       /*extra=*/nullpointer, paramsCount});
I see a unit test for this, but can you send everything related to paramsCount in a separate PR?
   Value allocatedPtr =
-      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+          .getResult();
I see a unit test for this, but can you send everything related to isHostShared in a separate PR?
Regarding host_shared, I noticed this code in the examples:

%memref, %asyncToken = gpu.alloc async [%0] host_shared (): memref<3x3xi64>

Can SYCL's runtime allocate host_shared data asynchronously? It might be a good idea to prevent the use of host_shared and async together. FWIW, CUDA and HIP cannot do that. As far as I can see from the PR, the async queue is not used in the SYCL runtime when allocating host_shared, so I am guessing there is no async allocation here either.

Nonetheless, having async on gpu.alloc is perfectly acceptable. CUDA does support asynchronous device memory allocation.
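For reference, the asynchronous CUDA allocation referred to here is the stream-ordered allocator; a minimal host-side sketch, where `sizeBytes` and `stream` are assumed to exist in the surrounding code:

```cpp
// CUDA's stream-ordered allocator (CUDA 11.2+): the allocation and free are
// ordered on the stream, which is what an async gpu.alloc can map to.
void *devPtr = nullptr;
cudaMallocAsync(&devPtr, sizeBytes, stream);
// ... enqueue kernels that use devPtr on the same stream ...
cudaFreeAsync(devPtr, stream);
```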
Technically, SYCL provides more abstract memory management with sycl::buffer and sycl::accessor, which define an implicit asynchronous task graph. The allocation details are left to the implementation: whether allocation is asynchronous or synchronous is up to the implementers.

Here the lower-level synchronous USM memory management API of SYCL is used instead, similar to CUDA/HIP memory management.
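For readers unfamiliar with the buffer/accessor model described above, a small self-contained illustration (the queue, element type, and sizes are illustrative, not from the patch):

```cpp
#include <sycl/sycl.hpp>

int main() {
  sycl::queue q;
  sycl::buffer<int, 1> buf{sycl::range<1>{9}};
  // The runtime derives the task graph from accessor dependencies; when and
  // how the backing memory is allocated is left to the implementation.
  q.submit([&](sycl::handler &cgh) {
    sycl::accessor acc{buf, cgh, sycl::write_only};
    cgh.parallel_for(sycl::range<1>{9},
                     [=](sycl::id<1> i) { acc[i] = static_cast<int>(i[0]); });
  });
  // Reading through a host_accessor waits for the kernel automatically.
  sycl::host_accessor host{buf, sycl::read_only};
  return host[0];
}
```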
So, should the async allocation in the example be synchronous instead?
Technically, SYCL provides more abstract memory management with sycl::buffer and sycl::accessor, which define an implicit asynchronous task graph. The allocation details are left to the implementation: whether allocation is asynchronous or synchronous is up to the implementers.

I haven't touched SYCL much; thanks for the explanation. Creating a task graph implicitly sounds interesting. In this case, SYCL users are aware of asynchrony while writing their program; in CUDA (or HIP), users choose sync or async execution.

Here the lower-level synchronous USM memory management API of SYCL is used instead, similar to CUDA/HIP memory management.

Yes, that's correct. I don't think there is a USM API that can allocate asynchronously.
So, should the async allocation in the example be synchronous instead?
Yes, I think this is the correct behaviour. We can disallow host_shared and async together on the op; see the verifier sketch after the examples below.
Here are the possible IRs:
// Valid
%memref = gpu.alloc host_shared (): memref<3x3xi64>
// Valid
%memref = gpu.alloc (): memref<3x3xi64>
// Invalid, USM managers don't allocate async
%memref, %asyncToken = gpu.alloc async [%0] host_shared (): memref<3x3xi64>
// Valid, only for CUDA. Afaik, SYCL or HIP cannot do that
%memref, %asyncToken = gpu.alloc async [%0] (): memref<3x3xi64>
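A sketch of the verifier this would imply on gpu.alloc; the accessor names follow MLIR's generated-accessor conventions and are assumptions until the check actually lands:

```cpp
LogicalResult gpu::AllocOp::verify() {
  // Reject the invalid combination from the examples above: host_shared
  // USM allocations are synchronous, so they cannot produce async tokens.
  if (getHostShared() && getAsyncToken())
    return emitOpError("host_shared allocation cannot be async");
  return success();
}
```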
The upstream GPUToLLVMConversion lowering does not support lowering gpu.alloc ops that are not async:
https://github.com/llvm/llvm-project/blob/main/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp#L797
I guess that if the runtime actually uses synchronous allocation behind the scenes and produces an always-ready async token, it works, even if it is not optimal.
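Concretely, the wrapper can allocate synchronously via USM, and the "always-ready token" falls out of the lowering because nothing is enqueued on the queue. A sketch following the mgpu* naming convention of the CUDA/ROCm wrappers; the SYCL body here is an assumption, not the patch's exact code:

```cpp
#include <sycl/sycl.hpp>
#include <cstdint>

extern "C" void *mgpuMemAlloc(uint64_t sizeBytes, sycl::queue *queue,
                              bool isHostShared) {
  // Both USM calls below are synchronous; the async token produced by
  // gpu.alloc simply has no pending work associated with it.
  if (isHostShared)
    return sycl::malloc_shared(sizeBytes, *queue);
  return sycl::malloc_device(sizeBytes, *queue);
}
```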
The upstream GPUToLLVMConversion lowering does not support lowering gpu.alloc ops that are not async.

Would that work if we omitted that check when host_shared is present?
Quite interesting!
At some point it would be nice to have a design document or documentation somewhere explaining how all these MLIR runners work, including this one.
Globally, this PR adds a SYCL runner, but it is very specific to Intel Level Zero.
It would be nice to have some generalization in the future, like SYCL using the OpenCL interoperability interface to run the SPIR-V kernels, or even native kernels.
@@ -116,6 +116,7 @@ add_definitions(-DMLIR_ROCM_CONVERSIONS_ENABLED=${MLIR_ENABLE_ROCM_CONVERSIONS})

 set(MLIR_ENABLE_CUDA_RUNNER 0 CACHE BOOL "Enable building the mlir CUDA runner")
 set(MLIR_ENABLE_ROCM_RUNNER 0 CACHE BOOL "Enable building the mlir ROCm runner")
+set(MLIR_ENABLE_SYCL_RUNNER 0 CACHE BOOL "Enable building the mlir Sycl runner")
Please spell SYCL correctly.
Suggested change:
-set(MLIR_ENABLE_SYCL_RUNNER 0 CACHE BOOL "Enable building the mlir Sycl runner")
+set(MLIR_ENABLE_SYCL_RUNNER 0 CACHE BOOL "Enable building the mlir SYCL runner")

One could argue that mlir should be spelled MLIR, but that train seems to have left a long time ago. :-)
// Create global device and context
sycl::device syclDevice = getDefaultDevice();
sycl::context syclContext = sycl::context(syclDevice);
Suggested change:
-sycl::context syclContext = sycl::context(syclDevice);
+sycl::context syclContext { syclDevice };
auto kernel = sycl::make_kernel<sycl::backend::ext_oneapi_level_zero>(
    {kernelBundle, zeKernel}, syclContext);
syclKernel = new sycl::kernel(kernel);
Why this dynamic memory allocation? Most SYCL objects already have reference semantics: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:reference-semantics
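In other words, a sketch of what the reviewer is pointing at, reusing the variables from the quoted patch code: sycl::kernel is a cheap reference-counted handle, so no heap allocation is needed:

```cpp
// sycl::kernel follows SYCL 2020 common reference semantics, so it can be
// stored and returned by value; copying only bumps a reference count.
sycl::kernel kernel = sycl::make_kernel<sycl::backend::ext_oneapi_level_zero>(
    {kernelBundle, zeKernel}, syclContext);
return kernel; // instead of `new sycl::kernel(kernel)`
```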
sycl::nd_range<3> syclNdRange(
    sycl::nd_range<3>(syclGlobalRange, syclLocalRange));
Suggested change:
-sycl::nd_range<3> syclNdRange(
-    sycl::nd_range<3>(syclGlobalRange, syclLocalRange));
+sycl::nd_range<3> syclNdRange(syclGlobalRange, syclLocalRange);
  return syclKernel;
}

static void launchKernel(QUEUE *queue, sycl::kernel *kernel, size_t gridX,
Suggested change:
-static void launchKernel(QUEUE *queue, sycl::kernel *kernel, size_t gridX,
+static void launchKernel(QUEUE queue, sycl::kernel kernel, size_t gridX,

or even use & if you are afraid of relying on the reference semantics of SYCL behind the scenes.
                         size_t blockY, size_t blockZ, size_t sharedMemBytes,
                         void **params, size_t paramsCount) {
  auto syclGlobalRange =
      ::sycl::range<3>(blockZ * gridZ, blockY * gridY, blockX * gridX);
Why are you sometimes using ::sycl and sometimes sycl?
#pragma clang diagnostic pop

struct QUEUE {
Why this spelling? Coding standard?
Why do you need this object? In the end, this looks like a std::optional<sycl::queue>.
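i.e., something along these lines — a sketch of the reviewer's reading, not the patch's code, with syclContext and syclDevice taken from the surrounding file:

```cpp
#include <optional>
#include <sycl/sycl.hpp>

// QUEUE is, in effect, std::optional<sycl::queue>: a sycl::queue already
// holds references to its device and context.
std::optional<sycl::queue> queue;
queue.emplace(syclContext, syclDevice); // construct on first use
```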
@@ -187,4 +187,8 @@ def GpuROCDLAttachTarget: Pass<"rocdl-attach-target", ""> {
   ];
 }

+def GpuSerializeToSPIRVPass : Pass<"gpu-serialize-to-spirv", "ModuleOp"> {
+  let summary = "Serialize spirv dialect to spirv binary";
Suggested change:
-let summary = "Serialize spirv dialect to spirv binary";
+let summary = "Serialize spirv dialect to SPIR-Vbinary";
Missing a space?
#
# Example usage:
#
#   find_package(SyclRuntime)
Shouldn't it be
-# find_package(SyclRuntime)
+# find_package(SYCLRuntime)
everywhere?
auto spvMod = *it;

spvBinary.clear();
// serialize the spv module to spv binary
Suggested change:
-// serialize the spv module to spv binary
+// serialize the SPIR-V module to SPIR-V binary
everywhere.
The idea is to eventually consolidate all runners into one. This PR is just another piece of the puzzle. Once we're all happy with how the runners work, we should common them up using command line options to select the "type" and CMake options to enable particular runner types (depending on the runtimes and hardware available).
Agreed! The SYCL runtime here is just being used to abstract the LevelZero calls, but this work will be helpful when adding a full SYCL runner (actual language extensions and libraries) to other CPUs/GPUs later.
CI failure looks like a Buildkite issue?
Agree. The key point of this PR is not to create yet another runner, but to enable the existing mlir-cpu-runner to run LLVM host code plus device kernels on a SYCL platform. The current implementation uses L0 but could be extended. I wouldn't call this adding a "SYCL runner"; it adds a SYCL backend to mlir-cpu-runner. The term "xxx runner" in MLIR is a bit overused: mlir-cpu-runner doesn't run only on CPU, it runs both the host code and the device kernel, so the name is a bit misleading. We really just need an "mlir-runner" that takes the target platform as a command parameter.
}

#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wglobal-constructors"
I don't think we have this anywhere in the codebase right now, can you remove it and use lazy initialization instead?
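A sketch of the lazy initialization being asked for: function-local statics construct on first use, so no constructor runs at shared-library load time and -Wglobal-constructors stays quiet. getDefaultDevice is the helper from the patch; the wrapper function names are illustrative:

```cpp
static sycl::device &getDefaultSyclDevice() {
  // Constructed on the first call, not at library load time.
  static sycl::device device = getDefaultDevice();
  return device;
}

static sycl::context &getDefaultSyclContext() {
  static sycl::context context{getDefaultSyclDevice()};
  return context;
}
```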
This naming stems from the fact that Vulkan and SPIR-V still have dedicated runners on the model of the original cuda-runner, but I suspect this is just legacy?
What kind of "target-platform" command parameters do you have in mind? (other than what we do now)
The current way mlir-cpu-runner uses the shared library name to indicate the target platform looks good to me: CUDA, ROCm, and SYCL with this PR. Vulkan could be added the same way. mlir-spirv-cpu-runner could be refactored into mlir-opt passes that generate the SPIR-V binary and feed it to mlir-cpu-runner. If we reach that state, the name "mlir-cpu-runner" could be promoted to "mlir-runner". That would clear up a lot of misunderstanding: "mlir-runner" really runs MLIR programs on a target platform, and has little to do with enabling specific language or runtime features of that platform. For example, when we say "SYCL runner", people think it is about running SYCL programs on top of MLIR.
…oduleToBinary (#65797) This patch adds the option of building an optional symbol table for the top operation in the `gpu-module-to-binary` pass. The table is not created by default as most targets don't need it; instead, it is lazily built. The table is passed through a callback in `TargetOptions`. This patch is required to integrate #65539 .
…#66154) This PR is a breakdown of the big PR #65539, which enables Intel GPU integration. In this PR we pass the count of parameters and the size of the GPU binary to the runtime wrappers, since the SyclRuntimeWrappers (which will come in a subsequent PR) require the SPIR-V size for compilation and the number of parameters to iterate over the params.
…#66401) This PR is a breakdown of the big PR #65539, which enables Intel GPU integration. In this PR we pass the hostShared flag to the runtime wrappers (required by SyclRuntimeWrappers, which will come in a subsequent PR) to indicate whether the allocation is done on host-shared GPU memory or device-only memory.
Thanks for breaking down this pull request into various smaller pieces to make it easier to review. I looked at various pieces; LGTM. Looking forward to seeing this supported! :) from @joker-eph:
from @Jianhui-Li:
Yup; it's legacy. +1 to the idea of unifying! I've created #73457 to track this issue and make it more visible. I might not have the bandwidth to work on this; if somebody else is interested, that'd be nice! So I marked it as a "good first issue" (not sure whether I'm pushing the limits of "good first issue", but hoping to get traction there).
Closing as all sub-components have been merged.
MLIR currently doesn't have a GPU runtime for Intel GPUs.
This PR adds the components required to lower from the GPU dialect and execute on Intel GPUs, including a new SYCL runtime wrapper.
List of changes:
This PR is monolithic for now and will be broken down into smaller pieces as needed.
Integration workflow:
The GPU dialect lowering and execution workflow for Intel GPUs added by this PR works as follows. It is similar to the CUDA and ROCm workflows, but uses the spirv dialect for the device-code IR instead of the llvm dialect.
Device code is lowered from gpu.func to the spirv dialect using the existing convert-gpu-to-spirv pass.
Then the gpu-serialize-to-spirv pass serializes the spirv module and attaches the resulting binary blob as an attribute to the gpu.module.
Host code is lowered by the gpu-to-llvm pass, and GPU dialect ops are converted into llvm calls to the various entries in the SYCL runtime wrapper.
The resulting code, host and device, is then entirely in the llvm dialect and is piped into mlir-cpu-runner for JIT execution, with the SYCL runtime wrapper loaded as a shared library.
Limitations:
The SYCL runtime wrapper in this PR is an initial version and does not implement all the entries supported by CUDA and ROCm.
Intel GPUs' shared local memory is not supported.
Bare pointers are used for passing parameters to device kernels, so dynamically shaped memrefs are not supported.