
[mlir][gpu] Add the OffloadEmbeddingAttr offloading translation attr #78117


Draft · wants to merge 5 commits into base: main
Conversation

fabianmcg (Contributor)

This patch adds the offloading translation attribute. This attribute uses the LLVM
offloading infrastructure to embed GPU binaries in the IR. At program start,
the LLVM offloading mechanism registers kernels and variables with the runtime
library: CUDA RT, HIP RT, or LibOMPTarget.

The offloading mechanism relies on the runtime library to dispatch the correct
kernel based on the registered symbols.
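
For the LibOMPTarget case, that start-up registration amounts to a global constructor handing the runtime a descriptor that points at the embedded device image and its offload entries. The C++ sketch below shows the shape of what the translation would emit as LLVM IR; the struct layouts and the `__tgt_register_lib` entry points follow LibOMPTarget's interface around the time of this patch, but treat the details as illustrative rather than the exact output of the attribute.

```cpp
#include <cstddef>
#include <cstdint>

struct __tgt_offload_entry; // Kernel/global records; see the OffloadHandler commit below.

// One embedded GPU binary plus the entries it defines.
struct __tgt_device_image {
  void *ImageStart;                   // Start of the embedded GPU binary.
  void *ImageEnd;                     // One past the end of the binary.
  __tgt_offload_entry *EntriesBegin;  // First offload entry.
  __tgt_offload_entry *EntriesEnd;    // One past the last offload entry.
};

// Descriptor handed to the runtime before main() runs.
struct __tgt_bin_desc {
  int32_t NumDeviceImages;
  __tgt_device_image *DeviceImages;
  __tgt_offload_entry *HostEntriesBegin;
  __tgt_offload_entry *HostEntriesEnd;
};

// Registration hooks provided by LibOMPTarget.
extern "C" void __tgt_register_lib(__tgt_bin_desc *Desc);
extern "C" void __tgt_unregister_lib(__tgt_bin_desc *Desc);

// In real output this descriptor is filled in with the embedded binaries and
// the offload entry array; it is left empty here to keep the sketch self-contained.
static __tgt_bin_desc Descriptor{};

// The translation emits the equivalent of this constructor/destructor pair so
// kernels and globals are visible to the runtime before any launch happens.
__attribute__((constructor)) static void registerOffloadLib() {
  __tgt_register_lib(&Descriptor);
}
__attribute__((destructor)) static void unregisterOffloadLib() {
  __tgt_unregister_lib(&Descriptor);
}
```

The same idea applies to CUDA RT and HIP RT, just with their respective registration entry points and descriptor formats.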

This patch is 3/4 on introducing the OffloadEmbeddingAttr GPU translation
attribute.

Note: Ignore the base commits; those are being reviewed in PRs #78057, #78098,
and #78073.

This patch adds the TargetInfo attribute interface to the set of DLTI
interfaces. Target information attributes provide essential information on the
compilation target. This information includes the target triple identifier, the
target chip identifier, and a string representation of the target features.

This patch also implements the new interface for the NVVM and ROCDL GPU target
attributes.
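
A rough C++ stand-in for the queries such an interface exposes is below. MLIR attribute interfaces are declared in TableGen rather than as plain virtual classes, and the method names here (getTargetTriple, getTargetChip, getTargetFeatures) are illustrative guesses based on the description above, not necessarily the names used in the patch.

```cpp
#include <string>

// Plain C++ stand-in for the TargetInfo attribute interface described above.
// Any attribute implementing it (e.g. the NVVM and ROCDL target attributes)
// would answer these three queries about the compilation target.
class TargetInfoAttrInterface {
public:
  virtual ~TargetInfoAttrInterface() = default;
  // Target triple identifier, e.g. "nvptx64-nvidia-cuda" or "amdgcn-amd-amdhsa".
  virtual std::string getTargetTriple() const = 0;
  // Target chip identifier, e.g. "sm_90" or "gfx90a".
  virtual std::string getTargetChip() const = 0;
  // String representation of the target features, e.g. "+ptx80".
  virtual std::string getTargetFeatures() const = 0;
};
```
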
This patch adds the `OffloadHandler` utility class for creating LLVM offload
entries.
LLVM offload entries hold information on offload symbols; for a GPU kernel, for
example, this includes the host address used to identify the kernel and the
kernel's identifier inside the binary. Arrays of offload entries can be used to
register functions with the CUDA/HIP runtime. LibOMPTarget also uses these
entries to register OMP target offload kernels and variables.
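
For concreteness, an offload entry is roughly the record below. The field layout follows what LLVM's offloading runtimes used around the time of this patch, and the kernel `foo`, its host stub, and the entry itself are hypothetical examples rather than code from the patch.

```cpp
#include <cstddef>
#include <cstdint>

// One offload entry: enough information for the runtime to map a host-side
// handle to the corresponding symbol inside the device binary.
struct __tgt_offload_entry {
  void *addr;     // Host address used to identify the kernel or global.
  char *name;     // Name of the symbol inside the device binary.
  size_t size;    // Size in bytes for globals; 0 for kernels.
  int32_t flags;  // Entry kind and properties.
  int32_t reserved;
};

// Hypothetical example: the entry a GPU kernel `foo` could get. The address of
// a host-side stub stands in for the kernel on the host, and the name is used
// to look the kernel up inside the embedded binary.
static void fooStub() {}
static char fooName[] = "foo";
static __tgt_offload_entry fooEntry = {
    reinterpret_cast<void *>(&fooStub), fooName, /*size=*/0, /*flags=*/0, 0};
```

An array of such entries, together with begin/end pointers, is what gets handed to the CUDA/HIP runtime or LibOMPTarget at registration time.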

This patch is 1/4 on introducing the `OffloadEmbeddingAttr` GPU translation
attribute.
@@ -0,0 +1,61 @@
//===- Offload.h - LLVM Target Offload --------------------------*- C++ -*-===//
Contributor

It seems like this is doing the same kind of work that the OffloadInfoManager is doing. Would it be possible to refactor that to not have to add this class?

fabianmcg (Contributor Author) · Feb 16, 2024

Currently, the OffloadInfoManager creates the entries and adds them to the omp_offloading_entries section. However, the OffloadInfoManager performs no explicit construction of the entry array needed by the binary descriptor. It's the linker's job to implicitly create the array using all the entries in the section.

The problem with this approach is that LLJIT doesn't handle the implicit creation of the array very well. To overcome this limitation of LLJIT, the attribute constructs the entry array explicitly.

In summary, this class can be removed to an extent, but then JIT compilation becomes impossible and a real linker is needed to obtain the final executable.
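
To make the distinction concrete, here is a rough sketch of both schemes. The section and symbol names follow the OpenMP case mentioned above, the entry contents are placeholders, and the copies in the explicit array are only there to keep the sketch short.

```cpp
#include <cstddef>
#include <cstdint>

struct __tgt_offload_entry {
  void *addr;
  char *name;
  size_t size;
  int32_t flags;
  int32_t reserved;
};

// Section-based scheme (roughly what the OffloadInfoManager emits): every
// entry is its own global dropped into a named section...
__attribute__((section("omp_offloading_entries")))
__tgt_offload_entry kernelAEntry{};
__attribute__((section("omp_offloading_entries")))
__tgt_offload_entry kernelBEntry{};

// ...and the begin/end pointers of the binary descriptor are the linker-defined
// bounds of that section, so the entry array only exists after a real link step.
extern __tgt_offload_entry __start_omp_offloading_entries[];
extern __tgt_offload_entry __stop_omp_offloading_entries[];

// Explicit scheme (roughly what the attribute does here): the entry array is
// materialized directly in the IR, so its bounds are ordinary globals and no
// linker support is required to form the array.
__tgt_offload_entry explicitEntries[] = {kernelAEntry, kernelBEntry};
__tgt_offload_entry *explicitBegin = explicitEntries;
__tgt_offload_entry *explicitEnd =
    explicitEntries + sizeof(explicitEntries) / sizeof(explicitEntries[0]);
```
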

Contributor


It sounds like Clang can't be used with LLJIT in that case, or, if it can, there is already a solution in Clang. I think making this work for both Clang and MLIR would be useful. If there is already a solution in Clang, it should be migrated to the OpenMPIRBuilder.

fabianmcg (Contributor Author)


The real problem is the lack of comprehensive support for linker sections in LLJIT, so I wouldn't say clang or the clang-linker-wrapper is at fault. The easiest solution I found was complying with LLJIT.
I think @jhuber6 was looking into changing the registration mechanism for LibOMPTarget binaries, so maybe we can find a solution that works for everyone.

Contributor


I don't have the full view of what LLJIT does here, but the use case in clang is that we need each TU to be able to emit values that need to be registered by the runtime. There are a few alternate solutions to this, but having the linker handle it is the best overall. The rework I was talking about was simply to change the offloading entry struct so it's more generic.

How does LLJIT work exactly? If you put globals into a section, they will generally appear in order, so if you had a pointer to the first and last globals in that section you could just traverse it once it has gone through the backend. This is somewhat similar to the COFF linker handling, which just gives an object at the beginning and end of the others in that section.
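
For reference, the COFF trick mentioned above looks roughly like the sketch below, which uses MSVC-style section pragmas. The section names, sentinel globals, and the `visitEntries` helper are all illustrative, and real implementations also skip any null padding the linker may insert between contributions.

```cpp
#include <cstddef>
#include <cstdint>

struct OffloadEntry {
  void *addr;
  char *name;
  size_t size;
  int32_t flags;
  int32_t reserved;
};

// COFF sorts grouped sections by the suffix after '$', so "oentries$A" comes
// before "oentries$M", which comes before "oentries$Z".
#pragma section("oentries$A")
#pragma section("oentries$M")
#pragma section("oentries$Z")

__declspec(allocate("oentries$A")) OffloadEntry entriesBegin{};    // sentinel
__declspec(allocate("oentries$M")) OffloadEntry someKernelEntry{}; // a real entry
__declspec(allocate("oentries$Z")) OffloadEntry entriesEnd{};      // sentinel

// Everything between the two sentinels is an offload entry, so the runtime can
// traverse the section without anyone building an explicit array.
void visitEntries(void (*visit)(const OffloadEntry &)) {
  for (const OffloadEntry *it = &entriesBegin + 1; it != &entriesEnd; ++it)
    visit(*it);
}
```
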

fabianmcg (Contributor Author)


> There are a few alternate solutions to this, but having the linker handle it is the best overall.

I agree; I think the best solution would be to make LLJIT work.

> The rework I was talking about was to simply change the offloading entry struct so it's more generic.

I see.

> How does LLJIT work exactly?

Honestly, I'm not 100% sure; I only know that the same IR works when linked with a regular linker but fails with LLJIT.
I asked on the LLJIT Discord a couple of months ago why it was not picking up the symbols, and didn't get an answer.

I'll inquire further with them and come back with a more definitive answer.
