[OffloadBundler] Rework the ctor of OffloadTargetInfo to support AMDGPU's generic target #122629

Merged: 4 commits, Mar 18, 2025
11 changes: 5 additions & 6 deletions clang/docs/ClangOffloadBundler.rst
@@ -266,15 +266,14 @@ without differentiation based on offload kind.
The target triple of the code object. See `Target Triple
<https://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_.

The bundler accepts target triples with or without the optional environment
field:
LLVM target triples can be with or without the optional environment field:

``<arch><sub>-<vendor>-<sys>``, or
``<arch><sub>-<vendor>-<sys>-<env>``

However, in order to standardize outputs for tools that consume bitcode
bundles, bundles written by the bundler internally use only the 4-field
target triple:
However, in order to standardize outputs for tools that consume bitcode bundles
and to parse target ID containing dashes, the bundler only accepts target
triples in the 4-field format:

``<arch><sub>-<vendor>-<sys>-<env>``

@@ -543,4 +542,4 @@ The compressed offload bundle begins with a header followed by the compressed bi
- **Compressed Data**:
The actual compressed binary data follows the header. Its size can be inferred from the total size of the file minus the header size.

> **Note**: Version 3 of the format is under development. It uses 64-bit fields for Total File Size and Uncompressed Binary Size to support files larger than 4GB. To experiment with version 3, set the environment variable `COMPRESSED_BUNDLE_FORMAT_VERSION=3`. This support is experimental and not recommended for production use.
> **Note**: Version 3 of the format is under development. It uses 64-bit fields for Total File Size and Uncompressed Binary Size to support files larger than 4GB. To experiment with version 3, set the environment variable `COMPRESSED_BUNDLE_FORMAT_VERSION=3`. This support is experimental and not recommended for production use.
5 changes: 5 additions & 0 deletions clang/include/clang/Driver/OffloadBundler.h
@@ -158,6 +158,11 @@ class CompressedOffloadBundle {
static llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
decompress(const llvm::MemoryBuffer &Input, bool Verbose = false);
};

/// Check whether the bundle id is in the following format:
/// <kind>-<triple>[-<target id>[:target features]]
/// <triple> := <arch>-<vendor>-<os>-<env>
bool checkOffloadBundleID(const llvm::StringRef Str);
} // namespace clang

#endif // LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
78 changes: 52 additions & 26 deletions clang/lib/Driver/OffloadBundler.cpp
@@ -83,32 +83,27 @@ OffloadTargetInfo::OffloadTargetInfo(const StringRef Target,
const OffloadBundlerConfig &BC)
: BundlerConfig(BC) {

// TODO: Add error checking from ClangOffloadBundler.cpp
auto TargetFeatures = Target.split(':');
auto TripleOrGPU = TargetFeatures.first.rsplit('-');

if (clang::StringToOffloadArch(TripleOrGPU.second) !=
clang::OffloadArch::UNKNOWN) {
auto KindTriple = TripleOrGPU.first.split('-');
this->OffloadKind = KindTriple.first;

// Enforce optional env field to standardize bundles
llvm::Triple t = llvm::Triple(KindTriple.second);
this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
t.getOSName(), t.getEnvironmentName());

this->TargetID = Target.substr(Target.find(TripleOrGPU.second));
} else {
auto KindTriple = TargetFeatures.first.split('-');
this->OffloadKind = KindTriple.first;

// Enforce optional env field to standardize bundles
llvm::Triple t = llvm::Triple(KindTriple.second);
this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
t.getOSName(), t.getEnvironmentName());

// <kind>-<triple>[-<target id>[:target features]]
// <triple> := <arch>-<vendor>-<os>-<env>
SmallVector<StringRef, 6> Components;
Target.split(Components, '-', /*MaxSplit=*/5);
assert((Components.size() == 5 || Components.size() == 6) &&
"malformed target string");

StringRef TargetIdWithFeature =
Components.size() == 6 ? Components.back() : "";
StringRef TargetId = TargetIdWithFeature.split(':').first;
if (!TargetId.empty() &&
clang::StringToOffloadArch(TargetId) != clang::OffloadArch::UNKNOWN)
this->TargetID = TargetIdWithFeature;
else
this->TargetID = "";
}

this->OffloadKind = Components.front();
ArrayRef<StringRef> TripleSlice{&Components[1], /*length=*/4};
llvm::Triple T = llvm::Triple(llvm::join(TripleSlice, "-"));
this->Triple = llvm::Triple(T.getArchName(), T.getVendorName(), T.getOSName(),
T.getEnvironmentName());
}

bool OffloadTargetInfo::hasHostKind() const {
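The reworked constructor splits the bundle id on `-` at most five times, so everything after the fifth dash stays together as the target id. A minimal standalone sketch of that idea follows, using only the standard library. `BundleId`, `splitMax`, and `parseBundleId` are hypothetical names for illustration, and the sketch skips the `StringToOffloadArch` lookup that the real constructor applies before accepting a target id.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-data stand-in for OffloadTargetInfo (illustration only).
struct BundleId {
  std::string kind;
  std::string triple;   // always four fields; the last one may be empty
  std::string targetId; // may carry ":feature" suffixes; empty if absent
};

// Split on `sep` at most `maxSplit` times, keeping empty fields. This is
// meant to mirror the split call in the diff, assuming empty fields are kept.
static std::vector<std::string> splitMax(const std::string &s, char sep,
                                         int maxSplit) {
  std::vector<std::string> parts;
  std::size_t start = 0;
  while (maxSplit-- > 0) {
    std::size_t pos = s.find(sep, start);
    if (pos == std::string::npos)
      break;
    parts.push_back(s.substr(start, pos - start));
    start = pos + 1;
  }
  parts.push_back(s.substr(start)); // remainder, including any further dashes
  return parts;
}

// Returns false for a malformed id instead of asserting.
bool parseBundleId(const std::string &id, BundleId &out) {
  std::vector<std::string> p = splitMax(id, '-', 5);
  if (p.size() != 5 && p.size() != 6)
    return false;
  out.kind = p[0];
  out.triple = p[1] + "-" + p[2] + "-" + p[3] + "-" + p[4];
  out.targetId = p.size() == 6 ? p[5] : std::string();
  return true;
}
```

Under these assumptions, `hip-amdgcn-amd-amdhsa--gfx9-generic` yields the target id `gfx9-generic` (an illustrative dashed name), while the older spelling `hip-amdgcn-amd-amdhsa-gfx906` has only five components, so `gfx906` lands in the environment field of the triple and the target id is empty. That appears to be why the tests in this PR switch to the double-dash form.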
@@ -148,7 +143,18 @@ bool OffloadTargetInfo::operator==(const OffloadTargetInfo &Target) const {
}

std::string OffloadTargetInfo::str() const {
return Twine(OffloadKind + "-" + Triple.str() + "-" + TargetID).str();
std::string NormalizedTriple;
// Unfortunately we need some special sauce for AMDGPU because all the runtime
// assumes the triple to be "amdgcn-amd-amdhsa-" (empty environment) instead
// of "amdgcn-amd-amdhsa-unknown". It's gonna be very tricky to patch
// different layers of runtime.
if (Triple.isAMDGPU()) {
NormalizedTriple = Triple.normalize(Triple::CanonicalForm::THREE_IDENT);
NormalizedTriple.push_back('-');
} else {
NormalizedTriple = Triple.normalize(Triple::CanonicalForm::FOUR_IDENT);
}
return Twine(OffloadKind + "-" + NormalizedTriple + "-" + TargetID).str();
}

static StringRef getDeviceFileExtension(StringRef Device,
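The AMDGPU special case in `str()` can be sketched without LLVM. The code below is a simplification under stated assumptions: `Fields`, `joinDash`, `padOrTrim`, and `bundleEntryId` are hypothetical names, a triple is modeled as a plain list of dash-separated fields, and `Triple::normalize` is reduced to padding with `unknown` or truncating (the real function does more). The AMDGPU test is approximated by checking for the `amdgcn` or `r600` architecture names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Triple: just its dash-separated fields.
typedef std::vector<std::string> Fields;

static std::string joinDash(const Fields &f) {
  std::string out;
  for (std::size_t i = 0; i < f.size(); ++i) {
    if (i)
      out += '-';
    out += f[i];
  }
  return out;
}

// Simplified model of normalizing to a fixed field count: missing fields
// become "unknown", extra fields are dropped.
static Fields padOrTrim(Fields f, std::size_t n) {
  f.resize(n, "unknown");
  return f;
}

std::string bundleEntryId(const std::string &kind, const Fields &triple,
                          const std::string &targetId) {
  // Assumption: AMDGPU means the arch field is "amdgcn" or "r600".
  bool isAMDGPU =
      !triple.empty() && (triple[0] == "amdgcn" || triple[0] == "r600");
  std::string normalized;
  if (isAMDGPU) {
    // Keep arch-vendor-os and append a dash, leaving the environment empty.
    normalized = joinDash(padOrTrim(triple, 3));
    normalized.push_back('-');
  } else {
    normalized = joinDash(padOrTrim(triple, 4));
  }
  return kind + "-" + normalized + "-" + targetId;
}
```

With this model, an AMDGPU entry with target id `gfx906` comes out as `hip-amdgcn-amd-amdhsa--gfx906`, which matches the double-dash ids expected by the tests in this PR. Other targets get a four-field triple, and the separator before the target id is always emitted.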
@@ -1507,6 +1513,9 @@ Error OffloadBundler::UnbundleFiles() {
StringMap<StringRef> Worklist;
auto Output = BundlerConfig.OutputFileNames.begin();
for (auto &Triple : BundlerConfig.TargetNames) {
if (!checkOffloadBundleID(Triple))
return createStringError(errc::invalid_argument,
"invalid bundle id from bundle config");
Worklist[Triple] = *Output;
++Output;
}
@@ -1526,6 +1535,9 @@ Error OffloadBundler::UnbundleFiles() {

StringRef CurTriple = **CurTripleOrErr;
assert(!CurTriple.empty());
if (!checkOffloadBundleID(CurTriple))
return createStringError(errc::invalid_argument,
"invalid bundle id read from the bundle");

auto Output = Worklist.begin();
for (auto E = Worklist.end(); Output != E; Output++) {
@@ -1584,6 +1596,8 @@ Error OffloadBundler::UnbundleFiles() {
return createFileError(E.second, EC);

// If this entry has a host kind, copy the input file to the output file.
// We don't need to check E.getKey() here through checkOffloadBundleID
// because the entire WorkList has been checked above.
auto OffloadInfo = OffloadTargetInfo(E.getKey(), BundlerConfig);
if (OffloadInfo.hasHostKind())
OutputFile.write(Input.getBufferStart(), Input.getBufferSize());
@@ -1813,6 +1827,10 @@ Error OffloadBundler::UnbundleArchive() {
// archive.
while (!CodeObject.empty()) {
SmallVector<StringRef> CompatibleTargets;
if (!checkOffloadBundleID(CodeObject)) {
return createStringError(errc::invalid_argument,
"Invalid bundle id read from code object");
}
auto CodeObjectInfo = OffloadTargetInfo(CodeObject, BundlerConfig);
if (getCompatibleOffloadTargets(CodeObjectInfo, CompatibleTargets,
BundlerConfig)) {
@@ -1894,3 +1912,11 @@ Error OffloadBundler::UnbundleArchive() {

return Error::success();
}

bool clang::checkOffloadBundleID(const llvm::StringRef Str) {
// <kind>-<triple>[-<target id>[:target features]]
// <triple> := <arch>-<vendor>-<os>-<env>
SmallVector<StringRef, 6> Components;
Str.split(Components, '-', /*MaxSplit=*/5);
return Components.size() == 5 || Components.size() == 6;
}
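Because the split is capped at five and (assuming empty fields are kept, which is believed to be the `StringRef::split` default) every dash produces a component, the 5-or-6 component test reduces to "the string contains at least four dashes". The sketch below states that equivalent form with the standard library. `looksLikeBundleId` is a hypothetical name, not part of the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Shape check only: four dashes give five components, and five or more
// dashes give six once the split is capped at five.
bool looksLikeBundleId(const std::string &id) {
  return std::count(id.begin(), id.end(), '-') >= 4;
}
```

This only validates the shape of the id. It does not confirm that the kind, the triple fields, or the target id are meaningful, so `hip-amdgcn-amd-amdhsa-gfx906` still passes even though it lacks the empty environment field.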
6 changes: 4 additions & 2 deletions clang/lib/Driver/ToolChains/Clang.cpp
@@ -8978,7 +8978,8 @@ void OffloadBundler::ConstructJob(Compilation &C, const JobAction &JA,
}
Triples += Action::GetOffloadKindName(CurKind);
Triples += '-';
Triples += CurTC->getTriple().normalize();
Triples +=
CurTC->getTriple().normalize(llvm::Triple::CanonicalForm::FOUR_IDENT);
if ((CurKind == Action::OFK_HIP || CurKind == Action::OFK_Cuda) &&
!StringRef(CurDep->getOffloadingArch()).empty()) {
Triples += '-';
@@ -9072,7 +9073,8 @@ void OffloadBundler::ConstructJobMultipleOutputs(
auto &Dep = DepInfo[I];
Triples += Action::GetOffloadKindName(Dep.DependentOffloadKind);
Triples += '-';
Triples += Dep.DependentToolChain->getTriple().normalize();
Triples += Dep.DependentToolChain->getTriple().normalize(
llvm::Triple::CanonicalForm::FOUR_IDENT);
if ((Dep.DependentOffloadKind == Action::OFK_HIP ||
Dep.DependentOffloadKind == Action::OFK_Cuda) &&
!Dep.DependentBoundArch.empty()) {
3 changes: 2 additions & 1 deletion clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -2577,7 +2577,8 @@ static void GetSDLFromOffloadArchive(
SmallString<128> DeviceTriple;
DeviceTriple += Action::GetOffloadKindName(JA.getOffloadingDeviceKind());
DeviceTriple += '-';
std::string NormalizedTriple = T.getToolChain().getTriple().normalize();
std::string NormalizedTriple = T.getToolChain().getTriple().normalize(
llvm::Triple::CanonicalForm::FOUR_IDENT);
DeviceTriple += NormalizedTriple;
if (!Target.empty()) {
DeviceTriple += '-';
2 changes: 1 addition & 1 deletion clang/lib/Driver/ToolChains/HIPUtility.cpp
@@ -45,7 +45,7 @@ static std::string normalizeForBundler(const llvm::Triple &T,
return HasTargetID ? (T.getArchName() + "-" + T.getVendorName() + "-" +
T.getOSName() + "-" + T.getEnvironmentName())
.str()
: T.normalize();
: T.normalize(llvm::Triple::CanonicalForm::FOUR_IDENT);
}

// Collect undefined __hip_fatbin* and __hip_gpubin_handle* symbols from all
6 changes: 3 additions & 3 deletions clang/test/Driver/clang-offload-bundler-asserts-on.c
@@ -15,18 +15,18 @@
// Check code object compatibility for archive unbundling
//
// Create few code object bundles and archive them to create an input archive
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa-gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.simple.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.simple.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:sramecc+:xnack+,openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+ -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID1.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-,openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack- -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID2.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:xnack-,openmp-amdgcn-amd-amdhsa--gfx908:xnack- -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID3.bundle
// RUN: llvm-ar cr %t.input-archive.a %t.simple.bundle %t.targetID1.bundle %t.targetID2.bundle %t.targetID3.bundle

// Tests to check compatibility between Bundle Entry ID formats i.e. between presence/absence of extra hyphen in case of missing environment field
// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa-gfx908 -input=%t.input-archive.a -output=%t-archive-gfx906-simple.a -output=%t-archive-gfx908-simple.a -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLECOMPATIBILITY
// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.input-archive.a -output=%t-archive-gfx906-simple.a -output=%t-archive-gfx908-simple.a -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLECOMPATIBILITY
// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa--gfx906] : [Target: openmp-amdgcn-amd-amdhsa--gfx906]
// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa--gfx908] : [Target: openmp-amdgcn-amd-amdhsa--gfx908]

// RUN: clang-offload-bundler -unbundle -type=a -targets=hip-amdgcn-amd-amdhsa--gfx906,hipv4-amdgcn-amd-amdhsa-gfx908 -input=%t.input-archive.a -output=%t-hip-archive-gfx906-simple.a -output=%t-hipv4-archive-gfx908-simple.a -hip-openmp-compatible -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=HIPOpenMPCOMPATIBILITY
// RUN: clang-offload-bundler -unbundle -type=a -targets=hip-amdgcn-amd-amdhsa--gfx906,hipv4-amdgcn-amd-amdhsa--gfx908 -input=%t.input-archive.a -output=%t-hip-archive-gfx906-simple.a -output=%t-hipv4-archive-gfx908-simple.a -hip-openmp-compatible -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=HIPOpenMPCOMPATIBILITY
// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible [CodeObject: openmp-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible [CodeObject: openmp-amdgcn-amd-amdhsa--gfx908] : [Target: hipv4-amdgcn-amd-amdhsa--gfx908]

18 changes: 5 additions & 13 deletions clang/test/Driver/clang-offload-bundler-standardize.c
@@ -15,20 +15,12 @@
//
// Check code object compatibility for archive unbundling
//
// Create an object bundle with and without env fields
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,hip-amdgcn-amd-amdhsa-gfx906,hip-amdgcn-amd-amdhsa-gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.no.env
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple-,hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.env
// Create an object bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle


// Unbundle bundle.no.env while providing targets with env
// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.bundle.no.env -output=%t-hip-amdgcn-amd-amdhsa--gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa--gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-NO-ENV
// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx908] : [Target: hip-amdgcn-amd-amdhsa--gfx908]

// Unbundle bundle.env while providing targets with no env
// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa-gfx906,hip-amdgcn-amd-amdhsa-gfx908 -input=%t.bundle.env -output=%t-hip-amdgcn-amd-amdhsa-gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa-gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-ENV
// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx908] : [Target: hip-amdgcn-amd-amdhsa--gfx908]
// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.bundle -output=%t-hip-amdgcn-amd-amdhsa--gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa--gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE
// BUNDLE: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
// BUNDLE: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx908] : [Target: hip-amdgcn-amd-amdhsa--gfx908]

// Some code so that we can create a binary out of this file.
int A = 0;