Skip to content

[AMDGPU][SplitModule] Handle !callees metadata #108802

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
63 changes: 48 additions & 15 deletions llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -482,6 +482,29 @@ void SplitGraph::Node::visitAllDependencies(
}
}

/// Checks if \p I has MD_callees and if it does, parse it and put the function
/// in \p Callees.
///
/// \returns true if there was metadata and it was parsed correctly. false if
/// there was no MD or if it contained unknown entries and parsing failed.
/// If this returns false, \p Callees will contain incomplete information
/// and must not be used.
static bool handleCalleesMD(const Instruction &I,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still think the call graph analysis should be handling this

Copy link
Contributor Author

@Pierre-vh Pierre-vh Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it land as is or do you want me to try changing the CG first?
I'm not sure if that'd need a RFC

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can start with it here, but I've always found it to be a flaw. It's also busted for the CGInSCC order with aliases (which I guess we're now less dependent on)

SetVector<Function *> &Callees) {
auto *MD = I.getMetadata(LLVMContext::MD_callees);
if (!MD)
return false;

for (const auto &Op : MD->operands()) {
Function *Callee = mdconst::extract_or_null<Function>(Op);
if (!Callee)
return false;
Callees.insert(Callee);
}

return true;
}

void SplitGraph::buildGraph(CallGraph &CG) {
SplitModuleTimer SMT("buildGraph", "graph construction");
LLVM_DEBUG(
Expand Down Expand Up @@ -519,28 +542,38 @@ void SplitGraph::buildGraph(CallGraph &CG) {
Fn.printAsOperand(dbgs());
dbgs() << " - analyzing function\n");

bool HasIndirectCall = false;
SetVector<Function *> KnownCallees;
bool HasUnknownIndirectCall = false;
for (const auto &Inst : instructions(Fn)) {
// look at all calls without a direct callee.
if (const auto *CB = dyn_cast<CallBase>(&Inst);
CB && !CB->getCalledFunction()) {
// inline assembly can be ignored, unless InlineAsmIsIndirectCall is
// true.
if (CB->isInlineAsm()) {
LLVM_DEBUG(dbgs() << " found inline assembly\n");
continue;
}

// everything else is handled conservatively.
HasIndirectCall = true;
break;
const auto *CB = dyn_cast<CallBase>(&Inst);
if (!CB || CB->getCalledFunction())
continue;

// inline assembly can be ignored, unless InlineAsmIsIndirectCall is
// true.
if (CB->isInlineAsm()) {
LLVM_DEBUG(dbgs() << " found inline assembly\n");
continue;
}

if (handleCalleesMD(Inst, KnownCallees))
continue;
// If we failed to parse any !callees MD, or some was missing,
// the entire KnownCallees list is now unreliable.
Comment on lines +562 to +563
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like the verifier is missing validation on callees entries? You shouldn't really have to worry about parsing it

KnownCallees.clear();

// Everything else is handled conservatively. If we fall into the
// conservative case don't bother analyzing further.
HasUnknownIndirectCall = true;
break;
}

if (HasIndirectCall) {
if (HasUnknownIndirectCall) {
LLVM_DEBUG(dbgs() << " indirect call found\n");
FnsWithIndirectCalls.push_back(&Fn);
}
} else if (!KnownCallees.empty())
DirectCallees.insert(KnownCallees.begin(), KnownCallees.end());
}

Node &N = getNode(Cache, Fn);
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
; RUN: sed -s 's/_MD_/, !callees !{ptr @CallCandidate0}/' %s | llvm-split -o %t -j 3 -mtriple amdgcn-amd-amdhsa
; RUN: llvm-dis -o - %t0 | FileCheck --check-prefix=CHECK0 --implicit-check-not=define %s
; RUN: llvm-dis -o - %t1 | FileCheck --check-prefix=CHECK1 --implicit-check-not=define %s
; RUN: llvm-dis -o - %t2 | FileCheck --check-prefix=CHECK2 --implicit-check-not=define %s

; RUN: sed -s 's/_MD_//g' %s | llvm-split -o %t-nomd -j 3 -mtriple amdgcn-amd-amdhsa
; RUN: llvm-dis -o - %t-nomd0 | FileCheck --check-prefix=CHECK-NOMD0 --implicit-check-not=define %s
; RUN: llvm-dis -o - %t-nomd1 | FileCheck --check-prefix=CHECK-NOMD1 --implicit-check-not=define %s
; RUN: llvm-dis -o - %t-nomd2 | FileCheck --check-prefix=CHECK-NOMD2 --implicit-check-not=define %s

; CHECK0: define internal void @HelperC
; CHECK0: define amdgpu_kernel void @C

; CHECK1: define hidden void @CallCandidate1
; CHECK1: define internal void @HelperB
; CHECK1: define amdgpu_kernel void @B

; CHECK2: define internal void @HelperA
; CHECK2: define hidden void @CallCandidate0
; CHECK2: define amdgpu_kernel void @A

; CHECK-NOMD0: define internal void @HelperC
; CHECK-NOMD0: define amdgpu_kernel void @C

; CHECK-NOMD1: define internal void @HelperB
; CHECK-NOMD1: define amdgpu_kernel void @B

; CHECK-NOMD2: define internal void @HelperA
; CHECK-NOMD2: define hidden void @CallCandidate0
; CHECK-NOMD2: define hidden void @CallCandidate1
; CHECK-NOMD2: define amdgpu_kernel void @A

@addrthief = global [2 x ptr] [ptr @CallCandidate0, ptr @CallCandidate1]

define internal void @HelperA(ptr %call) {
call void %call() _MD_
ret void
}

define internal void @CallCandidate0() {
ret void
}

define internal void @CallCandidate1() {
ret void
}

define internal void @HelperB() {
ret void
}

define internal void @HelperC() {
ret void
}

define amdgpu_kernel void @A(ptr %call) {
call void @HelperA(ptr %call)
ret void
}

define amdgpu_kernel void @B() {
call void @HelperB()
ret void
}

define amdgpu_kernel void @C() {
call void @HelperC()
ret void
}
Loading