Commit 26a8664

[MemProf] Handle missing tail call frames (#75823)
If tail call optimization was not disabled for the profiled binary, the call contexts will be missing frames for tail calls. Handle this by performing a limited search through tail call edges for the profiled callee when a discontinuity is detected. The search depth is adjustable but defaults to 5. If we are able to identify a short sequence of tail calls, update the graph for those calls. In the case of ThinLTO, synthesize the necessary CallsiteInfos for carrying the cloning information to the backends.
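
The depth-limited search described above can be illustrated with a small standalone sketch. It is not the code from this commit, which operates on LLVM's own call graph structures and is not shown in this excerpt. The names `TailCallGraph`, `searchTailCalls`, and `findMissingTailCallFrames` are hypothetical, and functions are identified by plain strings for simplicity. Only the default depth of 5 is taken from the commit message.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy graph: each function maps to the functions it tail calls.
using TailCallGraph =
    std::unordered_map<std::string, std::vector<std::string>>;

// Depth-first search from Cur along tail call edges, following at most
// Remaining edges. On success, Chain holds the functions visited on the path
// (the frames missing from the profiled context), ending just before Target.
static bool searchTailCalls(const TailCallGraph &G, const std::string &Cur,
                            const std::string &Target, unsigned Remaining,
                            std::vector<std::string> &Chain) {
  if (Remaining == 0)
    return false;
  auto It = G.find(Cur);
  if (It == G.end())
    return false;
  Chain.push_back(Cur);
  for (const std::string &Next : It->second)
    if (Next == Target ||
        searchTailCalls(G, Next, Target, Remaining - 1, Chain))
      return true;
  Chain.pop_back(); // Dead end: Cur is not on a path to Target.
  return false;
}

// IRCallee is the callee seen in the IR; ProfiledCallee is the next frame in
// the profiled context. They differ when a discontinuity was detected.
std::optional<std::vector<std::string>>
findMissingTailCallFrames(const TailCallGraph &G, const std::string &IRCallee,
                          const std::string &ProfiledCallee,
                          unsigned MaxDepth = 5) {
  assert(IRCallee != ProfiledCallee && "no discontinuity to repair");
  std::vector<std::string> Chain;
  if (searchTailCalls(G, IRCallee, ProfiledCallee, MaxDepth, Chain))
    return Chain;
  return std::nullopt;
}
```

In this sketch the depth limit also guarantees termination on cyclic tail call graphs, since every recursive step consumes one unit of the remaining depth.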
1 parent dc717b1 commit 26a8664

7 files changed: +978 -55 lines changed

llvm/include/llvm/IR/ModuleSummaryIndex.h

Lines changed: 6 additions & 0 deletions
@@ -1011,6 +1011,12 @@ class FunctionSummary : public GlobalValueSummary {
     return *Callsites;
   }
 
+  void addCallsite(CallsiteInfo &Callsite) {
+    if (!Callsites)
+      Callsites = std::make_unique<CallsitesTy>();
+    Callsites->push_back(Callsite);
+  }
+
   ArrayRef<AllocInfo> allocs() const {
     if (Allocs)
       return *Allocs;

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp

Lines changed: 16 additions & 1 deletion
@@ -459,9 +459,24 @@ class IndexBitcodeWriter : public BitcodeWriterBase {
       // Record all stack id indices actually used in the summary entries being
       // written, so that we can compact them in the case of distributed ThinLTO
       // indexes.
-      for (auto &CI : FS->callsites())
+      for (auto &CI : FS->callsites()) {
+        // If the stack id list is empty, this callsite info was synthesized for
+        // a missing tail call frame. Ensure that the callee's GUID gets a value
+        // id. Normally we only generate these for defined summaries, which in
+        // the case of distributed ThinLTO is only the functions already defined
+        // in the module or that we want to import. We don't bother to include
+        // all the callee symbols as they aren't normally needed in the backend.
+        // However, for the synthesized callsite infos we do need the callee
+        // GUID in the backend so that we can correlate the identified callee
+        // with this callsite info (which for non-tail calls is done by the
+        // ordering of the callsite infos and verified via stack ids).
+        if (CI.StackIdIndices.empty()) {
+          GUIDToValueIdMap[CI.Callee.getGUID()] = ++GlobalValueId;
+          continue;
+        }
         for (auto Idx : CI.StackIdIndices)
           StackIdIndices.push_back(Idx);
+      }
       for (auto &AI : FS->allocs())
         for (auto &MIB : AI.MIBs)
           for (auto Idx : MIB.StackIdIndices)
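
The rule in the comment above (an empty stack id list marks a synthesized callsite whose callee GUID needs a value id) can be tried out in isolation with the following simplified model. It is only a sketch: `ToyCallsite`, `ToyWriterState`, `recordCallsites`, and `valueIdFor` are hypothetical stand-ins, not LLVM types or functions, and the model leaves out everything else the real writer tracks.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified stand-in for a callsite summary record.
struct ToyCallsite {
  uint64_t CalleeGUID;
  // Empty means the record was synthesized for a missing tail call frame.
  std::vector<unsigned> StackIdIndices;
};

// Hypothetical, simplified stand-in for the writer's bookkeeping.
struct ToyWriterState {
  std::map<uint64_t, unsigned> GUIDToValueId;
  std::vector<unsigned> UsedStackIdIndices;
  unsigned LastValueId = 0;
};

// Mirrors the shape of the loop in the hunk above: synthesized callsites get
// a value id for their callee GUID, all others contribute stack id indices.
void recordCallsites(const std::vector<ToyCallsite> &Callsites,
                     ToyWriterState &S) {
  for (const ToyCallsite &CI : Callsites) {
    if (CI.StackIdIndices.empty()) {
      S.GUIDToValueId[CI.CalleeGUID] = ++S.LastValueId;
      continue;
    }
    S.UsedStackIdIndices.insert(S.UsedStackIdIndices.end(),
                                CI.StackIdIndices.begin(),
                                CI.StackIdIndices.end());
  }
}

// Looks up the value id recorded for a synthesized callsite's callee.
unsigned valueIdFor(const ToyWriterState &S, uint64_t GUID) {
  auto It = S.GUIDToValueId.find(GUID);
  assert(It != S.GUIDToValueId.end() && "callee GUID has no value id");
  return It->second;
}
```

In this model only callees of synthesized callsites receive a value id, which matches the comment's point that callee symbols are otherwise not needed in the backend.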