Commit 008c875

[AMDGPU] Fix excessive stack usage in SIInsertWaitcnts::run (#134835)
Noticed on Windows when running LLVM as part of a graphics driver, with total stack usage limited to about 128 KB. In some cases this function would overflow the stack. On Linux this reduces stack usage in this function from about 32 KB to about 0.5 KB.
1 parent 7e1b76c commit 008c875

1 file changed: +9 -4 lines changed

llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp

Lines changed: 9 additions & 4 deletions
@@ -2623,12 +2623,17 @@ bool SIInsertWaitcnts::run(MachineFunction &MF) {
           else
             *Brackets = *BI.Incoming;
         } else {
-          if (!Brackets)
+          if (!Brackets) {
             Brackets = std::make_unique<WaitcntBrackets>(
                 ST, MaxCounter, Limits, WaitEventMaskForInst, SmemAccessCounter);
-          else
-            *Brackets = WaitcntBrackets(ST, MaxCounter, Limits,
-                                        WaitEventMaskForInst, SmemAccessCounter);
+          } else {
+            // Reinitialize in-place. N.B. do not do this by assigning from a
+            // temporary because the WaitcntBrackets class is large and it could
+            // cause this function to use an unreasonable amount of stack space.
+            Brackets->~WaitcntBrackets();
+            new (Brackets.get()) WaitcntBrackets(
+                ST, MaxCounter, Limits, WaitEventMaskForInst, SmemAccessCounter);
+          }
         }
 
         Modified |= insertWaitcntInBlock(MF, *MBB, *Brackets);
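
The pattern in this change can be tried outside LLVM. The following is a minimal sketch, not the LLVM code: BigState and reinitializeInPlace are hypothetical names standing in for WaitcntBrackets and the logic inside SIInsertWaitcnts::run, and the 32 KB array size is an assumption chosen only to make the object large. It shows a heap-allocated object being reset by destroying it and constructing a new one in the same storage, so no temporary of that size is needed in the caller's stack frame.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical stand-in for a large state class such as WaitcntBrackets.
// The constructor is noexcept so that construction cannot fail after the
// old object has already been destroyed.
struct BigState {
  explicit BigState(int Seed) noexcept : Seed(Seed), Data{} {}
  int Seed;
  unsigned char Data[32 * 1024]; // assumed size, large on purpose
};

// Resets State to a freshly constructed BigState.
// Writing `*State = BigState(Seed);` instead would typically materialize
// a BigState temporary in this function's stack frame before assigning.
inline void reinitializeInPlace(std::unique_ptr<BigState> &State, int Seed) {
  if (!State) {
    // First use: allocate and construct on the heap.
    State = std::make_unique<BigState>(Seed);
    return;
  }
  State->~BigState();                 // end the old object's lifetime
  ::new (State.get()) BigState(Seed); // construct anew in the same storage
}
```

Calling reinitializeInPlace twice on the same unique_ptr keeps the same heap address and leaves the object in its freshly constructed state. The approach relies on the constructor not throwing (LLVM is normally built with exceptions disabled); if it could throw, the unique_ptr would be left owning an already-destroyed object.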
