Commit 5571e53

[AArch64][SME] Fix 'addvl' addressing to scavenged stackslot.
In https://reviews.llvm.org/D159196 we avoided stackslot scavenging when no frame pointer (FP) was available. When an FP is available, however, we need to prefer it over the base pointer (BP).

This change affects more than just SME, but in general it should be an improvement: any slot above the address held in FP is always closer to FP than to BP, so it makes sense to favour FP when addressing it.

This also fixes the issue for SME, where using the FP is not just preferred but required.
1 parent fe7a044 commit 5571e53

2 files changed: +8 -11 lines changed


llvm/lib/Target/AArch64/AArch64FrameLowering.cpp

Lines changed: 5 additions & 6 deletions
@@ -2757,7 +2757,11 @@ StackOffset AArch64FrameLowering::resolveFrameOffsetReference(
     bool FPOffsetFits = !ForSimm || FPOffset >= -256;
     PreferFP |= Offset > -FPOffset && !SVEStackSize;
 
-    if (MFI.hasVarSizedObjects()) {
+    if (FPOffset >= 0) {
+      // If the FPOffset is positive, that'll always be best, as the SP/BP
+      // will be even further away.
+      UseFP = true;
+    } else if (MFI.hasVarSizedObjects()) {
       // If we have variable sized objects, we can use either FP or BP, as the
       // SP offset is unknown. We can use the base pointer if we have one and
       // FP is not preferred. If not, we're stuck with using FP.
@@ -2769,11 +2773,6 @@ StackOffset AArch64FrameLowering::resolveFrameOffsetReference(
       // else we can use BP and FP, but the offset from FP won't fit.
       // That will make us scavenge registers which we can probably avoid by
       // using BP. If it won't fit for BP either, we'll scavenge anyway.
-    } else if (FPOffset >= 0) {
-      // Use SP or FP, whichever gives us the best chance of the offset
-      // being in range for direct access. If the FPOffset is positive,
-      // that'll always be best, as the SP will be even further away.
-      UseFP = true;
     } else if (MF.hasEHFunclets() && !RegInfo->hasBasePointer(MF)) {
       // Funclets access the locals contained in the parent's stack frame
       // via the frame pointer, so we have to use the FP in the parent
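
The effect of moving the `FPOffset >= 0` check ahead of the `hasVarSizedObjects()` check can be illustrated with a small standalone model. This is only a sketch: `Base`, `SlotQuery`, `chooseBefore`, `chooseAfter` and the helper functions are hypothetical names invented for illustration, not LLVM's API. The model deliberately ignores `FPOffsetFits` and every other branch of the real function, and the scenario values (an FP offset of 28, a base pointer present, `PreferFP` false) are assumptions modelled on the updated test.

```cpp
#include <cassert>

// Hypothetical, simplified model of the base-register choice. These names are
// illustrative only; FPOffsetFits and the remaining branches of the real
// function are intentionally left out.
enum class Base { FP, BP, Undecided };

struct SlotQuery {
  long FPOffset;           // byte offset of the slot relative to FP
  bool HasVarSizedObjects; // frame contains dynamically sized objects
  bool HasBasePointer;     // a base pointer register is available
  bool PreferFP;           // FP was already preferred by earlier heuristics
};

// Variable-sized-objects case, per the source comment: use BP when one exists
// and FP is not preferred, otherwise fall back to FP.
inline Base chooseForVarSized(const SlotQuery &Q) {
  return (Q.HasBasePointer && !Q.PreferFP) ? Base::BP : Base::FP;
}

// Branch order before the commit: the variable-sized check comes first, so a
// non-negative FP offset is never considered when it matches.
inline Base chooseBefore(const SlotQuery &Q) {
  if (Q.HasVarSizedObjects)
    return chooseForVarSized(Q);
  if (Q.FPOffset >= 0)
    return Base::FP;
  return Base::Undecided;
}

// Branch order after the commit: a non-negative FP offset always selects FP.
inline Base chooseAfter(const SlotQuery &Q) {
  if (Q.FPOffset >= 0)
    return Base::FP;
  if (Q.HasVarSizedObjects)
    return chooseForVarSized(Q);
  return Base::Undecided;
}

// Assumed scenario: a spill slot 28 bytes above FP in a frame that has both
// variable-sized objects and a base pointer.
inline void checkSpillSlotScenario() {
  const SlotQuery Q{28, true, true, false};
  assert(chooseBefore(Q) == Base::BP); // old order picks the base pointer
  assert(chooseAfter(Q) == Base::FP);  // new order picks the frame pointer
}
```

In this model the two orderings disagree only when the slot sits at a non-negative FP offset in a frame with variable-sized objects, which is the case the commit message describes.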

llvm/test/CodeGen/AArch64/sme-streaming-mode-changing-call-disable-stackslot-scavenging.ll

Lines changed: 3 additions & 5 deletions
@@ -60,19 +60,17 @@ define void @test_no_stackslot_scavenging_with_fp(float %f, i64 %n) #0 "frame-po
 ; CHECK-NEXT: stp x24, x19, [sp, #112] // 16-byte Folded Spill
 ; CHECK-NEXT: addvl sp, sp, #-1
 ; CHECK-NEXT: lsl x9, x0, #3
+; CHECK-NEXT: mov x8, sp
 ; CHECK-NEXT: mov x19, sp
-; CHECK-NEXT: addvl x8, x19, #1
+; CHECK-NEXT: str s0, [x29, #28] // 4-byte Folded Spill
 ; CHECK-NEXT: add x9, x9, #15
-; CHECK-NEXT: str s0, [x8, #92] // 4-byte Folded Spill
-; CHECK-NEXT: mov x8, sp
 ; CHECK-NEXT: and x9, x9, #0xfffffffffffffff0
 ; CHECK-NEXT: sub x8, x8, x9
 ; CHECK-NEXT: mov sp, x8
 ; CHECK-NEXT: //APP
 ; CHECK-NEXT: //NO_APP
 ; CHECK-NEXT: smstop sm
-; CHECK-NEXT: addvl x8, x19, #1
-; CHECK-NEXT: ldr s0, [x8, #92] // 4-byte Folded Reload
+; CHECK-NEXT: ldr s0, [x29, #28] // 4-byte Folded Reload
 ; CHECK-NEXT: bl use_f
 ; CHECK-NEXT: smstart sm
 ; CHECK-NEXT: sub sp, x29, #64
