-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[RISCV] Align stack size down to a multiple of 16 before using cm.push/pop. #86073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…h/pop. This an alternative to llvm#84935 to fix the miscompile, but not be optimal. The immediate for cm.push/pop must be a multiple of 16. For RVE, it might not be. It's not easy to increase the stack size without messing up cfa directives and maybe other things. This patch rounds the stack size down to a multiple of 16 before clamping it to 48. This causes an extra addi to be emitted to handle the remainder. Once this commited, I can commit llvm#84989 to add verification for these instructions being generated with valid offsets.
@llvm/pr-subscribers-backend-risc-v Author: Craig Topper (topperc) ChangesThis an alternative to #84935 to fix the miscompile, but not be optimal. The immediate for cm.push/pop must be a multiple of 16. For RVE, it might not be. It's not easy to increase the stack size without messing up cfa directives and maybe other things. This patch rounds the stack size down to a multiple of 16 before clamping it to 48. This causes an extra addi to be emitted to handle the remainder. Once this commited, I can commit #84989 to add verification for these instructions being generated with valid offsets. Full diff: https://github.com/llvm/llvm-project/pull/86073.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index 8bac41372b5a83..2d56ea5979877e 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -554,8 +554,10 @@ void RISCVFrameLowering::emitPrologue(MachineFunction &MF,
if (RVFI->isPushable(MF) && FirstFrameSetup != MBB.end() &&
FirstFrameSetup->getOpcode() == RISCV::CM_PUSH) {
// Use available stack adjustment in push instruction to allocate additional
- // stack space.
- uint64_t Spimm = std::min(StackSize, (uint64_t)48);
+ // stack space. Align the stack size down to a multiple of 16. This is
+ // needed for RVE.
+ // FIXME: Can we increase the stack size to a multiple of 16 insead?
+ uint64_t Spimm = std::min(alignDown(StackSize, 16), (uint64_t)48);
FirstFrameSetup->getOperand(1).setImm(Spimm);
StackSize -= Spimm;
}
@@ -776,8 +778,10 @@ void RISCVFrameLowering::emitEpilogue(MachineFunction &MF,
if (RVFI->isPushable(MF) && MBBI != MBB.end() &&
MBBI->getOpcode() == RISCV::CM_POP) {
// Use available stack adjustment in pop instruction to deallocate stack
- // space.
- uint64_t Spimm = std::min(StackSize, (uint64_t)48);
+ // space. Align the stack size down to a multiple of 16. This is needed for
+ // RVE.
+ // FIXME: Can we increase the stack size to a multiple of 16 insead?
+ uint64_t Spimm = std::min(alignDown(StackSize, 16), (uint64_t)48);
MBBI->getOperand(1).setImm(Spimm);
StackSize -= Spimm;
}
diff --git a/llvm/test/CodeGen/RISCV/zcmp-additional-stack.ll b/llvm/test/CodeGen/RISCV/zcmp-additional-stack.ll
index e5c2e0180ee0a6..73ace203398501 100644
--- a/llvm/test/CodeGen/RISCV/zcmp-additional-stack.ll
+++ b/llvm/test/CodeGen/RISCV/zcmp-additional-stack.ll
@@ -3,7 +3,8 @@
define ptr @func(ptr %s, i32 %_c, ptr %incdec.ptr, i1 %0, i8 %conv14) #0 {
; RV32-LABEL: func:
; RV32: # %bb.0: # %entry
-; RV32-NEXT: cm.push {ra, s0-s1}, -24
+; RV32-NEXT: cm.push {ra, s0-s1}, -16
+; RV32-NEXT: addi sp, sp, -8
; RV32-NEXT: .cfi_def_cfa_offset 24
; RV32-NEXT: .cfi_offset ra, -12
; RV32-NEXT: .cfi_offset s0, -8
@@ -31,7 +32,8 @@ define ptr @func(ptr %s, i32 %_c, ptr %incdec.ptr, i1 %0, i8 %conv14) #0 {
; RV32-NEXT: lw a0, 4(sp) # 4-byte Folded Reload
; RV32-NEXT: sb a0, 0(s0)
; RV32-NEXT: mv a0, s1
-; RV32-NEXT: cm.popret {ra, s0-s1}, 24
+; RV32-NEXT: addi sp, sp, 8
+; RV32-NEXT: cm.popret {ra, s0-s1}, 16
entry:
br label %while.body
|
Ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This an alternative to #84935 to fix the miscompile, but not be optimal.
The immediate for cm.push/pop must be a multiple of 16. For RVE, it might not be. It's not easy to increase the stack size without messing up cfa directives and maybe other things.
This patch rounds the stack size down to a multiple of 16 before clamping it to 48. This causes an extra addi to be emitted to handle the remainder.
Once this commited, I can commit #84989 to add verification for these instructions being generated with valid offsets.