[RISCV] Copy AVLs whose LiveIntervals aren't extendable in insertVSETVLI #98342
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-risc-v
Author: Luke Lau (lukel97)
Changes
Currently we do a simple non-exhaustive check to see if a LiveInterval is extendable before forwarding an AVL. But we also need to check for this when we're extending the live range via merging the VSETVLIInfos in transferBefore with equally zero AVLs.
Rather than trying to conservatively prevent these cases, this inserts a copy of the AVL instead if we don't know we'll be able to extend it. This is likely to be more robust, and even if the extra copy is undesirable these cases should be rare in practice.
Full diff: https://github.com/llvm/llvm-project/pull/98342.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index 1f7d322be4d2a..b5296d3b3b8a1 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -950,12 +950,6 @@ void RISCVInsertVSETVLI::forwardVSETVLIAVL(VSETVLIInfo &Info) const {
VSETVLIInfo DefInstrInfo = getInfoForVSETVLI(*DefMI);
if (!DefInstrInfo.hasSameVLMAX(Info))
return;
- // If the AVL is a register with multiple definitions, don't forward it. We
- // might not be able to extend its LiveInterval without clobbering other val
- // nums.
- if (DefInstrInfo.hasAVLReg() &&
- !LIS->getInterval(DefInstrInfo.getAVLReg()).containsOneValue())
- return;
Info.setAVL(DefInstrInfo);
}
@@ -1149,15 +1143,26 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addImm(Info.encodeVTYPE());
if (LIS) {
LIS->InsertMachineInstrInMaps(*MI);
- // Normally the AVL's live range will already extend past the inserted
- // vsetvli because the pseudos below will already use the AVL. But this
- // isn't always the case, e.g. PseudoVMV_X_S doesn't have an AVL operand or
- // we've taken the AVL from the VL output of another vsetvli.
LiveInterval &LI = LIS->getInterval(AVLReg);
SlotIndex SI = LIS->getInstructionIndex(*MI).getRegSlot();
- assert((LI.liveAt(SI) && LI.getVNInfoAt(SI) == Info.getAVLVNInfo()) ||
- (!LI.liveAt(SI) && LI.containsOneValue()));
- LIS->extendToIndices(LI, SI);
+ // If the AVL value isn't live at MI, do a quick check to see if it's easily
+ // extendable. Otherwise, we need to copy it.
+ if (LI.getVNInfoBefore(SI) != Info.getAVLVNInfo()) {
+ if (!LI.liveAt(SI) && LI.containsOneValue())
+ LIS->extendToIndices(LI, SI);
+ else {
+ Register AVLCopyReg =
+ MRI->createVirtualRegister(&RISCV::GPRNoX0RegClass);
+ MachineBasicBlock::iterator AVLDef =
+ LIS->getInstructionFromIndex(Info.getAVLVNInfo()->def);
+ auto AVLCopy = BuildMI(*AVLDef->getParent(), std::next(AVLDef), DL,
+ TII->get(RISCV::COPY), AVLCopyReg)
+ .addReg(AVLReg);
+ LIS->InsertMachineInstrInMaps(*AVLCopy);
+ MI->getOperand(1).setReg(AVLCopyReg);
+ LIS->createAndComputeVirtRegInterval(AVLCopyReg);
+ }
+ }
}
}
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
index f93022c9d132d..f1fdb3adccea4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
@@ -1126,12 +1126,13 @@ exit:
ret void
}
-; Check that we don't forward an AVL if we wouldn't be able to extend its
-; LiveInterval without clobbering other val nos.
-define <vscale x 4 x i32> @unforwardable_avl(i64 %n, <vscale x 4 x i32> %v, i1 %cmp) {
-; CHECK-LABEL: unforwardable_avl:
+; Check that if we forward an AVL whose value is clobbered in its LiveInterval
+; we emit a copy instead.
+define <vscale x 4 x i32> @clobbered_forwarded_avl(i64 %n, <vscale x 4 x i32> %v, i1 %cmp) {
+; CHECK-LABEL: clobbered_forwarded_avl:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: vsetvli a2, a0, e32, m2, ta, ma
+; CHECK-NEXT: mv a2, a0
+; CHECK-NEXT: vsetvli zero, a0, e32, m2, ta, ma
; CHECK-NEXT: andi a1, a1, 1
; CHECK-NEXT: .LBB27_1: # %for.body
; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
index 8956ecd2a8bbf..fed0209d28863 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
@@ -134,7 +134,7 @@
ret void
}
- define void @unforwardable_avl() {
+ define void @clobberred_forwarded_avl() {
ret void
}
@@ -995,16 +995,17 @@ body: |
PseudoBR %bb.1
...
---
-name: unforwardable_avl
+name: clobberred_forwarded_avl
tracksRegLiveness: true
body: |
- ; CHECK-LABEL: name: unforwardable_avl
+ ; CHECK-LABEL: name: clobberred_forwarded_avl
; CHECK: bb.0:
; CHECK-NEXT: successors: %bb.1(0x80000000)
; CHECK-NEXT: liveins: $x10, $v8m2
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %avl:gprnox0 = COPY $x10
- ; CHECK-NEXT: %outvl:gprnox0 = PseudoVSETVLI %avl, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gprnox0 = COPY %avl
+ ; CHECK-NEXT: dead %outvl:gprnox0 = PseudoVSETVLI %avl, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1:
; CHECK-NEXT: successors: %bb.2(0x80000000)
@@ -1017,7 +1018,7 @@ body: |
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: dead [[PseudoVSETVLIX0_:%[0-9]+]]:gpr = PseudoVSETVLIX0 killed $x0, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: renamable $v10m2 = PseudoVADD_VV_M2 undef renamable $v10m2, renamable $v8m2, renamable $v8m2, -1, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
- ; CHECK-NEXT: dead $x0 = PseudoVSETVLI %outvl, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+ ; CHECK-NEXT: dead $x0 = PseudoVSETVLI [[COPY]], 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: renamable $v8m2 = PseudoVADD_VV_M2 undef renamable $v8m2, killed renamable $v10m2, renamable $v8m2, $noreg, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: PseudoRET implicit $v8m2
bb.0:
else {
Register AVLCopyReg =
MRI->createVirtualRegister(&RISCV::GPRNoX0RegClass);
MachineBasicBlock::iterator AVLDef =
Do you know how a PHIDef is handled here? Is the slot index simply mapped to the first instruction in the block? Or something else? Either way, we may need special handling for that case here.
For correctness, I'd appreciate if you did a local patch which removes the previous extend check and ensures everything works. The fallback path should be generic.
A PHI def's SlotIndex will be at the entry to the block, so it will insert a copy at the entry, which I believe is what we want.
> I'd appreciate if you did a local patch which removes the previous extend check and ensures everything works
If I'm understanding you correctly, the previous extend check is already removed in this patch.
> A PHI def's SlotIndex will be at the entry to the block, so it will insert a copy at the entry, which I believe is what we want.
Are we guaranteed that the Slot_Block entry maps to a MachineInstr? Getting a MachineInstr can return nullptr, and I'm not clear about the exact circumstances.
>> I'd appreciate if you did a local patch which removes the previous extend check and ensures everything works
> If I'm understanding you correctly, the previous extend check is removed in this patch already
You have:
if (!LI.liveAt(SI) && LI.containsOneValue())
LIS->extendToIndices(LI, SI);
I'm suggesting you run "ninja check" (and other tests of your choice) with:
if (false && !LI.liveAt(SI) && LI.containsOneValue())
LIS->extendToIndices(LI, SI);
i.e. force everything down the COPY path. This will hurt code quality, but should never violate invariants. (This is only a local test for you, to make sure it works.)
> Are we guaranteed that the Slot_Block entry maps to a MachineInstr? Getting a MachineInstr can return nullptr, and I'm not clear about the exact circumstances.
Thanks for catching that, it turns out getInstructionFromIndex does return null for that. I've added explicit handling for this to insert it with the other PHIs. We also didn't have any test cases that hit this with a PHI, so I've added one.
Sorry, I see what you mean about removing the extend check now; I got it confused with the check on line 953. I can confirm that it works and is generic.
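To make the PHI-def case concrete, here is a minimal standalone sketch of how the insertion point for the AVL copy could be chosen. It is a toy model, not the LLVM code: `ToyInstr`, `ToyDef`, and `copyInsertPos` are hypothetical names, and a block is just a vector of instructions flagged as PHI or not. It assumes that a PHI-defined value has no defining instruction (matching the null result reported above), so the copy goes at the first non-PHI position; otherwise it goes right after the defining instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a MachineInstr; it only tracks whether it is a PHI.
struct ToyInstr {
  bool isPHI;
};

// Hypothetical stand-in for a value definition (VNInfo). A PHI def lives at
// the block entry and has no defining instruction, so `index` is ignored.
struct ToyDef {
  bool isPHIDef;
  std::size_t index; // position of the defining instruction in the block
};

// Returns the position in `block` before which the copy should be inserted.
std::size_t copyInsertPos(const std::vector<ToyInstr> &block, ToyDef def) {
  if (def.isPHIDef) {
    // No defining instruction to look up: skip past any leading PHIs.
    std::size_t pos = 0;
    while (pos < block.size() && block[pos].isPHI)
      ++pos;
    return pos;
  }
  // Ordinary def: the copy goes immediately after the defining instruction.
  assert(def.index < block.size() && "def must be inside the block");
  return def.index + 1;
}
```

In this model, a block with no PHIs yields position 0 for a PHI def, so the copy would simply become the first instruction of the block.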
50b0a49 to e285656
… non-phi As an aside, the first non phi should just be the first instruction since we're past phi elimination...
LGTM
The vsetvli here is removed because we're relaxing the only-one-valno constraint for AVL forwarding
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/1803 Here is the relevant piece of the build log for the reference:
Currently before forwarding an AVL we do a simple non-exhaustive check to see if its LiveInterval is extendable. But we also need to check for this when we're extending an AVL's LiveInterval via merging the VSETVLIInfos in transferBefore with equally zero AVLs.
Rather than trying to conservatively prevent these cases, this inserts a copy of the AVL instead if we don't know we'll be able to extend it. This is likely to be more robust, and even if the extra copy is undesirable these cases should be rare in practice.
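The extend-or-copy decision described above can be modelled in a few lines of standalone C++. This is only a sketch under simplifying assumptions: `Segment`, `ToyInterval`, `Action`, and `chooseAction` are hypothetical names, slots are plain integers, and value numbers are small non-negative ints. It mirrors the shape of the check in the patch (is the AVL value already live just before the vsetvli, is the interval trivially extendable, otherwise copy) rather than the real LiveIntervals API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a live range segment: value `valNo` is live in
// the half-open slot range [start, end).
struct Segment {
  int start;
  int end;
  int valNo;
};

// Hypothetical stand-in for a LiveInterval.
struct ToyInterval {
  std::vector<Segment> segs;

  // Value number live immediately before `slot`, or -1 if none.
  int valueBefore(int slot) const {
    for (const Segment &s : segs)
      if (s.start < slot && slot <= s.end)
        return s.valNo;
    return -1;
  }

  bool liveAt(int slot) const {
    for (const Segment &s : segs)
      if (s.start <= slot && slot < s.end)
        return true;
    return false;
  }

  // True if every segment carries the same value number.
  bool containsOneValue() const {
    if (segs.empty())
      return false;
    for (const Segment &s : segs)
      if (s.valNo != segs.front().valNo)
        return false;
    return true;
  }
};

enum class Action { AlreadyLive, Extend, Copy };

// Decide what to do with the AVL register at the slot of the new vsetvli.
Action chooseAction(const ToyInterval &li, int slot, int avlValNo) {
  assert(avlValNo >= 0 && "value numbers are non-negative in this model");
  if (li.valueBefore(slot) == avlValNo)
    return Action::AlreadyLive; // nothing to do
  if (!li.liveAt(slot) && li.containsOneValue())
    return Action::Extend; // cheap case: just extend the live range
  return Action::Copy; // extending could clobber another value: copy instead
}
```

For example, an interval whose first value is overwritten by a second definition before the vsetvli falls into the `Copy` case, which is the situation the renamed `clobbered_forwarded_avl` test exercises.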