Skip to content

[MCP] Handle iterative simplification during forward copy prop #140267

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
99 changes: 35 additions & 64 deletions llvm/lib/CodeGen/MachineCopyPropagation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -870,12 +870,6 @@ void MachineCopyPropagation::forwardUses(MachineInstr &MI) {
++NumCopyForwards;
Changed = true;
}
// Attempt to canonicalize/optimize the instruction now its arguments have
// been mutated.
if (TII->simplifyInstruction(MI)) {
Changed = true;
LLVM_DEBUG(dbgs() << "MCP: After optimizeInstruction: " << MI);
}
}

void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
Expand All @@ -887,10 +881,8 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
std::optional<DestSourcePair> CopyOperands =
isCopyInstr(MI, *TII, UseCopyInstr);
if (CopyOperands) {

Register RegSrc = CopyOperands->Source->getReg();
Register RegDef = CopyOperands->Destination->getReg();

if (!TRI->regsOverlap(RegDef, RegSrc)) {
assert(RegDef.isPhysical() && RegSrc.isPhysical() &&
"MachineCopyPropagation should be run after register allocation!");
Expand All @@ -915,51 +907,6 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
// %ecx = COPY %eax
if (eraseIfRedundant(MI, Def, Src) || eraseIfRedundant(MI, Src, Def))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly OT for this review, but I think the eraseIfRedundant case could be removed. The key thing it does which the main copy propagation doesn't is reason about super-registers updating sub-registers in known ways. We could generalize the main pass, and I think this code would become redundant. I may explore this further, or may not.

continue;

forwardUses(MI);

// Src may have been changed by forwardUses()
CopyOperands = isCopyInstr(MI, *TII, UseCopyInstr);
Src = CopyOperands->Source->getReg().asMCReg();

// If Src is defined by a previous copy, the previous copy cannot be
// eliminated.
ReadRegister(Src, MI, RegularUse);
for (const MachineOperand &MO : MI.implicit_operands()) {
if (!MO.isReg() || !MO.readsReg())
continue;
MCRegister Reg = MO.getReg().asMCReg();
if (!Reg)
continue;
ReadRegister(Reg, MI, RegularUse);
}

LLVM_DEBUG(dbgs() << "MCP: Copy is a deletion candidate: "; MI.dump());

// Copy is now a candidate for deletion.
if (!MRI->isReserved(Def))
MaybeDeadCopies.insert(&MI);

// If 'Def' is previously source of another copy, then this earlier copy's
// source is no longer available. e.g.
// %xmm9 = copy %xmm2
// ...
// %xmm2 = copy %xmm0
// ...
// %xmm2 = copy %xmm9
Tracker.clobberRegister(Def, *TRI, *TII, UseCopyInstr);
for (const MachineOperand &MO : MI.implicit_operands()) {
if (!MO.isReg() || !MO.isDef())
continue;
MCRegister Reg = MO.getReg().asMCReg();
if (!Reg)
continue;
Tracker.clobberRegister(Reg, *TRI, *TII, UseCopyInstr);
}

Tracker.trackCopy(&MI, *TRI, *TII, UseCopyInstr);

continue;
}
}

Expand All @@ -977,20 +924,36 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {

forwardUses(MI);

// It's possible that the previous transformation has resulted in a no-op
// register move (i.e. one where source and destination registers are the
// same and are not referring to a reserved register). If so, delete it.
CopyOperands = isCopyInstr(MI, *TII, UseCopyInstr);
if (CopyOperands &&
CopyOperands->Source->getReg() == CopyOperands->Destination->getReg() &&
!MRI->isReserved(CopyOperands->Source->getReg())) {
MI.eraseFromParent();
NumDeletes++;
// Attempt to canonicalize/optimize the instruction now its arguments have
// been mutated. This may convert MI from a non-copy to a copy instruction.
if (TII->simplifyInstruction(MI)) {
Changed = true;
continue;
LLVM_DEBUG(dbgs() << "MCP: After simplifyInstruction: " << MI);
}

CopyOperands = isCopyInstr(MI, *TII, UseCopyInstr);
if (CopyOperands) {
Register RegSrc = CopyOperands->Source->getReg();
Register RegDef = CopyOperands->Destination->getReg();
// It's possible that the previous transformations have resulted in a
// no-op register move (i.e. one where source and destination registers
// are the same and are not referring to a reserved register). If so,
// delete it.
if (RegSrc == RegDef && !MRI->isReserved(RegSrc)) {
MI.eraseFromParent();
NumDeletes++;
Changed = true;
continue;
}

if (!TRI->regsOverlap(RegDef, RegSrc)) {
// Copy is now a candidate for deletion.
MCRegister Def = RegDef.asMCReg();
if (!MRI->isReserved(Def))
MaybeDeadCopies.insert(&MI);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, off topic, possible follow up. Looking at this code, it really looks like we have a nearly generic DCE lurking here. For a general instruction (not just copies), we could track if a def is used. If not and the instruction is safe to delete, we could do so. Not sure how worthwhile this is, but interesting.

}
}

// Not a copy.
SmallVector<Register, 4> Defs;
const MachineOperand *RegMask = nullptr;
for (const MachineOperand &MO : MI.operands()) {
Expand Down Expand Up @@ -1070,6 +1033,14 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
// Any previous copy definition or reading the Defs is no longer available.
for (MCRegister Reg : Defs)
Tracker.clobberRegister(Reg, *TRI, *TII, UseCopyInstr);

if (CopyOperands) {
Register RegSrc = CopyOperands->Source->getReg();
Register RegDef = CopyOperands->Destination->getReg();
if (!TRI->regsOverlap(RegDef, RegSrc)) {
Tracker.trackCopy(&MI, *TRI, *TII, UseCopyInstr);
}
}
}

bool TracksLiveness = MRI->tracksLiveness();
Expand Down
18 changes: 4 additions & 14 deletions llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
Original file line number Diff line number Diff line change
Expand Up @@ -27,24 +27,14 @@ define void @constant_fold_barrier_i128(ptr %p) {
; RV32-NEXT: lw a5, 12(a0)
; RV32-NEXT: slli a1, a1, 11
; RV32-NEXT: and a2, a2, a1
; RV32-NEXT: and a3, a3, zero
; RV32-NEXT: and a4, a4, zero
; RV32-NEXT: and a5, a5, zero
; RV32-NEXT: add a2, a2, a1
; RV32-NEXT: add a6, a3, zero
; RV32-NEXT: sltu a1, a2, a1
; RV32-NEXT: sltu a3, a3, a3
; RV32-NEXT: add a6, a6, a1
; RV32-NEXT: seqz a7, a6
; RV32-NEXT: mv a6, a1
; RV32-NEXT: seqz a7, a1
; RV32-NEXT: and a1, a7, a1
; RV32-NEXT: add a7, a4, zero
; RV32-NEXT: sltu a4, a4, a4
; RV32-NEXT: or a1, a3, a1
; RV32-NEXT: add a7, a7, a1
; RV32-NEXT: seqz a3, a7
; RV32-NEXT: mv a7, a1
; RV32-NEXT: seqz a3, a1
; RV32-NEXT: and a1, a3, a1
; RV32-NEXT: or a1, a4, a1
; RV32-NEXT: add a1, a5, a1
; RV32-NEXT: sw a2, 0(a0)
; RV32-NEXT: sw a6, 4(a0)
; RV32-NEXT: sw a7, 8(a0)
Expand Down
3 changes: 1 addition & 2 deletions llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
Original file line number Diff line number Diff line change
Expand Up @@ -21,12 +21,11 @@ define i128 @constant_fold_barrier_i128(i128 %x) {
; RV64-LABEL: constant_fold_barrier_i128:
; RV64: # %bb.0: # %entry
; RV64-NEXT: li a2, 1
; RV64-NEXT: and a1, a1, zero
; RV64-NEXT: slli a2, a2, 11
; RV64-NEXT: and a0, a0, a2
; RV64-NEXT: add a0, a0, a2
; RV64-NEXT: sltu a2, a0, a2
; RV64-NEXT: add a1, a1, a2
; RV64-NEXT: mv a1, a2
; RV64-NEXT: ret
entry:
%and = and i128 %x, 2048
Expand Down
29 changes: 7 additions & 22 deletions llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
Original file line number Diff line number Diff line change
Expand Up @@ -92,34 +92,26 @@ define i64 @udiv64_constant_no_add(i64 %a) nounwind {
; RV32-LABEL: udiv64_constant_no_add:
; RV32: # %bb.0:
; RV32-NEXT: lui a2, 838861
; RV32-NEXT: mulhu a3, a0, zero
; RV32-NEXT: addi a4, a2, -819
; RV32-NEXT: addi a2, a2, -820
; RV32-NEXT: mul a5, a1, a4
; RV32-NEXT: mul a6, a0, a2
; RV32-NEXT: mulhu a7, a0, a4
; RV32-NEXT: mul t0, zero, a4
; RV32-NEXT: mul t1, a1, a2
; RV32-NEXT: mulhu t2, a1, a4
; RV32-NEXT: mulhu a0, a0, a2
; RV32-NEXT: mulhu a1, a1, a2
; RV32-NEXT: mul a2, zero, a2
; RV32-NEXT: mulhu a4, zero, a4
; RV32-NEXT: add a5, a5, a6
; RV32-NEXT: add a2, t0, a2
; RV32-NEXT: add t0, t0, t1
; RV32-NEXT: add a1, a4, a1
; RV32-NEXT: mv t0, t1
; RV32-NEXT: sltu a4, a5, a6
; RV32-NEXT: add a5, a5, a7
; RV32-NEXT: sltu a6, t0, t1
; RV32-NEXT: sltiu t1, t0, 0
; RV32-NEXT: sltu a6, t1, t1
; RV32-NEXT: sltiu t1, t1, 0
; RV32-NEXT: add t0, t0, t2
; RV32-NEXT: add a1, a2, a1
; RV32-NEXT: sltu a2, a5, a7
; RV32-NEXT: add a6, a6, t1
; RV32-NEXT: sltu a5, t0, t2
; RV32-NEXT: add t0, t0, a0
; RV32-NEXT: add a1, a1, a3
; RV32-NEXT: add a2, a4, a2
; RV32-NEXT: add a5, a6, a5
; RV32-NEXT: sltu a0, t0, a0
Expand Down Expand Up @@ -156,34 +148,27 @@ define i64 @udiv64_constant_add(i64 %a) nounwind {
; RV32: # %bb.0:
; RV32-NEXT: lui a2, 599186
; RV32-NEXT: lui a3, 149797
; RV32-NEXT: mulhu a4, a0, zero
; RV32-NEXT: addi a2, a2, 1171
; RV32-NEXT: addi a3, a3, -1756
; RV32-NEXT: mul a5, a1, a2
; RV32-NEXT: mul a6, a0, a3
; RV32-NEXT: mulhu a7, a0, a2
; RV32-NEXT: mul t0, zero, a2
; RV32-NEXT: mulhu t1, zero, a2
; RV32-NEXT: mulhu t2, a1, a3
; RV32-NEXT: add t1, t1, t2
; RV32-NEXT: mul t2, zero, a3
; RV32-NEXT: add t2, t0, t2
; RV32-NEXT: add t1, t2, t1
; RV32-NEXT: mv t1, t2
; RV32-NEXT: mul t2, a1, a3
; RV32-NEXT: mulhu a2, a1, a2
; RV32-NEXT: mulhu a3, a0, a3
; RV32-NEXT: add a5, a5, a6
; RV32-NEXT: add t0, t0, t2
; RV32-NEXT: mv t0, t2
; RV32-NEXT: sltu a6, a5, a6
; RV32-NEXT: add a5, a5, a7
; RV32-NEXT: sltu t2, t0, t2
; RV32-NEXT: sltu t2, t2, t2
; RV32-NEXT: sltu a5, a5, a7
; RV32-NEXT: sltiu a7, t0, 0
; RV32-NEXT: add t0, t0, a2
; RV32-NEXT: add a7, t2, a7
; RV32-NEXT: sltu a2, t0, a2
; RV32-NEXT: add t0, t0, a3
; RV32-NEXT: add a4, t1, a4
; RV32-NEXT: add a5, a6, a5
; RV32-NEXT: add a2, a7, a2
; RV32-NEXT: sltu a3, t0, a3
Expand All @@ -195,7 +180,7 @@ define i64 @udiv64_constant_add(i64 %a) nounwind {
; RV32-NEXT: add a2, a2, a3
; RV32-NEXT: sub a1, a1, a0
; RV32-NEXT: srli a5, a5, 1
; RV32-NEXT: add a2, a4, a2
; RV32-NEXT: add a2, t1, a2
; RV32-NEXT: sub a1, a1, a2
; RV32-NEXT: slli a0, a1, 31
; RV32-NEXT: srli a1, a1, 1
Expand Down
4 changes: 2 additions & 2 deletions llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
Original file line number Diff line number Diff line change
Expand Up @@ -385,7 +385,7 @@ define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
; RV32ZBB: # %bb.0:
; RV32ZBB-NEXT: j .LBB6_2
; RV32ZBB-NEXT: # %bb.1:
; RV32ZBB-NEXT: sltiu a0, zero, 0
; RV32ZBB-NEXT: li a0, 0
; RV32ZBB-NEXT: ret
; RV32ZBB-NEXT: .LBB6_2:
; RV32ZBB-NEXT: cpop a0, a0
Expand Down Expand Up @@ -415,7 +415,7 @@ define i1 @ctpop_i64_ugt_one(i64 %a) nounwind {
; RV32ZBB: # %bb.0:
; RV32ZBB-NEXT: j .LBB7_2
; RV32ZBB-NEXT: # %bb.1:
; RV32ZBB-NEXT: snez a0, zero
; RV32ZBB-NEXT: li a0, 0
; RV32ZBB-NEXT: ret
; RV32ZBB-NEXT: .LBB7_2:
; RV32ZBB-NEXT: cpop a0, a0
Expand Down
12 changes: 12 additions & 0 deletions llvm/test/CodeGen/RISCV/machine-copyprop-simplifyinstruction.mir
Original file line number Diff line number Diff line change
Expand Up @@ -742,3 +742,15 @@ body: |
renamable $x10 = MAXU renamable $x11, renamable $x11
PseudoRET implicit $x10
...
---
name: multipass
body: |
bb.0:
; CHECK-LABEL: name: multipass
; CHECK: renamable $x9 = ADDI $x0, 0
; CHECK-NEXT: PseudoRET implicit $x9
renamable $x11 = COPY $x0
renamable $x10 = SLLI renamable $x11, 13
renamable $x9 = SRLI renamable $x10, 13
PseudoRET implicit $x9
...