[MCP] Handle iterative simplification during forward copy prop #140267


Open

preames wants to merge 1 commit into main

Conversation

@preames (Collaborator) commented May 16, 2025

This is the follow-up I mentioned doing in the review of 52b345d. That change introduced an API for performing instruction simplifications following copy propagation (e.g. recognizing that ORI a0, a1, zero is just a move). As noted in that review, we should be able to perform iterative simplification as we move forward through the block, but weren't doing so because of the code structure.

The majority of this code is just deleting the special casing for constant source and destination tracking, and merging the copy handling with the main path. By assumption, the properties of copies (in terms of register reads and writes) must be a subset of those of general instructions.

Once we do that, the iterative bit basically falls out of having the tracking performed for copies that are recognized *after* we forward prior uses.
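
As an illustration, the multipass test added in this patch exercises exactly this case. A sketch (RISC-V MIR from that test, annotated with my reading of how the pass now walks it):

    renamable $x11 = COPY $x0
    renamable $x10 = SLLI renamable $x11, 13   ; forward $x0, then simplify: result is a copy of $x0
    renamable $x9 = SRLI renamable $x10, 13    ; the new copy is tracked, so this forwards and simplifies too
    PseudoRET implicit $x9

Because each simplified instruction is recognized and tracked as a copy before we move on, the whole chain collapses in a single forward walk, leaving just renamable $x9 = ADDI $x0, 0 before the return (per the CHECK lines in the test).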

@llvmbot (Member) commented May 16, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-llvm-regalloc

Author: Philip Reames (preames)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/140267.diff

6 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineCopyPropagation.cpp (+35-64)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll (+4-14)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll (+1-2)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll (+7-22)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/machine-copyprop-simplifyinstruction.mir (+12)
diff --git a/llvm/lib/CodeGen/MachineCopyPropagation.cpp b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
index 6eab87c1292e0..9cd07f1274799 100644
--- a/llvm/lib/CodeGen/MachineCopyPropagation.cpp
+++ b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
@@ -870,12 +870,6 @@ void MachineCopyPropagation::forwardUses(MachineInstr &MI) {
     ++NumCopyForwards;
     Changed = true;
   }
-  // Attempt to canonicalize/optimize the instruction now its arguments have
-  // been mutated.
-  if (TII->simplifyInstruction(MI)) {
-    Changed = true;
-    LLVM_DEBUG(dbgs() << "MCP: After optimizeInstruction: " << MI);
-  }
 }
 
 void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
@@ -887,10 +881,8 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
     std::optional<DestSourcePair> CopyOperands =
         isCopyInstr(MI, *TII, UseCopyInstr);
     if (CopyOperands) {
-
       Register RegSrc = CopyOperands->Source->getReg();
       Register RegDef = CopyOperands->Destination->getReg();
-
       if (!TRI->regsOverlap(RegDef, RegSrc)) {
         assert(RegDef.isPhysical() && RegSrc.isPhysical() &&
               "MachineCopyPropagation should be run after register allocation!");
@@ -915,51 +907,6 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
         //  %ecx = COPY %eax
         if (eraseIfRedundant(MI, Def, Src) || eraseIfRedundant(MI, Src, Def))
           continue;
-
-        forwardUses(MI);
-
-        // Src may have been changed by forwardUses()
-        CopyOperands = isCopyInstr(MI, *TII, UseCopyInstr);
-        Src = CopyOperands->Source->getReg().asMCReg();
-
-        // If Src is defined by a previous copy, the previous copy cannot be
-        // eliminated.
-        ReadRegister(Src, MI, RegularUse);
-        for (const MachineOperand &MO : MI.implicit_operands()) {
-          if (!MO.isReg() || !MO.readsReg())
-            continue;
-          MCRegister Reg = MO.getReg().asMCReg();
-          if (!Reg)
-            continue;
-          ReadRegister(Reg, MI, RegularUse);
-        }
-
-        LLVM_DEBUG(dbgs() << "MCP: Copy is a deletion candidate: "; MI.dump());
-
-        // Copy is now a candidate for deletion.
-        if (!MRI->isReserved(Def))
-          MaybeDeadCopies.insert(&MI);
-
-        // If 'Def' is previously source of another copy, then this earlier copy's
-        // source is no longer available. e.g.
-        // %xmm9 = copy %xmm2
-        // ...
-        // %xmm2 = copy %xmm0
-        // ...
-        // %xmm2 = copy %xmm9
-        Tracker.clobberRegister(Def, *TRI, *TII, UseCopyInstr);
-        for (const MachineOperand &MO : MI.implicit_operands()) {
-          if (!MO.isReg() || !MO.isDef())
-            continue;
-          MCRegister Reg = MO.getReg().asMCReg();
-          if (!Reg)
-            continue;
-          Tracker.clobberRegister(Reg, *TRI, *TII, UseCopyInstr);
-        }
-
-        Tracker.trackCopy(&MI, *TRI, *TII, UseCopyInstr);
-
-        continue;
       }
     }
 
@@ -977,20 +924,36 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
 
     forwardUses(MI);
 
-    // It's possible that the previous transformation has resulted in a no-op
-    // register move (i.e. one where source and destination registers are the
-    // same and are not referring to a reserved register). If so, delete it.
-    CopyOperands = isCopyInstr(MI, *TII, UseCopyInstr);
-    if (CopyOperands &&
-        CopyOperands->Source->getReg() == CopyOperands->Destination->getReg() &&
-        !MRI->isReserved(CopyOperands->Source->getReg())) {
-      MI.eraseFromParent();
-      NumDeletes++;
+    // Attempt to canonicalize/optimize the instruction now its arguments have
+    // been mutated.  This may convert MI from a non-copy to a copy instruction.
+    if (TII->simplifyInstruction(MI)) {
       Changed = true;
-      continue;
+      LLVM_DEBUG(dbgs() << "MCP: After simplifyInstruction: " << MI);
+    }
+
+    CopyOperands = isCopyInstr(MI, *TII, UseCopyInstr);
+    if (CopyOperands) {
+      Register RegSrc = CopyOperands->Source->getReg();
+      Register RegDef = CopyOperands->Destination->getReg();
+      // It's possible that the previous transformations have resulted in a
+      // no-op register move (i.e. one where source and destination registers
+      // are the same and are not referring to a reserved register). If so,
+      // delete it.
+      if (RegSrc == RegDef && !MRI->isReserved(RegSrc)) {
+        MI.eraseFromParent();
+        NumDeletes++;
+        Changed = true;
+        continue;
+      }
+
+      if (!TRI->regsOverlap(RegDef, RegSrc)) {
+        // Copy is now a candidate for deletion.
+        MCRegister Def = RegDef.asMCReg();
+        if (!MRI->isReserved(Def))
+          MaybeDeadCopies.insert(&MI);
+      }
     }
 
-    // Not a copy.
     SmallVector<Register, 4> Defs;
     const MachineOperand *RegMask = nullptr;
     for (const MachineOperand &MO : MI.operands()) {
@@ -1070,6 +1033,14 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
     // Any previous copy definition or reading the Defs is no longer available.
     for (MCRegister Reg : Defs)
       Tracker.clobberRegister(Reg, *TRI, *TII, UseCopyInstr);
+
+    if (CopyOperands) {
+      Register RegSrc = CopyOperands->Source->getReg();
+      Register RegDef = CopyOperands->Destination->getReg();
+      if (!TRI->regsOverlap(RegDef, RegSrc)) {
+        Tracker.trackCopy(&MI, *TRI, *TII, UseCopyInstr);
+      }
+    }
   }
 
   bool TracksLiveness = MRI->tracksLiveness();
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
index 309ebf71127c4..b24ea9ec1561e 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
@@ -27,24 +27,14 @@ define void @constant_fold_barrier_i128(ptr %p) {
 ; RV32-NEXT:    lw a5, 12(a0)
 ; RV32-NEXT:    slli a1, a1, 11
 ; RV32-NEXT:    and a2, a2, a1
-; RV32-NEXT:    and a3, a3, zero
-; RV32-NEXT:    and a4, a4, zero
-; RV32-NEXT:    and a5, a5, zero
 ; RV32-NEXT:    add a2, a2, a1
-; RV32-NEXT:    add a6, a3, zero
 ; RV32-NEXT:    sltu a1, a2, a1
-; RV32-NEXT:    sltu a3, a3, a3
-; RV32-NEXT:    add a6, a6, a1
-; RV32-NEXT:    seqz a7, a6
+; RV32-NEXT:    mv a6, a1
+; RV32-NEXT:    seqz a7, a1
 ; RV32-NEXT:    and a1, a7, a1
-; RV32-NEXT:    add a7, a4, zero
-; RV32-NEXT:    sltu a4, a4, a4
-; RV32-NEXT:    or a1, a3, a1
-; RV32-NEXT:    add a7, a7, a1
-; RV32-NEXT:    seqz a3, a7
+; RV32-NEXT:    mv a7, a1
+; RV32-NEXT:    seqz a3, a1
 ; RV32-NEXT:    and a1, a3, a1
-; RV32-NEXT:    or a1, a4, a1
-; RV32-NEXT:    add a1, a5, a1
 ; RV32-NEXT:    sw a2, 0(a0)
 ; RV32-NEXT:    sw a6, 4(a0)
 ; RV32-NEXT:    sw a7, 8(a0)
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
index 8e4c0376ad0a5..ec6b6568efa22 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
@@ -21,12 +21,11 @@ define i128 @constant_fold_barrier_i128(i128 %x) {
 ; RV64-LABEL: constant_fold_barrier_i128:
 ; RV64:       # %bb.0: # %entry
 ; RV64-NEXT:    li a2, 1
-; RV64-NEXT:    and a1, a1, zero
 ; RV64-NEXT:    slli a2, a2, 11
 ; RV64-NEXT:    and a0, a0, a2
 ; RV64-NEXT:    add a0, a0, a2
 ; RV64-NEXT:    sltu a2, a0, a2
-; RV64-NEXT:    add a1, a1, a2
+; RV64-NEXT:    mv a1, a2
 ; RV64-NEXT:    ret
 entry:
   %and = and i128 %x, 2048
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll b/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
index 9c46e6792e8d8..1028983f960bf 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
@@ -92,34 +92,26 @@ define i64 @udiv64_constant_no_add(i64 %a) nounwind {
 ; RV32-LABEL: udiv64_constant_no_add:
 ; RV32:       # %bb.0:
 ; RV32-NEXT:    lui a2, 838861
-; RV32-NEXT:    mulhu a3, a0, zero
 ; RV32-NEXT:    addi a4, a2, -819
 ; RV32-NEXT:    addi a2, a2, -820
 ; RV32-NEXT:    mul a5, a1, a4
 ; RV32-NEXT:    mul a6, a0, a2
 ; RV32-NEXT:    mulhu a7, a0, a4
-; RV32-NEXT:    mul t0, zero, a4
 ; RV32-NEXT:    mul t1, a1, a2
 ; RV32-NEXT:    mulhu t2, a1, a4
 ; RV32-NEXT:    mulhu a0, a0, a2
 ; RV32-NEXT:    mulhu a1, a1, a2
-; RV32-NEXT:    mul a2, zero, a2
-; RV32-NEXT:    mulhu a4, zero, a4
 ; RV32-NEXT:    add a5, a5, a6
-; RV32-NEXT:    add a2, t0, a2
-; RV32-NEXT:    add t0, t0, t1
-; RV32-NEXT:    add a1, a4, a1
+; RV32-NEXT:    mv t0, t1
 ; RV32-NEXT:    sltu a4, a5, a6
 ; RV32-NEXT:    add a5, a5, a7
-; RV32-NEXT:    sltu a6, t0, t1
-; RV32-NEXT:    sltiu t1, t0, 0
+; RV32-NEXT:    sltu a6, t1, t1
+; RV32-NEXT:    sltiu t1, t1, 0
 ; RV32-NEXT:    add t0, t0, t2
-; RV32-NEXT:    add a1, a2, a1
 ; RV32-NEXT:    sltu a2, a5, a7
 ; RV32-NEXT:    add a6, a6, t1
 ; RV32-NEXT:    sltu a5, t0, t2
 ; RV32-NEXT:    add t0, t0, a0
-; RV32-NEXT:    add a1, a1, a3
 ; RV32-NEXT:    add a2, a4, a2
 ; RV32-NEXT:    add a5, a6, a5
 ; RV32-NEXT:    sltu a0, t0, a0
@@ -156,34 +148,27 @@ define i64 @udiv64_constant_add(i64 %a) nounwind {
 ; RV32:       # %bb.0:
 ; RV32-NEXT:    lui a2, 599186
 ; RV32-NEXT:    lui a3, 149797
-; RV32-NEXT:    mulhu a4, a0, zero
 ; RV32-NEXT:    addi a2, a2, 1171
 ; RV32-NEXT:    addi a3, a3, -1756
 ; RV32-NEXT:    mul a5, a1, a2
 ; RV32-NEXT:    mul a6, a0, a3
 ; RV32-NEXT:    mulhu a7, a0, a2
-; RV32-NEXT:    mul t0, zero, a2
-; RV32-NEXT:    mulhu t1, zero, a2
 ; RV32-NEXT:    mulhu t2, a1, a3
-; RV32-NEXT:    add t1, t1, t2
-; RV32-NEXT:    mul t2, zero, a3
-; RV32-NEXT:    add t2, t0, t2
-; RV32-NEXT:    add t1, t2, t1
+; RV32-NEXT:    mv t1, t2
 ; RV32-NEXT:    mul t2, a1, a3
 ; RV32-NEXT:    mulhu a2, a1, a2
 ; RV32-NEXT:    mulhu a3, a0, a3
 ; RV32-NEXT:    add a5, a5, a6
-; RV32-NEXT:    add t0, t0, t2
+; RV32-NEXT:    mv t0, t2
 ; RV32-NEXT:    sltu a6, a5, a6
 ; RV32-NEXT:    add a5, a5, a7
-; RV32-NEXT:    sltu t2, t0, t2
+; RV32-NEXT:    sltu t2, t2, t2
 ; RV32-NEXT:    sltu a5, a5, a7
 ; RV32-NEXT:    sltiu a7, t0, 0
 ; RV32-NEXT:    add t0, t0, a2
 ; RV32-NEXT:    add a7, t2, a7
 ; RV32-NEXT:    sltu a2, t0, a2
 ; RV32-NEXT:    add t0, t0, a3
-; RV32-NEXT:    add a4, t1, a4
 ; RV32-NEXT:    add a5, a6, a5
 ; RV32-NEXT:    add a2, a7, a2
 ; RV32-NEXT:    sltu a3, t0, a3
@@ -195,7 +180,7 @@ define i64 @udiv64_constant_add(i64 %a) nounwind {
 ; RV32-NEXT:    add a2, a2, a3
 ; RV32-NEXT:    sub a1, a1, a0
 ; RV32-NEXT:    srli a5, a5, 1
-; RV32-NEXT:    add a2, a4, a2
+; RV32-NEXT:    add a2, t1, a2
 ; RV32-NEXT:    sub a1, a1, a2
 ; RV32-NEXT:    slli a0, a1, 31
 ; RV32-NEXT:    srli a1, a1, 1
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
index 0632caecf8907..0b376dd779887 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
@@ -385,7 +385,7 @@ define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
 ; RV32ZBB:       # %bb.0:
 ; RV32ZBB-NEXT:    j .LBB6_2
 ; RV32ZBB-NEXT:  # %bb.1:
-; RV32ZBB-NEXT:    sltiu a0, zero, 0
+; RV32ZBB-NEXT:    li a0, 0
 ; RV32ZBB-NEXT:    ret
 ; RV32ZBB-NEXT:  .LBB6_2:
 ; RV32ZBB-NEXT:    cpop a0, a0
@@ -415,7 +415,7 @@ define i1 @ctpop_i64_ugt_one(i64 %a) nounwind {
 ; RV32ZBB:       # %bb.0:
 ; RV32ZBB-NEXT:    j .LBB7_2
 ; RV32ZBB-NEXT:  # %bb.1:
-; RV32ZBB-NEXT:    snez a0, zero
+; RV32ZBB-NEXT:    li a0, 0
 ; RV32ZBB-NEXT:    ret
 ; RV32ZBB-NEXT:  .LBB7_2:
 ; RV32ZBB-NEXT:    cpop a0, a0
diff --git a/llvm/test/CodeGen/RISCV/machine-copyprop-simplifyinstruction.mir b/llvm/test/CodeGen/RISCV/machine-copyprop-simplifyinstruction.mir
index 15a6d53f343c1..ad17a367316dd 100644
--- a/llvm/test/CodeGen/RISCV/machine-copyprop-simplifyinstruction.mir
+++ b/llvm/test/CodeGen/RISCV/machine-copyprop-simplifyinstruction.mir
@@ -742,3 +742,15 @@ body: |
     renamable $x10 = MAXU renamable $x11, renamable $x11
     PseudoRET implicit $x10
 ...
+---
+name: multipass
+body: |
+  bb.0:
+    ; CHECK-LABEL: name: multipass
+    ; CHECK: renamable $x9 = ADDI $x0, 0
+    ; CHECK-NEXT: PseudoRET implicit $x9
+    renamable $x11 = COPY $x0
+    renamable $x10 = SLLI renamable $x11, 13
+    renamable $x9 = SRLI renamable $x10, 13
+    PseudoRET implicit $x9
+...

@@ -915,51 +907,6 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
// %ecx = COPY %eax
if (eraseIfRedundant(MI, Def, Src) || eraseIfRedundant(MI, Src, Def))
@preames (Collaborator, Author)

Slightly OT for this review, but I think the eraseIfRedundant case could be removed. The key thing it does that the main copy propagation doesn't is reasoning about super-registers updating sub-registers in known ways. We could generalize the main pass, and I think this code would then become redundant. I may explore this further, or may not.
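
A rough sketch of the kind of case this seems to refer to (hypothetical x86 registers, in the style of the %ecx/%eax comments already in this file):

    %rax = COPY %rbx
    ...                  <- neither rax nor rbx clobbered
    %eax = COPY %ebx     <- erasable: already implied by the wider copy above

The second copy is a nop only because the earlier copy of the super-registers also copied the corresponding sub-registers, which is the sub/super-register reasoning eraseIfRedundant provides and the main propagation path currently does not.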

// Copy is now a candidate for deletion.
MCRegister Def = RegDef.asMCReg();
if (!MRI->isReserved(Def))
MaybeDeadCopies.insert(&MI);
@preames (Collaborator, Author)

Again off topic, a possible follow-up. Looking at this code, it really looks like we have a nearly generic DCE lurking here. For a general instruction (not just copies), we could track whether a def is used; if not, and the instruction is safe to delete, we could do so. Not sure how worthwhile this is, but interesting.
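
A minimal sketch of the idea (hypothetical RISC-V MIR): if a def is never read before being clobbered and the instruction has no other side effects, it could be deleted just like a dead copy.

    renamable $x10 = ADDI $x0, 42          ; $x10 is overwritten below without ever being read
    renamable $x10 = COPY renamable $x11
    PseudoRET implicit $x10

Today only the copy bookkeeping (MaybeDeadCopies) notices this pattern, so the dead ADDI survives; the generalized def-use tracking described above could remove it as well.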

@preames (Collaborator, Author) commented May 23, 2025

ping

@lquinn2015 (Contributor) left a comment

I think this is a good change; let's let others take a peek. I haven't had time to look at LLVM much lately, but it seems like you are capturing the semantics you're describing correctly.
