Skip to content

[CodeGen] commuteInstruction should update implicit-def #131361

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

sdesmalen-arm
Copy link
Collaborator

When the RegisterCoalescer adds an implicit-def when coalescing
a SUBREG_TO_REG (#123632), this causes issues when removing other
COPY nodes by commuting the instruction because it doesn't take
the implicit-def into consideration. This PR fixes that.

@llvmbot
Copy link
Member

llvmbot commented Mar 14, 2025

@llvm/pr-subscribers-llvm-regalloc

@llvm/pr-subscribers-backend-x86

Author: Sander de Smalen (sdesmalen-arm)

Changes

When the RegisterCoalescer adds an implicit-def when coalescing
a SUBREG_TO_REG (#123632), this causes issues when removing other
COPY nodes by commuting the instruction because it doesn't take
the implicit-def into consideration. This PR fixes that.


Full diff: https://github.com/llvm/llvm-project/pull/131361.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/TargetInstrInfo.cpp (+21)
  • (added) llvm/test/CodeGen/X86/coalesce-commutative-implicit-def.mir (+37)
diff --git a/llvm/lib/CodeGen/TargetInstrInfo.cpp b/llvm/lib/CodeGen/TargetInstrInfo.cpp
index 7a905b65f26e5..5a1cd5afa7ef6 100644
--- a/llvm/lib/CodeGen/TargetInstrInfo.cpp
+++ b/llvm/lib/CodeGen/TargetInstrInfo.cpp
@@ -200,6 +200,7 @@ MachineInstr *TargetInstrInfo::commuteInstructionImpl(MachineInstr &MI,
       Reg1.isPhysical() ? MI.getOperand(Idx1).isRenamable() : false;
   bool Reg2IsRenamable =
       Reg2.isPhysical() ? MI.getOperand(Idx2).isRenamable() : false;
+
   // If destination is tied to either of the commuted source register, then
   // it must be updated.
   if (HasDef && Reg0 == Reg1 &&
@@ -214,6 +215,24 @@ MachineInstr *TargetInstrInfo::commuteInstructionImpl(MachineInstr &MI,
     SubReg0 = SubReg1;
   }
 
+  // For a case like this:
+  //   %0.sub = INST %0.sub(tied), %1.sub, implicit-def %0
+  // we need to update the implicit-def after commuting to result in:
+  //   %1.sub = INST %1.sub(tied), %0.sub, implicit-def %1
+  SmallVector<unsigned> UpdateImplicitDefIdx;
+  if (HasDef && MI.hasImplicitDef() && MI.getOperand(0).getReg() != Reg0) {
+    const TargetRegisterInfo *TRI =
+        MI.getMF()->getSubtarget().getRegisterInfo();
+    Register OrigReg0 = MI.getOperand(0).getReg();
+    for (auto [OpNo, MO] : llvm::enumerate(MI.implicit_operands())) {
+      Register ImplReg = MO.getReg();
+      if ((ImplReg.isVirtual() && ImplReg == OrigReg0) ||
+          (ImplReg.isPhysical() && OrigReg0.isPhysical() &&
+           TRI->isSubRegisterEq(ImplReg, OrigReg0)))
+        UpdateImplicitDefIdx.push_back(OpNo + MI.getNumExplicitOperands());
+    }
+  }
+
   MachineInstr *CommutedMI = nullptr;
   if (NewMI) {
     // Create a new instruction.
@@ -226,6 +245,8 @@ MachineInstr *TargetInstrInfo::commuteInstructionImpl(MachineInstr &MI,
   if (HasDef) {
     CommutedMI->getOperand(0).setReg(Reg0);
     CommutedMI->getOperand(0).setSubReg(SubReg0);
+    for (unsigned Idx : UpdateImplicitDefIdx)
+      CommutedMI->getOperand(Idx).setReg(Reg0);
   }
   CommutedMI->getOperand(Idx2).setReg(Reg1);
   CommutedMI->getOperand(Idx1).setReg(Reg2);
diff --git a/llvm/test/CodeGen/X86/coalesce-commutative-implicit-def.mir b/llvm/test/CodeGen/X86/coalesce-commutative-implicit-def.mir
new file mode 100644
index 0000000000000..fe1235fe94f85
--- /dev/null
+++ b/llvm/test/CodeGen/X86/coalesce-commutative-implicit-def.mir
@@ -0,0 +1,37 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=x86_64 -run-pass=register-coalescer -o - %s | FileCheck %s
+
+# When the coalescer removes the COPY by commuting the operands of the AND, it should also update the `implicit-def` of the destination register.
+---
+name: implicit_def_dst
+tracksRegLiveness: true
+body: |
+  bb.0:
+    ; CHECK-LABEL: name: implicit_def_dst
+    ; CHECK: [[MOV64rm:%[0-9]+]]:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    ; CHECK-NEXT: [[MOV64rm1:%[0-9]+]]:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    ; CHECK-NEXT: [[MOV64rm:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = AND32rr [[MOV64rm]].sub_32bit, [[MOV64rm1]].sub_32bit, implicit-def dead $eflags, implicit-def [[MOV64rm]]
+    ; CHECK-NEXT: RET 0, implicit [[MOV64rm]]
+    %0:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    %1:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    %1.sub_32bit:gr64_with_sub_8bit = AND32rr %1.sub_32bit:gr64_with_sub_8bit, %0.sub_32bit:gr64_with_sub_8bit, implicit-def dead $eflags, implicit-def %1:gr64_with_sub_8bit
+    %0:gr64_with_sub_8bit = COPY %1:gr64_with_sub_8bit
+    RET 0, implicit %0
+...
+# In case the MIR for some reason contains more than one implicit-def of the destination reg, then all should be updated.
+---
+name: two_implicit_defs_dst
+tracksRegLiveness: true
+body: |
+  bb.0:
+    ; CHECK-LABEL: name: two_implicit_defs_dst
+    ; CHECK: [[MOV64rm:%[0-9]+]]:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    ; CHECK-NEXT: [[MOV64rm1:%[0-9]+]]:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    ; CHECK-NEXT: [[MOV64rm:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = AND32rr [[MOV64rm]].sub_32bit, [[MOV64rm1]].sub_32bit, implicit-def dead $eflags, implicit-def [[MOV64rm]], implicit-def [[MOV64rm]]
+    ; CHECK-NEXT: RET 0, implicit [[MOV64rm]]
+    %0:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    %1:gr64_with_sub_8bit = MOV64rm $noreg, 1, $noreg, 0, $noreg :: (load (s64) from `ptr null`)
+    %1.sub_32bit:gr64_with_sub_8bit = AND32rr %1.sub_32bit:gr64_with_sub_8bit, %0.sub_32bit:gr64_with_sub_8bit, implicit-def dead $eflags, implicit-def %1:gr64_with_sub_8bit, implicit-def %1:gr64_with_sub_8bit
+    %0:gr64_with_sub_8bit = COPY %1:gr64_with_sub_8bit
+    RET 0, implicit %0
+...

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should get rid of subreg_to_reg. This hack is just going to spread everywhere

for (auto [OpNo, MO] : llvm::enumerate(MI.implicit_operands())) {
Register ImplReg = MO.getReg();
if ((ImplReg.isVirtual() && ImplReg == OrigReg0) ||
(ImplReg.isPhysical() && OrigReg0.isPhysical() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The physical register case isn't tested here. However I think it would be best to just ignore it, and just guard against needing to handle it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was actually a SystemZ test that showed a minor regression when I did a return nullptr for this case.

// we need to update the implicit-def after commuting to result in:
// %1.sub = INST %1.sub(tied), %0.sub, implicit-def %1
SmallVector<unsigned> UpdateImplicitDefIdx;
if (HasDef && MI.hasImplicitDef() && MI.getOperand(0).getReg() != Reg0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use MachineInstr::substituteRegister?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that's a lot simpler, thanks for the suggestion!

When the RegisterCoalescer adds an implicit-def when coalescing
a SUBREG_TO_REG (llvm#123632), this causes issues when removing other
COPY nodes by commuting the instruction because it doesn't take
the implicit-def into consideration. This PR fixes that.
Copy link
Collaborator Author

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should get rid of subreg_to_reg. This hack is just going to spread everywhere.

What would be the alternative for subreg-to-reg to describe zeroing of the other lanes in the destination register? To me the hack is more that we want to describe this using a COPY with an implicit-def, especially when the implicit-def is a virtual register. I wonder if we would benefit from having something like:

zero %0.sub32:gpr64 = INST ...

to specify that the other lanes in the register are zeroed (similar to how undef can be used here)

for (auto [OpNo, MO] : llvm::enumerate(MI.implicit_operands())) {
Register ImplReg = MO.getReg();
if ((ImplReg.isVirtual() && ImplReg == OrigReg0) ||
(ImplReg.isPhysical() && OrigReg0.isPhysical() &&
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was actually a SystemZ test that showed a minor regression when I did a return nullptr for this case.

// we need to update the implicit-def after commuting to result in:
// %1.sub = INST %1.sub(tied), %0.sub, implicit-def %1
SmallVector<unsigned> UpdateImplicitDefIdx;
if (HasDef && MI.hasImplicitDef() && MI.getOperand(0).getReg() != Reg0) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that's a lot simpler, thanks for the suggestion!

@sdesmalen-arm sdesmalen-arm force-pushed the commute-implicit-def branch from c98ea57 to e8be9bb Compare April 2, 2025 15:44
Copy link

github-actions bot commented Apr 2, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@sdesmalen-arm
Copy link
Collaborator Author

@arsenm are you happy with the latest change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants