Skip to content

X86: Remove hack in shouldRewriteCopySrc for subregister handling #125224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 5, 2025

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Jan 31, 2025

In the problematic situation fixed in 61e556d,
shouldRewriteCopySrc is called with identical register class arguments,
but one has a subregister index. This was very surprising to me,
and it probably shouldn't be valid for it to occur. It happens in cases
with uncoalescable copies where the register class changes, and further
up the chain there is a subregister operand. We could possibly just
skip over uncoalsecable instructions in the chain rather than letting
this query deal with it (or pre-filter the obvious subreg with same
class case).

The generic implementation is supposed to account for checking for
valid subregisters by checking getMatchingSuperRegClass already,
but that was bypassed by the early exit for exact class match.

Also adds a reduced mir test demonstrating the exact problematic
case.

Copy link
Contributor Author

arsenm commented Jan 31, 2025

@arsenm arsenm marked this pull request as ready for review January 31, 2025 13:55
@llvmbot
Copy link
Member

llvmbot commented Jan 31, 2025

@llvm/pr-subscribers-backend-x86

Author: Matt Arsenault (arsenm)

Changes

In the problematic situation fixed in 61e556d,
shouldRewriteCopySrc is called with identical register class arguments,
but one has a subregister index. This was very surprising to me,
and it probably shouldn't be valid for it to occur. It happens in cases
with uncoalescable copies where the register class changes, and further
up the chain there is a subregister operand. We could possibly just
skip over uncoalsecable instructions in the chain rather than letting
this query deal with it (or pre-filter the obvious subreg with same
class case).

The generic implementation is supposed to account for checking for
valid subregisters by checking getMatchingSuperRegClass already,
but that was bypassed by the early exit for exact class match.

Also adds a reduced mir test demonstrating the exact problematic
case.


Full diff: https://github.com/llvm/llvm-project/pull/125224.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/TargetRegisterInfo.cpp (+4-1)
  • (modified) llvm/lib/Target/X86/X86RegisterInfo.cpp (-15)
  • (modified) llvm/lib/Target/X86/X86RegisterInfo.h (-5)
  • (added) llvm/test/CodeGen/X86/pr41619_reduced.mir (+27)
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index 77a4c74f1b38b9..e735c904e1b605 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -420,7 +420,10 @@ static bool shareSameRegisterFile(const TargetRegisterInfo &TRI,
                                   const TargetRegisterClass *SrcRC,
                                   unsigned SrcSubReg) {
   // Same register class.
-  if (DefRC == SrcRC)
+  //
+  // When processing uncoalescable copies / bitcasts, it is possible we reach
+  // here with the same register class, but mismatched subregister indices.
+  if (DefRC == SrcRC && DefSubReg == SrcSubReg)
     return true;
 
   // Both operands are sub registers. Check if they share a register class.
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index 4faf8bca4f9e02..af1060519ae5c8 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -224,21 +224,6 @@ X86RegisterInfo::getPointerRegClass(const MachineFunction &MF,
   }
 }
 
-bool X86RegisterInfo::shouldRewriteCopySrc(const TargetRegisterClass *DefRC,
-                                           unsigned DefSubReg,
-                                           const TargetRegisterClass *SrcRC,
-                                           unsigned SrcSubReg) const {
-  // Prevent rewriting a copy where the destination size is larger than the
-  // input size. See PR41619.
-  // FIXME: Should this be factored into the base implementation somehow.
-  if (DefRC->hasSuperClassEq(&X86::GR64RegClass) && DefSubReg == 0 &&
-      SrcRC->hasSuperClassEq(&X86::GR64RegClass) && SrcSubReg == X86::sub_32bit)
-    return false;
-
-  return TargetRegisterInfo::shouldRewriteCopySrc(DefRC, DefSubReg,
-                                                  SrcRC, SrcSubReg);
-}
-
 const TargetRegisterClass *
 X86RegisterInfo::getGPRsForTailCall(const MachineFunction &MF) const {
   const Function &F = MF.getFunction();
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.h b/llvm/lib/Target/X86/X86RegisterInfo.h
index 68ee372f27b14d..009d2a8c7ac3a1 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.h
+++ b/llvm/lib/Target/X86/X86RegisterInfo.h
@@ -70,11 +70,6 @@ class X86RegisterInfo final : public X86GenRegisterInfo {
   getLargestLegalSuperClass(const TargetRegisterClass *RC,
                             const MachineFunction &MF) const override;
 
-  bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC,
-                            unsigned DefSubReg,
-                            const TargetRegisterClass *SrcRC,
-                            unsigned SrcSubReg) const override;
-
   /// getPointerRegClass - Returns a TargetRegisterClass used for pointer
   /// values.
   const TargetRegisterClass *
diff --git a/llvm/test/CodeGen/X86/pr41619_reduced.mir b/llvm/test/CodeGen/X86/pr41619_reduced.mir
new file mode 100644
index 00000000000000..5dc2eb60e81411
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr41619_reduced.mir
@@ -0,0 +1,27 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=x86_64-- -mattr=+avx2 -run-pass=peephole-opt -o - %s | FileCheck %s
+
+# When trying to coalesce the operand of VMOVSDto64rr, a query would
+# be made with the same register class but the source has a
+# subregister and the result does not.
+---
+name:            uncoalescable_copy_queries_same_regclass_with_only_one_subreg
+tracksRegLiveness: true
+isSSA:           true
+body:             |
+  bb.0:
+    liveins: $rax
+
+    ; CHECK-LABEL: name: uncoalescable_copy_queries_same_regclass_with_only_one_subreg
+    ; CHECK: liveins: $rax
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr64 = COPY $rax
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:vr128 = COPY [[COPY]].sub_32bit
+    ; CHECK-NEXT: [[VMOVSDto64rr:%[0-9]+]]:gr64 = VMOVSDto64rr [[COPY1]]
+    ; CHECK-NEXT: RET 0, implicit [[VMOVSDto64rr]]
+    %0:gr64 = COPY $rax
+    %1:vr128 = COPY %0.sub_32bit
+    %2:gr64 = VMOVSDto64rr %1
+    RET 0, implicit %2
+
+...

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor Author

arsenm commented Feb 5, 2025

Merge activity

  • Feb 5, 11:06 AM EST: A user started a stack merge that includes this pull request via Graphite.
  • Feb 5, 11:14 AM EST: Graphite rebased this pull request as part of a merge.
  • Feb 5, 11:19 AM EST: Graphite rebased this pull request as part of a merge.
  • Feb 5, 11:23 AM EST: Graphite rebased this pull request as part of a merge.
  • Feb 5, 11:25 AM EST: A user merged this pull request with Graphite.

@arsenm arsenm force-pushed the users/arsenm/x86/remove-shouldRewriteCopySrc-hack branch 2 times, most recently from 033701b to d5712fd Compare February 5, 2025 16:19
In the problematic situation fixed in 61e556d,
shouldRewriteCopySrc is called with identical register class arguments,
but one has a subregister index. This was very surprising to me,
and it probably shouldn't be valid for it to occur. It happens in cases
with uncoalescable copies where the register class changes, and further
up the chain there is a subregister operand. We could possibly just
skip over uncoalsecable instructions in the chain rather than letting
this query deal with it (or pre-filter the obvious subreg with same
class case).

The generic implementation is supposed to account for checking for
valid subregisters by checking getMatchingSuperRegClass already,
but that was bypassed by the early exit for exact class match.

Also adds a reduced mir test demonstrating the exact problematic
case.
@arsenm arsenm force-pushed the users/arsenm/x86/remove-shouldRewriteCopySrc-hack branch from d5712fd to 0abe405 Compare February 5, 2025 16:22
@arsenm arsenm merged commit 92e3cd7 into main Feb 5, 2025
5 of 6 checks passed
@arsenm arsenm deleted the users/arsenm/x86/remove-shouldRewriteCopySrc-hack branch February 5, 2025 16:25
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…vm#125224)

In the problematic situation fixed in 61e556d,
shouldRewriteCopySrc is called with identical register class arguments,
but one has a subregister index. This was very surprising to me,
and it probably shouldn't be valid for it to occur. It happens in cases
with uncoalescable copies where the register class changes, and further
up the chain there is a subregister operand. We could possibly just
skip over uncoalsecable instructions in the chain rather than letting
this query deal with it (or pre-filter the obvious subreg with same
class case).

The generic implementation is supposed to account for checking for
valid subregisters by checking getMatchingSuperRegClass already,
but that was bypassed by the early exit for exact class match.

Also adds a reduced mir test demonstrating the exact problematic
case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants