-
Notifications
You must be signed in to change notification settings - Fork 13.5k
RegisterCoalescer: Fix implicit operand handling during rematerialize #75271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
arsenm
merged 5 commits into
llvm:main
from
arsenm:coalescer-set-register-def-read-undef
Dec 13, 2023
Merged
RegisterCoalescer: Fix implicit operand handling during rematerialize #75271
arsenm
merged 5 commits into
llvm:main
from
arsenm:coalescer-set-register-def-read-undef
Dec 13, 2023
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
If the copy being hoisted was undef, we have the same problems that eliminateUndefCopy needs to solve. We would effectively be introducing a new live out implicit_def. We need to add an undef flag to avoid artificially introducing a live through undef value. Previously, the verifier would fail due to the dead def inside the loop providing the live in value for the %1 use.
If the rematerialize was placing a subregister into a super register, and implicit operands referenced the original register, we need to add undef flags to the now-subregister indexed implicit operands.
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-regalloc Author: Matt Arsenault (arsenm) ChangesIf the rematerialize was placing a subregister into a super register, Depends #75152 Full diff: https://github.com/llvm/llvm-project/pull/75271.diff 3 Files Affected:
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index c067d87a9fd810..c1af37c8510ff5 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -1201,6 +1201,8 @@ bool RegisterCoalescer::removePartialRedundancy(const CoalescerPair &CP,
<< printMBBReference(MBB) << '\t' << CopyMI);
}
+ const bool IsUndefCopy = CopyMI.getOperand(1).isUndef();
+
// Remove CopyMI.
// Note: This is fine to remove the copy before updating the live-ranges.
// While updating the live-ranges, we only look at slot indices and
@@ -1214,6 +1216,19 @@ bool RegisterCoalescer::removePartialRedundancy(const CoalescerPair &CP,
LIS->pruneValue(*static_cast<LiveRange *>(&IntB), CopyIdx.getRegSlot(),
&EndPoints);
BValNo->markUnused();
+
+ if (IsUndefCopy) {
+ // We're introducing an undef phi def, and need to set undef on any users of
+ // the previously local def to avoid artifically extending the lifetime
+ // through the block.
+ for (MachineOperand &MO : MRI->use_nodbg_operands(IntB.reg())) {
+ const MachineInstr &MI = *MO.getParent();
+ SlotIndex UseIdx = LIS->getInstructionIndex(MI);
+ if (!IntB.liveAt(UseIdx))
+ MO.setIsUndef(true);
+ }
+ }
+
// Extend IntB to the EndPoints of its original live interval.
LIS->extendToIndices(IntB, EndPoints);
@@ -1596,8 +1611,7 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
LR->createDeadDef(NewMIIdx.getRegSlot(), LIS->getVNInfoAllocator());
}
- if (NewMI.getOperand(0).getSubReg())
- NewMI.getOperand(0).setIsUndef();
+ NewMI.setRegisterDefReadUndef(NewMI.getOperand(0).getReg());
// Transfer over implicit operands to the rematerialized instruction.
for (MachineOperand &MO : ImplicitOps)
diff --git a/llvm/test/CodeGen/X86/coalescer-partial-redundancy-clear-dead-flag-undef-copy.mir b/llvm/test/CodeGen/X86/coalescer-partial-redundancy-clear-dead-flag-undef-copy.mir
new file mode 100644
index 00000000000000..5f33be0bc15559
--- /dev/null
+++ b/llvm/test/CodeGen/X86/coalescer-partial-redundancy-clear-dead-flag-undef-copy.mir
@@ -0,0 +1,47 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=x86_64-pc-linux-gnu -run-pass=register-coalescer -verify-coalescing -o - %s | FileCheck %s
+
+# Check for "Live range continues after dead def flag".
+
+# There are 2 copies of undef, but the registers also appear to be
+# live due to block live outs, and thus were not deleted as
+# eliminateUndefCopy only considered the live range, and not the undef
+# flag.
+#
+# removePartialRedundancy would move the COPY undef %0 in bb.1 to
+# bb.0. The live range of %1 would then be extended to be live out of
+# %bb.1 for the backedge phi. This would then fail the verifier, since
+# the dead flag was no longer valid. This was fixed by directly
+# considering the undef flag to avoid considering this special case.
+
+---
+name: partial_redundancy_coalesce_undef_copy_live_out
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: partial_redundancy_coalesce_undef_copy_live_out
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: liveins: $rdi
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $rdi
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: dead [[XOR32ri:%[0-9]+]]:gr32 = XOR32ri undef [[XOR32ri]], 1, implicit-def dead $eflags
+ ; CHECK-NEXT: dead [[MOV32rr:%[0-9]+]]:gr32 = MOV32rr [[COPY]]
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = IMPLICIT_DEF
+ ; CHECK-NEXT: JMP_1 %bb.1
+ bb.0:
+ liveins: $rdi
+
+ %0:gr32 = COPY $rdi
+
+ bb.1:
+ %1:gr32 = COPY undef %0
+ dead %1:gr32 = XOR32ri %1, 1, implicit-def dead $eflags
+ dead %2:gr32 = MOV32rr killed %0
+ %0:gr32 = COPY killed undef %1
+ JMP_1 %bb.1
+
+...
diff --git a/llvm/test/CodeGen/X86/coalescer-remat-with-undef-implicit-def-operand.mir b/llvm/test/CodeGen/X86/coalescer-remat-with-undef-implicit-def-operand.mir
new file mode 100644
index 00000000000000..42d17130412b61
--- /dev/null
+++ b/llvm/test/CodeGen/X86/coalescer-remat-with-undef-implicit-def-operand.mir
@@ -0,0 +1,123 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=x86_64-pc-linux-gnu -verify-coalescing -run-pass=register-coalescer -o - %s | FileCheck %s
+
+# The %1 = MOV32r0 is rematerialized as a subregister of %2. The
+# implicit-def %1 operand needs to have an undef added, just like the
+# main result operand.
+
+---
+name: remat_into_subregister_set_undef_implicit_operand_subregisters
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: remat_into_subregister_set_undef_implicit_operand_subregisters
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: liveins: $rdi
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: undef [[MOV32r0_:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = MOV32r0 implicit-def dead $eflags, implicit-def [[MOV32r0_]]
+ ; CHECK-NEXT: [[MOV32r0_1:%[0-9]+]]:gr32 = MOV32r0 implicit-def dead $eflags, implicit-def [[MOV32r0_1]]
+ ; CHECK-NEXT: undef [[MOV32r0_2:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = MOV32r0 implicit-def dead $eflags, implicit-def undef [[MOV32r0_2]].sub_32bit, implicit-def [[MOV32r0_2]]
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[MOV32r0_2:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = XOR32ri [[MOV32r0_2]].sub_32bit, 1, implicit-def dead $eflags
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: successors: %bb.4(0x40000000), %bb.3(0x40000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: JCC_1 %bb.4, 5, implicit killed undef $eflags
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.3:
+ ; CHECK-NEXT: successors: %bb.4(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.4:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: dead [[MOV32rr:%[0-9]+]]:gr32 = MOV32rr [[MOV32r0_1]]
+ ; CHECK-NEXT: dead [[SHL64ri:%[0-9]+]]:gr64_nosp = SHL64ri [[MOV32r0_]], 4, implicit-def dead $eflags
+ ; CHECK-NEXT: [[MOV32r0_1:%[0-9]+]]:gr32 = COPY [[MOV32r0_2]].sub_32bit
+ ; CHECK-NEXT: JMP_1 %bb.1
+ bb.0:
+ liveins: $rdi
+
+ undef %0.sub_32bit:gr64_with_sub_8bit = MOV32r0 implicit-def dead $eflags, implicit-def %0
+ %1:gr32 = MOV32r0 implicit-def dead $eflags, implicit-def %1
+ undef %2.sub_32bit:gr64_with_sub_8bit = COPY %1, implicit-def %2
+
+ bb.1:
+ %2.sub_32bit:gr64_with_sub_8bit = XOR32ri %2.sub_32bit, 1, implicit-def dead $eflags
+
+ bb.2:
+ JCC_1 %bb.4, 5, implicit killed undef $eflags
+
+ bb.3:
+
+ bb.4:
+ dead %3:gr32 = MOV32rr %1
+ dead %4:gr64_nosp = SHL64ri %0, 4, implicit-def dead $eflags
+ %1:gr32 = COPY %2.sub_32bit
+ JMP_1 %bb.1
+
+...
+
+# Same, except the implicit-def on the original instruction already
+# has a subregister index.
+
+---
+name: remat_into_subregister_set_undef_implicit_operand_subregisters_with_subreg
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: remat_into_subregister_set_undef_implicit_operand_subregisters_with_subreg
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: liveins: $rdi
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: undef [[MOV32r0_:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = MOV32r0 implicit-def dead $eflags, implicit-def [[MOV32r0_]]
+ ; CHECK-NEXT: [[MOV32r0_1:%[0-9]+]]:gr32 = MOV32r0 implicit-def dead $eflags, implicit-def undef [[MOV32r0_1]].sub_8bit
+ ; CHECK-NEXT: undef [[MOV32r0_2:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = MOV32r0 implicit-def dead $eflags, implicit-def undef [[MOV32r0_2]].sub_8bit, implicit-def [[MOV32r0_2]]
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[MOV32r0_2:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = XOR32ri [[MOV32r0_2]].sub_32bit, 1, implicit-def dead $eflags
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: successors: %bb.4(0x40000000), %bb.3(0x40000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: JCC_1 %bb.4, 5, implicit killed undef $eflags
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.3:
+ ; CHECK-NEXT: successors: %bb.4(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.4:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: dead [[MOV32rr:%[0-9]+]]:gr32 = MOV32rr [[MOV32r0_1]]
+ ; CHECK-NEXT: dead [[SHL64ri:%[0-9]+]]:gr64_nosp = SHL64ri [[MOV32r0_]], 4, implicit-def dead $eflags
+ ; CHECK-NEXT: [[MOV32r0_1:%[0-9]+]]:gr32 = COPY [[MOV32r0_2]].sub_32bit
+ ; CHECK-NEXT: JMP_1 %bb.1
+ bb.0:
+ liveins: $rdi
+
+ undef %0.sub_32bit:gr64_with_sub_8bit = MOV32r0 implicit-def dead $eflags, implicit-def %0
+ %1:gr32 = MOV32r0 implicit-def dead $eflags, undef implicit-def %1.sub_8bit
+ undef %2.sub_32bit:gr64_with_sub_8bit = COPY %1, implicit-def %2
+
+ bb.1:
+ %2.sub_32bit:gr64_with_sub_8bit = XOR32ri %2.sub_32bit, 1, implicit-def dead $eflags
+
+ bb.2:
+ JCC_1 %bb.4, 5, implicit killed undef $eflags
+
+ bb.3:
+
+ bb.4:
+ dead %3:gr32 = MOV32rr %1
+ dead %4:gr64_nosp = SHL64ri %0, 4, implicit-def dead $eflags
+ %1:gr32 = COPY %2.sub_32bit
+ JMP_1 %bb.1
+
+...
|
qcolombet
approved these changes
Dec 13, 2023
This was referenced Dec 15, 2023
bzEq
pushed a commit
to bzEq/llvm-project
that referenced
this pull request
Dec 25, 2023
Demonstrate `IMPLICIT_DEF implicit-def ...` can be generated after coalescing on PPC.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
If the rematerialize was placing a subregister into a super register,
and implicit operands referenced the original register, we need to add
undef flags to the now-subregister indexed implicit operands.
Depends #75152