-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[InitUndef] Don't use largest super class #107885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-arm @llvm/pr-subscribers-backend-risc-v Author: Nikita Popov (nikic) ChangesThe InitUndef pass currently uses the getLargestSuperClass() hook (which is only used by that pass) to chose the initialized register. From the implementation, I didn't really get why this is being done (or even correct to extend the register class from the original in this way). My suspicion is that this was just to avoid adding extra pseudos for register classes like vrnov0 vs vr, but after #106744 this is not necessary anymore. Is there some other reason for this? Patch is 21.25 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107885.diff 5 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
index 197f66e8659d55..ebf06bc57948f2 100644
--- a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
@@ -1204,15 +1204,6 @@ class TargetRegisterInfo : public MCRegisterInfo {
return false;
}
- /// Returns the Largest Super Class that is being initialized. There
- /// should be a Pseudo Instruction implemented for the super class
- /// that is being returned to ensure that Init Undef can apply the
- /// initialization correctly.
- virtual const TargetRegisterClass *
- getLargestSuperClass(const TargetRegisterClass *RC) const {
- llvm_unreachable("Unexpected target register class.");
- }
-
/// Returns if the architecture being targeted has the required Pseudo
/// Instructions for initializing the register. By default this returns false,
/// but where it is overriden for an architecture, the behaviour will be
diff --git a/llvm/lib/CodeGen/InitUndef.cpp b/llvm/lib/CodeGen/InitUndef.cpp
index 8d20f2668de6b9..1613e413712d26 100644
--- a/llvm/lib/CodeGen/InitUndef.cpp
+++ b/llvm/lib/CodeGen/InitUndef.cpp
@@ -152,8 +152,7 @@ bool InitUndef::handleSubReg(MachineFunction &MF, MachineInstr &MI,
if (Info.UsedLanes == Info.DefinedLanes)
continue;
- const TargetRegisterClass *TargetRegClass =
- TRI->getLargestSuperClass(MRI->getRegClass(Reg));
+ const TargetRegisterClass *TargetRegClass = MRI->getRegClass(Reg);
LaneBitmask NeedDef = Info.UsedLanes & ~Info.DefinedLanes;
@@ -172,8 +171,8 @@ bool InitUndef::handleSubReg(MachineFunction &MF, MachineInstr &MI,
Register LatestReg = Reg;
for (auto ind : SubRegIndexNeedInsert) {
Changed = true;
- const TargetRegisterClass *SubRegClass = TRI->getLargestSuperClass(
- TRI->getSubRegisterClass(TargetRegClass, ind));
+ const TargetRegisterClass *SubRegClass =
+ TRI->getSubRegisterClass(TargetRegClass, ind);
Register TmpInitSubReg = MRI->createVirtualRegister(SubRegClass);
LLVM_DEBUG(dbgs() << "Register Class ID" << SubRegClass->getID() << "\n");
BuildMI(*MI.getParent(), &MI, MI.getDebugLoc(),
@@ -199,8 +198,7 @@ bool InitUndef::fixupIllOperand(MachineInstr *MI, MachineOperand &MO) {
dbgs() << "Emitting PseudoInitUndef Instruction for implicit register "
<< printReg(MO.getReg()) << '\n');
- const TargetRegisterClass *TargetRegClass =
- TRI->getLargestSuperClass(MRI->getRegClass(MO.getReg()));
+ const TargetRegisterClass *TargetRegClass = MRI->getRegClass(MO.getReg());
LLVM_DEBUG(dbgs() << "Register Class ID" << TargetRegClass->getID() << "\n");
Register NewReg = MRI->createVirtualRegister(TargetRegClass);
BuildMI(*MI->getParent(), MI, MI->getDebugLoc(),
diff --git a/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h b/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h
index 53803cff8b90ac..58b5e98fd30b14 100644
--- a/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h
+++ b/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h
@@ -241,19 +241,6 @@ class ARMBaseRegisterInfo : public ARMGenRegisterInfo {
int getSEHRegNum(unsigned i) const { return getEncodingValue(i); }
- const TargetRegisterClass *
- getLargestSuperClass(const TargetRegisterClass *RC) const override {
- if (ARM::MQPRRegClass.hasSubClassEq(RC))
- return &ARM::MQPRRegClass;
- if (ARM::SPRRegClass.hasSubClassEq(RC))
- return &ARM::SPRRegClass;
- if (ARM::DPR_VFP2RegClass.hasSubClassEq(RC))
- return &ARM::DPR_VFP2RegClass;
- if (ARM::GPRRegClass.hasSubClassEq(RC))
- return &ARM::GPRRegClass;
- return RC;
- }
-
bool doesRegClassHavePseudoInitUndef(
const TargetRegisterClass *RC) const override {
(void)RC;
diff --git a/llvm/lib/Target/RISCV/RISCVRegisterInfo.h b/llvm/lib/Target/RISCV/RISCVRegisterInfo.h
index 7e04e9154b524e..98a712af085399 100644
--- a/llvm/lib/Target/RISCV/RISCVRegisterInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVRegisterInfo.h
@@ -130,19 +130,6 @@ struct RISCVRegisterInfo : public RISCVGenRegisterInfo {
const MachineFunction &MF, const VirtRegMap *VRM,
const LiveRegMatrix *Matrix) const override;
- const TargetRegisterClass *
- getLargestSuperClass(const TargetRegisterClass *RC) const override {
- if (RISCV::VRM8RegClass.hasSubClassEq(RC))
- return &RISCV::VRM8RegClass;
- if (RISCV::VRM4RegClass.hasSubClassEq(RC))
- return &RISCV::VRM4RegClass;
- if (RISCV::VRM2RegClass.hasSubClassEq(RC))
- return &RISCV::VRM2RegClass;
- if (RISCV::VRRegClass.hasSubClassEq(RC))
- return &RISCV::VRRegClass;
- return RC;
- }
-
bool doesRegClassHavePseudoInitUndef(
const TargetRegisterClass *RC) const override {
return isVRRegClass(RC);
diff --git a/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir b/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir
index be6ed4d2a6aa14..ed274cf49fa9bc 100644
--- a/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir
@@ -14,9 +14,9 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_0
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_1
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_1
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -52,7 +52,7 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_1
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_1
; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_0
@@ -92,7 +92,7 @@ body: |
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_0
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_3
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -130,7 +130,7 @@ body: |
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_0
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_2
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -166,7 +166,7 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M2_]], %subreg.sub_vrm2_0
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_1
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG1]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -239,11 +239,11 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm8 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_0
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_1
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_1
- ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_1
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG3]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -279,9 +279,9 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm8 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_1
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_1
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_1
; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_0
@@ -319,11 +319,11 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm8 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_2
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_1
; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_0
- ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_3
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG3]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -359,11 +359,11 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm8 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_3
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_1
; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_0
- ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_2
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG3]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -401,9 +401,9 @@ body: |
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_0
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_3
- ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_5
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG3]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -441,9 +441,9 @@ body: |
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_0
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_3
- ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_4
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG3]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -481,9 +481,9 @@ body: |
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_0
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_2
- ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_7
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG3]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -521,9 +521,9 @@ body: |
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_0
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_2
- ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vrnov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_6
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG3]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -559,9 +559,9 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm8 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M2_]], %subreg.sub_vrm2_0
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_1
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_1
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -597,7 +597,7 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm8 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M2_]], %subreg.sub_vrm2_1
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
- ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_1
; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_0
@@ -637,7 +637,7 @@ body: |
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_0
- ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2nov0 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_3
; CHECK-NEXT: early-clobber %6:vrm8 = PseudoVRGATHER_VI_M8 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
@@ -675,7 +675,7 @@ body: |
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], ...
[truncated]
|
@@ -14,9 +14,9 @@ body: | | |||
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_0 | |||
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype | |||
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF | |||
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF | |||
; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2nov0 = INIT_UNDEF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are the nov0 register classes coming from in this test case? None of the instructions below are masked so I would have thought that MRI->getRegClass(MO.getReg());
would have returned vrm2/vr anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TRI->getSubRegisterClass(VRM4, sub_vrm2_0)
gives VRM2, but TRI->getSubRegisterClass(VRM4, sub_vrm2_1)
is VRM2NoV0 for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this weird as well, but I think the reason is that if we're allocating consecutive registers, then only the bottom one can be v0, so the rest end up with a nov0 regclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I guess it's trying to find the smallest register class. Which now that I think about it, a vrm2nov0 should be fine if it's always going to be inserted into the upper half of a vrm4, because as you say it can never be v0. So the RISC-V changes LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RISC-V changes LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial idea to identify the largest class is to map the various register classes to four init pseudos.
LGTM on removing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arm Changes LGTM
The InitUndef pass works around a register allocation issue, where undef operands can be allocated to the same register as early-clobber result operands. This may lead to ISA constraint violations, where certain input and output registers are not allowed to overlap. Originally this pass was implemented for RISCV, and then extended to ARM in llvm#77770. I've since removed the target-specific parts of the pass in llvm#106744 and llvm#107885. This PR now enables the pass for all targets. The motivating case is the one in arm64-ldxr-stxr.ll for the AArch64 target, where we were previously incorrectly allocating a stxp input and output to the same register.
The InitUndef pass works around a register allocation issue, where undef operands can be allocated to the same register as early-clobber result operands. This may lead to ISA constraint violations, where certain input and output registers are not allowed to overlap. Originally this pass was implemented for RISCV, and then extended to ARM in #77770. I've since removed the target-specific parts of the pass in #106744 and #107885. This PR reduces the pass to use a single requiresDisjointEarlyClobberAndUndef() target hook and enables it by default. The hook is disabled for AMDGPU, because overlapping early-clobber and undef operands are known to be safe for that target, and we get significant codegen diffs otherwise. The motivating case is the one in arm64-ldxr-stxr.ll, where we were previously incorrectly allocating a stxp input and output to the same register.
The InitUndef pass currently uses the getLargestSuperClass() hook (which is only used by that pass) to chose the register to initialize. From the implementation, I didn't really get why this is being done (or even correct to extend the register class from the original in this way). My suspicion is that this was just to avoid adding extra pseudos for register classes like vrnov0 vs vr, but after #106744 this is not necessary anymore.
Is there some other reason for this?