-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[CodeGen] Add generic INIT_UNDEF pseudo #106744
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-backend-arm Author: Nikita Popov (nikic) ChangesThe InitUndef pass currently uses target-specific pseudo instructions, with one pseudo per register class. Instead, add a generic pseudo instruction, which can be used by all targets and register classes. Patch is 44.43 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/106744.diff 16 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 49ce13dd8cbe39..65c5788ac5cc9f 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2278,15 +2278,6 @@ class TargetInstrInfo : public MCInstrInfo {
llvm_unreachable("unknown number of operands necessary");
}
- /// Gets the opcode for the Pseudo Instruction used to initialize
- /// the undef value. If no Instruction is available, this will
- /// fail compilation.
- virtual unsigned getUndefInitOpcode(unsigned RegClassID) const {
- (void)RegClassID;
-
- llvm_unreachable("Unexpected register class.");
- }
-
private:
mutable std::unique_ptr<MIRFormatter> Formatter;
unsigned CallFrameSetupOpcode, CallFrameDestroyOpcode;
diff --git a/llvm/include/llvm/Support/TargetOpcodes.def b/llvm/include/llvm/Support/TargetOpcodes.def
index 635c265a433631..e1883de0c93b4c 100644
--- a/llvm/include/llvm/Support/TargetOpcodes.def
+++ b/llvm/include/llvm/Support/TargetOpcodes.def
@@ -56,6 +56,11 @@ HANDLE_TARGET_OPCODE(INSERT_SUBREG)
/// IMPLICIT_DEF - This is the MachineInstr-level equivalent of undef.
HANDLE_TARGET_OPCODE(IMPLICIT_DEF)
+/// Explicit undef initialization used past IMPLICIT_DEF elimination in cases
+/// where an undef operand must be allocated to a different register than an
+/// early-clobber result operand.
+HANDLE_TARGET_OPCODE(INIT_UNDEF)
+
/// SUBREG_TO_REG - Assert the value of bits in a super register.
/// The result of this instruction is the value of the second operand inserted
/// into the subregister specified by the third operand. All other bits are
diff --git a/llvm/include/llvm/Target/Target.td b/llvm/include/llvm/Target/Target.td
index b2eb250ae60b60..3e037affe1cfd2 100644
--- a/llvm/include/llvm/Target/Target.td
+++ b/llvm/include/llvm/Target/Target.td
@@ -1254,6 +1254,13 @@ def IMPLICIT_DEF : StandardPseudoInstruction {
let isAsCheapAsAMove = true;
let isMeta = true;
}
+def INIT_UNDEF : StandardPseudoInstruction {
+ let OutOperandList = (outs unknown:$dst);
+ let InOperandList = (ins);
+ let AsmString = "";
+ let hasSideEffects = false;
+ let Size = 0;
+}
def SUBREG_TO_REG : StandardPseudoInstruction {
let OutOperandList = (outs unknown:$dst);
let InOperandList = (ins unknown:$implsrc, unknown:$subsrc, i32imm:$subidx);
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 19d23c8ba96783..88e9b9d27d3f27 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1832,6 +1832,10 @@ void AsmPrinter::emitFunctionBody() {
// This instruction is only used to note jump table debug info, it's
// purely meta information.
break;
+ case TargetOpcode::INIT_UNDEF:
+ // This is only used to influence register allocation behavior, no
+ // actual initialization is needed.
+ break;
default:
emitInstruction(&MI);
if (CanDoExtraAnalysis) {
diff --git a/llvm/lib/CodeGen/InitUndef.cpp b/llvm/lib/CodeGen/InitUndef.cpp
index 51c50ff872ef21..533406b11c83bb 100644
--- a/llvm/lib/CodeGen/InitUndef.cpp
+++ b/llvm/lib/CodeGen/InitUndef.cpp
@@ -177,8 +177,7 @@ bool InitUndef::handleSubReg(MachineFunction &MF, MachineInstr &MI,
Register TmpInitSubReg = MRI->createVirtualRegister(SubRegClass);
LLVM_DEBUG(dbgs() << "Register Class ID" << SubRegClass->getID() << "\n");
BuildMI(*MI.getParent(), &MI, MI.getDebugLoc(),
- TII->get(TII->getUndefInitOpcode(SubRegClass->getID())),
- TmpInitSubReg);
+ TII->get(TargetOpcode::INIT_UNDEF), TmpInitSubReg);
Register NewReg = MRI->createVirtualRegister(TargetRegClass);
BuildMI(*MI.getParent(), &MI, MI.getDebugLoc(),
TII->get(TargetOpcode::INSERT_SUBREG), NewReg)
@@ -203,9 +202,9 @@ bool InitUndef::fixupIllOperand(MachineInstr *MI, MachineOperand &MO) {
const TargetRegisterClass *TargetRegClass =
TRI->getLargestSuperClass(MRI->getRegClass(MO.getReg()));
LLVM_DEBUG(dbgs() << "Register Class ID" << TargetRegClass->getID() << "\n");
- unsigned Opcode = TII->getUndefInitOpcode(TargetRegClass->getID());
Register NewReg = MRI->createVirtualRegister(TargetRegClass);
- BuildMI(*MI->getParent(), MI, MI->getDebugLoc(), TII->get(Opcode), NewReg);
+ BuildMI(*MI->getParent(), MI, MI->getDebugLoc(),
+ TII->get(TargetOpcode::INIT_UNDEF), NewReg);
MO.setReg(NewReg);
if (MO.isUndef())
MO.setIsUndef(false);
diff --git a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
index 8eb5d91d3b8792..710182985a1e9e 100644
--- a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
+++ b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
@@ -2411,12 +2411,6 @@ void ARMAsmPrinter::emitInstruction(const MachineInstr *MI) {
case ARM::SEH_EpilogEnd:
ATS.emitARMWinCFIEpilogEnd();
return;
-
- case ARM::PseudoARMInitUndefMQPR:
- case ARM::PseudoARMInitUndefSPR:
- case ARM::PseudoARMInitUndefDPR_VFP2:
- case ARM::PseudoARMInitUndefGPR:
- return;
}
MCInst TmpInst;
diff --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
index 27290f7f76347c..aee9797585dbd2 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
@@ -546,19 +546,6 @@ class ARMBaseInstrInfo : public ARMGenInstrInfo {
std::optional<RegImmPair> isAddImmediate(const MachineInstr &MI,
Register Reg) const override;
-
- unsigned getUndefInitOpcode(unsigned RegClassID) const override {
- if (RegClassID == ARM::MQPRRegClass.getID())
- return ARM::PseudoARMInitUndefMQPR;
- if (RegClassID == ARM::SPRRegClass.getID())
- return ARM::PseudoARMInitUndefSPR;
- if (RegClassID == ARM::DPR_VFP2RegClass.getID())
- return ARM::PseudoARMInitUndefDPR_VFP2;
- if (RegClassID == ARM::GPRRegClass.getID())
- return ARM::PseudoARMInitUndefGPR;
-
- llvm_unreachable("Unexpected register class.");
- }
};
/// Get the operands corresponding to the given \p Pred value. By default, the
diff --git a/llvm/lib/Target/ARM/ARMInstrInfo.td b/llvm/lib/Target/ARM/ARMInstrInfo.td
index 26f7d70b43b262..ed80a2108f5962 100644
--- a/llvm/lib/Target/ARM/ARMInstrInfo.td
+++ b/llvm/lib/Target/ARM/ARMInstrInfo.td
@@ -6536,15 +6536,3 @@ let isPseudo = 1 in {
let isTerminator = 1 in
def SEH_EpilogEnd : PseudoInst<(outs), (ins), NoItinerary, []>, Sched<[]>;
}
-
-
-//===----------------------------------------------------------------------===//
-// Pseudo Instructions for use when early-clobber is defined and Greedy Register
-// Allocation is used. This ensures the constraint is used properly.
-//===----------------------------------------------------------------------===//
-let isCodeGenOnly = 1, hasNoSchedulingInfo = 1 in {
- def PseudoARMInitUndefMQPR : PseudoInst<(outs MQPR:$vd), (ins), NoItinerary, []>;
- def PseudoARMInitUndefSPR : PseudoInst<(outs SPR:$sd), (ins), NoItinerary, []>;
- def PseudoARMInitUndefDPR_VFP2 : PseudoInst<(outs DPR_VFP2:$dd), (ins), NoItinerary, []>;
- def PseudoARMInitUndefGPR : PseudoInst<(outs GPR:$rd), (ins), NoItinerary, []>;
-}
diff --git a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
index 476dde2be39e57..24bca2da652d0e 100644
--- a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
+++ b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
@@ -303,11 +303,6 @@ void RISCVAsmPrinter::emitInstruction(const MachineInstr *MI) {
case RISCV::KCFI_CHECK:
LowerKCFI_CHECK(*MI);
return;
- case RISCV::PseudoRVVInitUndefM1:
- case RISCV::PseudoRVVInitUndefM2:
- case RISCV::PseudoRVVInitUndefM4:
- case RISCV::PseudoRVVInitUndefM8:
- return;
case TargetOpcode::STACKMAP:
return LowerSTACKMAP(*OutStreamer, SM, *MI);
case TargetOpcode::PATCHPOINT:
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.h b/llvm/lib/Target/RISCV/RISCVInstrInfo.h
index 8494110adffb94..5875d7072546af 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.h
@@ -293,21 +293,6 @@ class RISCVInstrInfo : public RISCVGenInstrInfo {
unsigned getTailDuplicateSize(CodeGenOptLevel OptLevel) const override;
- unsigned getUndefInitOpcode(unsigned RegClassID) const override {
- switch (RegClassID) {
- case RISCV::VRRegClassID:
- return RISCV::PseudoRVVInitUndefM1;
- case RISCV::VRM2RegClassID:
- return RISCV::PseudoRVVInitUndefM2;
- case RISCV::VRM4RegClassID:
- return RISCV::PseudoRVVInitUndefM4;
- case RISCV::VRM8RegClassID:
- return RISCV::PseudoRVVInitUndefM8;
- default:
- llvm_unreachable("Unexpected register class.");
- }
- }
-
protected:
const RISCVSubtarget &STI;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
index 1b4303fbbcf809..c91c9c3614a34c 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
@@ -6116,15 +6116,6 @@ foreach lmul = MxList in {
}
}
-/// Empty pseudo for RISCVInitUndefPass
-let hasSideEffects = 0, mayLoad = 0, mayStore = 0, Size = 0,
- isCodeGenOnly = 1 in {
- def PseudoRVVInitUndefM1 : Pseudo<(outs VR:$vd), (ins), [], "">;
- def PseudoRVVInitUndefM2 : Pseudo<(outs VRM2:$vd), (ins), [], "">;
- def PseudoRVVInitUndefM4 : Pseudo<(outs VRM4:$vd), (ins), [], "">;
- def PseudoRVVInitUndefM8 : Pseudo<(outs VRM8:$vd), (ins), [], "">;
-}
-
//===----------------------------------------------------------------------===//
// 6. Configuration-Setting Instructions
//===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/RISCV/rvv/handle-noreg-with-implicit-def.mir b/llvm/test/CodeGen/RISCV/rvv/handle-noreg-with-implicit-def.mir
index e090b313d4f7b8..7b4d200ef8a3b0 100644
--- a/llvm/test/CodeGen/RISCV/rvv/handle-noreg-with-implicit-def.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/handle-noreg-with-implicit-def.mir
@@ -9,8 +9,8 @@ body: |
; MIR-LABEL: name: vrgather_all_undef
; MIR: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
; MIR-NEXT: [[DEF1:%[0-9]+]]:vr = IMPLICIT_DEF
- ; MIR-NEXT: [[PseudoRVVInitUndefM1_:%[0-9]+]]:vr = PseudoRVVInitUndefM1
- ; MIR-NEXT: early-clobber %1:vr = PseudoVRGATHER_VI_M1 [[DEF1]], killed [[PseudoRVVInitUndefM1_]], 0, 0, 5 /* e32 */, 0 /* tu, mu */
+ ; MIR-NEXT: [[INIT_UNDEF:%[0-9]+]]:vr = INIT_UNDEF
+ ; MIR-NEXT: early-clobber %1:vr = PseudoVRGATHER_VI_M1 [[DEF1]], killed [[INIT_UNDEF]], 0, 0, 5 /* e32 */, 0 /* tu, mu */
; MIR-NEXT: $v8 = COPY %1
; MIR-NEXT: PseudoRET implicit $v8
%2:vr = IMPLICIT_DEF
diff --git a/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir b/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir
index 539d319f3426dd..be6ed4d2a6aa14 100644
--- a/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir
@@ -14,10 +14,10 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_0
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[PseudoRVVInitUndefM2_:%[0-9]+]]:vrm2 = PseudoRVVInitUndefM2
- ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[PseudoRVVInitUndefM2_]], %subreg.sub_vrm2_1
- ; CHECK-NEXT: [[PseudoRVVInitUndefM1_:%[0-9]+]]:vr = PseudoRVVInitUndefM1
- ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[PseudoRVVInitUndefM1_]], %subreg.sub_vrm1_1
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_1
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_1
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: PseudoVSE32_V_M4 killed %6, killed [[ADDI1]], 0, 5 /* e32 */, implicit $vl, implicit $vtype
@@ -52,10 +52,10 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_1
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[PseudoRVVInitUndefM2_:%[0-9]+]]:vrm2 = PseudoRVVInitUndefM2
- ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[PseudoRVVInitUndefM2_]], %subreg.sub_vrm2_1
- ; CHECK-NEXT: [[PseudoRVVInitUndefM1_:%[0-9]+]]:vr = PseudoRVVInitUndefM1
- ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[PseudoRVVInitUndefM1_]], %subreg.sub_vrm1_0
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_1
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_0
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: PseudoVSE32_V_M4 killed %6, killed [[ADDI1]], 0, 5 /* e32 */, implicit $vl, implicit $vtype
@@ -90,10 +90,10 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_2
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[PseudoRVVInitUndefM2_:%[0-9]+]]:vrm2 = PseudoRVVInitUndefM2
- ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[PseudoRVVInitUndefM2_]], %subreg.sub_vrm2_0
- ; CHECK-NEXT: [[PseudoRVVInitUndefM1_:%[0-9]+]]:vr = PseudoRVVInitUndefM1
- ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[PseudoRVVInitUndefM1_]], %subreg.sub_vrm1_3
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_0
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_3
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: PseudoVSE32_V_M4 killed %6, killed [[ADDI1]], 0, 5 /* e32 */, implicit $vl, implicit $vtype
@@ -128,10 +128,10 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_3
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[PseudoRVVInitUndefM2_:%[0-9]+]]:vrm2 = PseudoRVVInitUndefM2
- ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[PseudoRVVInitUndefM2_]], %subreg.sub_vrm2_0
- ; CHECK-NEXT: [[PseudoRVVInitUndefM1_:%[0-9]+]]:vr = PseudoRVVInitUndefM1
- ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[PseudoRVVInitUndefM1_]], %subreg.sub_vrm1_2
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_0
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm1_2
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG2]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: PseudoVSE32_V_M4 killed %6, killed [[ADDI1]], 0, 5 /* e32 */, implicit $vl, implicit $vtype
@@ -166,8 +166,8 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M2_]], %subreg.sub_vrm2_0
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[PseudoRVVInitUndefM2_:%[0-9]+]]:vrm2 = PseudoRVVInitUndefM2
- ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[PseudoRVVInitUndefM2_]], %subreg.sub_vrm2_1
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_1
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG1]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: PseudoVSE32_V_M4 killed %6, killed [[ADDI1]], 0, 5 /* e32 */, implicit $vl, implicit $vtype
@@ -202,8 +202,8 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm4 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M2_]], %subreg.sub_vrm2_1
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm4 = IMPLICIT_DEF
- ; CHECK-NEXT: [[PseudoRVVInitUndefM2_:%[0-9]+]]:vrm2 = PseudoRVVInitUndefM2
- ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[PseudoRVVInitUndefM2_]], %subreg.sub_vrm2_0
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm4 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm2_0
; CHECK-NEXT: early-clobber %6:vrm4 = PseudoVRGATHER_VI_M4 %pt2, killed [[INSERT_SUBREG1]], 0, 0, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: PseudoVSE32_V_M4 killed %6, killed [[ADDI1]], 0, 5 /* e32 */, implicit $vl, implicit $vtype
@@ -239,12 +239,12 @@ body: |
; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:vrm8 = INSERT_SUBREG [[DEF]], [[PseudoVLE32_V_M1_]], %subreg.sub_vrm1_0
; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 0, 210 /* e32, m4, ta, ma */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: %pt2:vrm8 = IMPLICIT_DEF
- ; CHECK-NEXT: [[PseudoRVVInitUndefM4_:%[0-9]+]]:vrm4 = PseudoRVVInitUndefM4
- ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[PseudoRVVInitUndefM4_]], %subreg.sub_vrm4_1
- ; CHECK-NEXT: [[PseudoRVVInitUndefM2_:%[0-9]+]]:vrm2 = PseudoRVVInitUndefM2
- ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[PseudoRVVInitUndefM2_]], %subreg.sub_vrm2_1
- ; CHECK-NEXT: [[PseudoRVVInitUndefM1_:%[0-9]+]]:vr = PseudoRVVInitUndefM1
- ; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[PseudoRVVInitUndefM1_]], %subreg.sub_vrm1_1
+ ; CHECK-NEXT: [[INIT_UNDEF:%[0-9]+]]:vrm4 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG]], [[INIT_UNDEF]], %subreg.sub_vrm4_1
+ ; CHECK-NEXT: [[INIT_UNDEF1:%[0-9]+]]:vrm2 = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG1]], [[INIT_UNDEF1]], %subreg.sub_vrm2_1
+ ; CHECK-NEXT: [[INIT_UNDEF2:%[0-9]+]]:vr = INIT_UNDEF
+ ; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:vrm8 = INSERT_SUBREG [[INSERT_SUBREG2]], [[INIT_UNDEF2]], %subreg.sub_vrm1_1
; CHECK-NEXT: e...
[truncated]
|
let InOperandList = (ins); | ||
let AsmString = ""; | ||
let hasSideEffects = false; | ||
let Size = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially also added isMeta = true
here like IMPLICIT_DEF does, but this caused more (positive looking) changes in riscv, so I left it out again.
Disclaimer: I don't really know what I'm doing here... or what exactly the effect of isMeta is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also wondering whether we could directly reuse IMPLICIT_DEF, as it's basically the same thing. We normally remove IMPLICIT_DEF and replace it with undef flags, but it would probably be fine to reintroduce it afterwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, I tried reusing IMPLICIT_DEF but based on the resulting RISCV test diffs at least that doesn't work (I saw some vrgather.vi
with equal registers with that change).
The Init Undef Pass when I rewrote it was designed so that it would only ever support the Architecture's that had the Architecture Specific instructions, with this change, have you checked to see if other architectures could support this? Currently with the changes you have submitted, it would continue to only support existing architectures. It may be as simple as removing the check to |
@Stylie777 Yes, that's basically the plan. This patch removes one of the target hooks needed by this pass, but there is another one (getLargestSuperClass). |
The test change seem OK to me. Is the llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td failure real? I think all the ID's will have changed again with the new generic pseudo. They have slowly been being converted to regexs. |
The InitUndef pass currently uses target-specific pseudo instructions, with one pseudo per register class. Instead, add a generic pseudo instruction, which can be used by all targets and register classes.
dcc82f5
to
1c3f59a
Compare
Looks like I'm lucky and only a single value changed... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are no other comments then this LGTM.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/137/builds/4681 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/175/builds/4638 Here is the relevant piece of the build log for the reference
|
Trivial fix, some instruction opcodes changed.
Starting with this commit I'm hitting an assert running tablegen for AArch64:
|
@llvm-beanz I can't reproduce that assertion, and I've not seen it on any buildbots either :/ |
The InitUndef pass currently uses the getLargestSuperClass() hook (which is only used by that pass) to chose the register to initialize. This was done to reduce the number of undef init pseudos needed, e.g. so that the vrnov0 regclass would use the same pseudo as v0. After #106744 we use a single generic pseudo, so this is no longer necessary.
The InitUndef pass works around a register allocation issue, where undef operands can be allocated to the same register as early-clobber result operands. This may lead to ISA constraint violations, where certain input and output registers are not allowed to overlap. Originally this pass was implemented for RISCV, and then extended to ARM in llvm#77770. I've since removed the target-specific parts of the pass in llvm#106744 and llvm#107885. This PR now enables the pass for all targets. The motivating case is the one in arm64-ldxr-stxr.ll for the AArch64 target, where we were previously incorrectly allocating a stxp input and output to the same register.
The InitUndef pass works around a register allocation issue, where undef operands can be allocated to the same register as early-clobber result operands. This may lead to ISA constraint violations, where certain input and output registers are not allowed to overlap. Originally this pass was implemented for RISCV, and then extended to ARM in #77770. I've since removed the target-specific parts of the pass in #106744 and #107885. This PR reduces the pass to use a single requiresDisjointEarlyClobberAndUndef() target hook and enables it by default. The hook is disabled for AMDGPU, because overlapping early-clobber and undef operands are known to be safe for that target, and we get significant codegen diffs otherwise. The motivating case is the one in arm64-ldxr-stxr.ll, where we were previously incorrectly allocating a stxp input and output to the same register.
The InitUndef pass currently uses target-specific pseudo instructions, with one pseudo per register class.
Instead, add a generic pseudo instruction, which can be used by all targets and register classes.