-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[TableGen][GISel] Create untyped registers during instruction selection #121270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[TableGen][GISel] Create untyped registers during instruction selection #121270
Conversation
@llvm/pr-subscribers-tablegen @llvm/pr-subscribers-llvm-regalloc Author: Sergei Barannikov (s-barannikov) ChangesTemporary registers are used for linking instructions in the destination DAG of a pattern, and also for discarded defs. Previously, temporary registers were created without a register class/bank, but with a type. This patch removes the type as well. The type shouldn't matter for GlobalISel; registers created during instruction selection should be virtual (as opposed to generic). Virtual registers must have a register class, it will be inferred when constraining operands of selected instructions. GIR_MakeTempReg action was split into two: GIR_MakeGenericTempReg for use in generic instruction combining, and GIR_MakeVirtualTempReg for use in instruction selection. The latter creates an "incomplete" virtual register (one without a type / regclass / regbank); further actions such as GIR_ConstrainSelectedInstOperands should make sure a register gets a register class.
This change allows importing patterns that have types in the destination DAG that cannot be converted to LLT (such as The number of skipped patterns reduces as follows:
Patch is 56.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121270.diff 26 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
index b4ff4cd178d757..e254edeb178c03 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
@@ -550,10 +550,14 @@ enum {
/// Combines both a GIR_EraseFromParent 0 + GIR_Done
GIR_EraseRootFromParent_Done,
- /// Create a new temporary register that's not constrained.
+ /// Create a new generic temporary register that's not constrained.
/// - TempRegID(ULEB128) - The temporary register ID to initialize.
/// - Ty(1) - Expected type
- GIR_MakeTempReg,
+ GIR_MakeGenericTempReg,
+
+ /// Create a new virtual temporary register that doesn't have register class.
+ /// - TempRegID(ULEB128) - The temporary register ID to initialize.
+ GIR_MakeVirtualTempReg,
/// Replaces all references to a register from an instruction
/// with another register from another instruction.
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
index 2c57f2b5aa029c..efb5eaf77447dc 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
@@ -1485,15 +1485,28 @@ bool GIMatchTableExecutor::executeMatchTable(
propagateFlags();
return true;
}
- case GIR_MakeTempReg: {
+ case GIR_MakeGenericTempReg: {
uint64_t TempRegID = readULEB();
int TypeID = readS8();
State.TempRegisters[TempRegID] =
MRI.createGenericVirtualRegister(getTypeFromIdx(TypeID));
- DEBUG_WITH_TYPE(TgtExecutor::getName(),
- dbgs() << CurrentIdx << ": TempRegs[" << TempRegID
- << "] = GIR_MakeTempReg(" << TypeID << ")\n");
+ DEBUG_WITH_TYPE(TgtExecutor::getName(), {
+ dbgs() << CurrentIdx << ": TempRegs[" << TempRegID
+ << "] = GIR_MakeGenericTempReg(" << TypeID << ")\n";
+ });
+ break;
+ }
+ case GIR_MakeVirtualTempReg: {
+ uint64_t TempRegID = readULEB();
+
+ Register Reg = MRI.createIncompleteVirtualRegister();
+ MRI.noteNewVirtualRegister(Reg);
+ State.TempRegisters[TempRegID] = Reg;
+ DEBUG_WITH_TYPE(TgtExecutor::getName(), {
+ dbgs() << CurrentIdx << ": TempRegs[" << TempRegID
+ << "] = GIR_MakeVirtualTempReg()\n";
+ });
break;
}
case GIR_ReplaceReg: {
diff --git a/llvm/lib/CodeGen/RegisterBankInfo.cpp b/llvm/lib/CodeGen/RegisterBankInfo.cpp
index e1720b038e2361..89edaaf5cd56e3 100644
--- a/llvm/lib/CodeGen/RegisterBankInfo.cpp
+++ b/llvm/lib/CodeGen/RegisterBankInfo.cpp
@@ -134,10 +134,10 @@ const TargetRegisterClass *RegisterBankInfo::constrainGenericRegister(
// If the register already has a class, fallback to MRI::constrainRegClass.
auto &RegClassOrBank = MRI.getRegClassOrRegBank(Reg);
- if (isa<const TargetRegisterClass *>(RegClassOrBank))
+ if (isa_and_nonnull<const TargetRegisterClass *>(RegClassOrBank))
return MRI.constrainRegClass(Reg, &RC);
- const RegisterBank *RB = cast<const RegisterBank *>(RegClassOrBank);
+ const auto *RB = dyn_cast_or_null<const RegisterBank *>(RegClassOrBank);
// Otherwise, all we can do is ensure the bank covers the class, and set it.
if (RB && !RB->covers(RC))
return nullptr;
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index af62623ece6ab6..f25adafb845001 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -175,11 +175,8 @@ Printable printRegClassOrBank(Register Reg, const MachineRegisterInfo &RegInfo,
OS << StringRef(TRI->getRegClassName(RegInfo.getRegClass(Reg))).lower();
else if (RegInfo.getRegBankOrNull(Reg))
OS << StringRef(RegInfo.getRegBankOrNull(Reg)->getName()).lower();
- else {
+ else
OS << "_";
- assert((RegInfo.def_empty(Reg) || RegInfo.getType(Reg).isValid()) &&
- "Generic registers must have a valid type");
- }
});
}
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 704435dad65d7b..4aa793c2b0ff10 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3708,10 +3708,10 @@ const TargetRegisterClass *
SIRegisterInfo::getConstrainedRegClassForOperand(const MachineOperand &MO,
const MachineRegisterInfo &MRI) const {
const RegClassOrRegBank &RCOrRB = MRI.getRegClassOrRegBank(MO.getReg());
- if (const RegisterBank *RB = dyn_cast<const RegisterBank *>(RCOrRB))
+ if (const auto *RB = dyn_cast_or_null<const RegisterBank *>(RCOrRB))
return getRegClassForTypeOnBank(MRI.getType(MO.getReg()), *RB);
- if (const auto *RC = dyn_cast<const TargetRegisterClass *>(RCOrRB))
+ if (const auto *RC = dyn_cast_or_null<const TargetRegisterClass *>(RCOrRB))
return getAllocatableClass(RC);
return nullptr;
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir
index acda00231ec612..95a7a8105fd3f3 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir
@@ -679,8 +679,8 @@ body: |
; SI-NEXT: {{ $}}
; SI-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; SI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; SI-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
;
; VI-LABEL: name: fneg_fabs_s32_vs
@@ -688,8 +688,8 @@ body: |
; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; VI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
;
; GFX9-LABEL: name: fneg_fabs_s32_vs
@@ -697,8 +697,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; GFX9-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
;
; GFX10-LABEL: name: fneg_fabs_s32_vs
@@ -706,8 +706,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; GFX10-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s32) = G_FABS %0
@@ -978,8 +978,8 @@ body: |
; SI-NEXT: {{ $}}
; SI-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; SI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; SI-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
;
; VI-LABEL: name: fneg_fabs_v2s16_vs
@@ -987,8 +987,8 @@ body: |
; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; VI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; VI-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
;
; GFX9-LABEL: name: fneg_fabs_v2s16_vs
@@ -996,8 +996,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; GFX9-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; GFX9-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
;
; GFX10-LABEL: name: fneg_fabs_v2s16_vs
@@ -1005,8 +1005,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; GFX10-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; GFX10-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
%0:sgpr(<2 x s16>) = COPY $sgpr0
%1:vgpr(<2 x s16>) = G_FABS %0
@@ -1148,11 +1148,11 @@ body: |
; SI-NEXT: {{ $}}
; SI-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; SI-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; SI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; SI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; SI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; SI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; SI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; SI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; SI-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
;
; VI-LABEL: name: fneg_fabs_s64_vs
@@ -1160,11 +1160,11 @@ body: |
; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; VI-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; VI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; VI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; VI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; VI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; VI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; VI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; VI-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
;
; GFX9-LABEL: name: fneg_fabs_s64_vs
@@ -1172,11 +1172,11 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
;
; GFX10-LABEL: name: fneg_fabs_s64_vs
@@ -1184,11 +1184,11 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; GFX10-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
%0:sgpr(s64) = COPY $sgpr0_sgpr1
%1:vgpr(s64) = G_FABS %0
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td
index 9c9b39027f8f95..e60ae2d0f940f5 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td
@@ -48,7 +48,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // MIs[1] y
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: GIM_CheckIsSafeToFold, /*NumInsns*/1,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: // Combiner Rule #1: ReplaceTemp
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_UNMERGE_VALUES),
// CHECK-NEXT: GIR_RootToRootCopy, /*OpIdx*/0, // a
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td
index e0b802447ea2a9..5644ffbe68e498 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td
@@ -73,7 +73,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: // MIs[0] Operand 1
// CHECK-NEXT: GIM_CheckConstantInt8, /*MI*/0, /*Op*/1, 0,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: GIR_BuildConstant, /*TempRegID*/0, /*Val*/GIMT_Encode8(0),
// CHECK-NEXT: // Combiner Rule #1: InstTest1
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::COPY),
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
index 365d0c9fbff494..562c56086a63df 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
@@ -41,7 +41,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: // MIs[0] Operand 2
// CHECK-NEXT: GIM_CheckConstantInt8, /*MI*/0, /*Op*/2, 0,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: // Combiner Rule #0: IntrinTest0
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_INTRINSIC),
// CHECK-NEXT: GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
@@ -62,7 +62,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: // MIs[0] b
// CHECK-NEXT: // No operand predicates
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: // Combiner Rule #1: SpecialIntrins
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_INTRINSIC_CONVERGENT),
// CHECK-NEXT: GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td
index a23b54afb51252..f2ba0929d01488 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td
@@ -33,7 +33,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // MIs[1] b
// CHECK-NEXT: GIM_CheckIsSameOperandIgnoreCopies, /*MI*/1, /*OpIdx*/1, /*OtherMI*/0, /*OtherOpIdx*/1,
// CHECK-NEXT: GIM_CheckIsSafeToFold, /*NumInsns*/1,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s64,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s64,
// CHECK-NEXT: // Combiner Rule #0: InstTest0
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_ADD),
// CHECK-NEXT: ...
[truncated]
|
@llvm/pr-subscribers-backend-amdgpu Author: Sergei Barannikov (s-barannikov) ChangesTemporary registers are used for linking instructions in the destination DAG of a pattern, and also for discarded defs. Previously, temporary registers were created without a register class/bank, but with a type. This patch removes the type as well. The type shouldn't matter for GlobalISel; registers created during instruction selection should be virtual (as opposed to generic). Virtual registers must have a register class, it will be inferred when constraining operands of selected instructions. GIR_MakeTempReg action was split into two: GIR_MakeGenericTempReg for use in generic instruction combining, and GIR_MakeVirtualTempReg for use in instruction selection. The latter creates an "incomplete" virtual register (one without a type / regclass / regbank); further actions such as GIR_ConstrainSelectedInstOperands should make sure a register gets a register class.
This change allows importing patterns that have types in the destination DAG that cannot be converted to LLT (such as The number of skipped patterns reduces as follows:
Patch is 56.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121270.diff 26 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
index b4ff4cd178d757..e254edeb178c03 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
@@ -550,10 +550,14 @@ enum {
/// Combines both a GIR_EraseFromParent 0 + GIR_Done
GIR_EraseRootFromParent_Done,
- /// Create a new temporary register that's not constrained.
+ /// Create a new generic temporary register that's not constrained.
/// - TempRegID(ULEB128) - The temporary register ID to initialize.
/// - Ty(1) - Expected type
- GIR_MakeTempReg,
+ GIR_MakeGenericTempReg,
+
+ /// Create a new virtual temporary register that doesn't have register class.
+ /// - TempRegID(ULEB128) - The temporary register ID to initialize.
+ GIR_MakeVirtualTempReg,
/// Replaces all references to a register from an instruction
/// with another register from another instruction.
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
index 2c57f2b5aa029c..efb5eaf77447dc 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
@@ -1485,15 +1485,28 @@ bool GIMatchTableExecutor::executeMatchTable(
propagateFlags();
return true;
}
- case GIR_MakeTempReg: {
+ case GIR_MakeGenericTempReg: {
uint64_t TempRegID = readULEB();
int TypeID = readS8();
State.TempRegisters[TempRegID] =
MRI.createGenericVirtualRegister(getTypeFromIdx(TypeID));
- DEBUG_WITH_TYPE(TgtExecutor::getName(),
- dbgs() << CurrentIdx << ": TempRegs[" << TempRegID
- << "] = GIR_MakeTempReg(" << TypeID << ")\n");
+ DEBUG_WITH_TYPE(TgtExecutor::getName(), {
+ dbgs() << CurrentIdx << ": TempRegs[" << TempRegID
+ << "] = GIR_MakeGenericTempReg(" << TypeID << ")\n";
+ });
+ break;
+ }
+ case GIR_MakeVirtualTempReg: {
+ uint64_t TempRegID = readULEB();
+
+ Register Reg = MRI.createIncompleteVirtualRegister();
+ MRI.noteNewVirtualRegister(Reg);
+ State.TempRegisters[TempRegID] = Reg;
+ DEBUG_WITH_TYPE(TgtExecutor::getName(), {
+ dbgs() << CurrentIdx << ": TempRegs[" << TempRegID
+ << "] = GIR_MakeVirtualTempReg()\n";
+ });
break;
}
case GIR_ReplaceReg: {
diff --git a/llvm/lib/CodeGen/RegisterBankInfo.cpp b/llvm/lib/CodeGen/RegisterBankInfo.cpp
index e1720b038e2361..89edaaf5cd56e3 100644
--- a/llvm/lib/CodeGen/RegisterBankInfo.cpp
+++ b/llvm/lib/CodeGen/RegisterBankInfo.cpp
@@ -134,10 +134,10 @@ const TargetRegisterClass *RegisterBankInfo::constrainGenericRegister(
// If the register already has a class, fallback to MRI::constrainRegClass.
auto &RegClassOrBank = MRI.getRegClassOrRegBank(Reg);
- if (isa<const TargetRegisterClass *>(RegClassOrBank))
+ if (isa_and_nonnull<const TargetRegisterClass *>(RegClassOrBank))
return MRI.constrainRegClass(Reg, &RC);
- const RegisterBank *RB = cast<const RegisterBank *>(RegClassOrBank);
+ const auto *RB = dyn_cast_or_null<const RegisterBank *>(RegClassOrBank);
// Otherwise, all we can do is ensure the bank covers the class, and set it.
if (RB && !RB->covers(RC))
return nullptr;
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index af62623ece6ab6..f25adafb845001 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -175,11 +175,8 @@ Printable printRegClassOrBank(Register Reg, const MachineRegisterInfo &RegInfo,
OS << StringRef(TRI->getRegClassName(RegInfo.getRegClass(Reg))).lower();
else if (RegInfo.getRegBankOrNull(Reg))
OS << StringRef(RegInfo.getRegBankOrNull(Reg)->getName()).lower();
- else {
+ else
OS << "_";
- assert((RegInfo.def_empty(Reg) || RegInfo.getType(Reg).isValid()) &&
- "Generic registers must have a valid type");
- }
});
}
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 704435dad65d7b..4aa793c2b0ff10 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3708,10 +3708,10 @@ const TargetRegisterClass *
SIRegisterInfo::getConstrainedRegClassForOperand(const MachineOperand &MO,
const MachineRegisterInfo &MRI) const {
const RegClassOrRegBank &RCOrRB = MRI.getRegClassOrRegBank(MO.getReg());
- if (const RegisterBank *RB = dyn_cast<const RegisterBank *>(RCOrRB))
+ if (const auto *RB = dyn_cast_or_null<const RegisterBank *>(RCOrRB))
return getRegClassForTypeOnBank(MRI.getType(MO.getReg()), *RB);
- if (const auto *RC = dyn_cast<const TargetRegisterClass *>(RCOrRB))
+ if (const auto *RC = dyn_cast_or_null<const TargetRegisterClass *>(RCOrRB))
return getAllocatableClass(RC);
return nullptr;
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir
index acda00231ec612..95a7a8105fd3f3 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir
@@ -679,8 +679,8 @@ body: |
; SI-NEXT: {{ $}}
; SI-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; SI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; SI-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
;
; VI-LABEL: name: fneg_fabs_s32_vs
@@ -688,8 +688,8 @@ body: |
; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; VI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
;
; GFX9-LABEL: name: fneg_fabs_s32_vs
@@ -697,8 +697,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; GFX9-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
;
; GFX10-LABEL: name: fneg_fabs_s32_vs
@@ -706,8 +706,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
; GFX10-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]]
- ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648
- ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec
+ ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](s32), implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_XOR_B32_e64_]](s32)
%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s32) = G_FABS %0
@@ -978,8 +978,8 @@ body: |
; SI-NEXT: {{ $}}
; SI-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; SI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; SI-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
;
; VI-LABEL: name: fneg_fabs_v2s16_vs
@@ -987,8 +987,8 @@ body: |
; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; VI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; VI-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
;
; GFX9-LABEL: name: fneg_fabs_v2s16_vs
@@ -996,8 +996,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; GFX9-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; GFX9-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
;
; GFX10-LABEL: name: fneg_fabs_v2s16_vs
@@ -1005,8 +1005,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
; GFX10-NEXT: [[FABS:%[0-9]+]]:vgpr_32(<2 x s16>) = G_FABS [[COPY]]
- ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147516416
- ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](<2 x s16>), implicit $exec
+ ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147516416
+ ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(<2 x s16>) = V_XOR_B32_e64 [[S_MOV_B32_]], [[FABS]](<2 x s16>), implicit $exec
; GFX10-NEXT: $vgpr0 = COPY [[V_XOR_B32_e64_]](<2 x s16>)
%0:sgpr(<2 x s16>) = COPY $sgpr0
%1:vgpr(<2 x s16>) = G_FABS %0
@@ -1148,11 +1148,11 @@ body: |
; SI-NEXT: {{ $}}
; SI-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; SI-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; SI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; SI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; SI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; SI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; SI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; SI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; SI-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
;
; VI-LABEL: name: fneg_fabs_s64_vs
@@ -1160,11 +1160,11 @@ body: |
; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; VI-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; VI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; VI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; VI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; VI-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; VI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; VI-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; VI-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
;
; GFX9-LABEL: name: fneg_fabs_s64_vs
@@ -1172,11 +1172,11 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; GFX9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX9-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
;
; GFX10-LABEL: name: fneg_fabs_s64_vs
@@ -1184,11 +1184,11 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[FABS:%[0-9]+]]:vreg_64(s64) = G_FABS [[COPY]]
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub1(s64)
- ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s32) = S_MOV_B32 2147483648
- ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s16) = V_XOR_B32_e64 [[S_MOV_B32_]](s32), [[COPY1]](s32), implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32(s32) = COPY [[FABS]].sub0(s64)
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]](s32), %subreg.sub0, [[V_XOR_B32_e64_]](s16), %subreg.sub1
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub1(s64)
+ ; GFX10-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648
+ ; GFX10-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY1]], implicit $exec
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[FABS]].sub0(s64)
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64(s64) = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[V_XOR_B32_e64_]], %subreg.sub1
; GFX10-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]](s64)
%0:sgpr(s64) = COPY $sgpr0_sgpr1
%1:vgpr(s64) = G_FABS %0
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td
index 9c9b39027f8f95..e60ae2d0f940f5 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td
@@ -48,7 +48,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // MIs[1] y
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: GIM_CheckIsSafeToFold, /*NumInsns*/1,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: // Combiner Rule #1: ReplaceTemp
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_UNMERGE_VALUES),
// CHECK-NEXT: GIR_RootToRootCopy, /*OpIdx*/0, // a
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td
index e0b802447ea2a9..5644ffbe68e498 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td
@@ -73,7 +73,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: // MIs[0] Operand 1
// CHECK-NEXT: GIM_CheckConstantInt8, /*MI*/0, /*Op*/1, 0,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: GIR_BuildConstant, /*TempRegID*/0, /*Val*/GIMT_Encode8(0),
// CHECK-NEXT: // Combiner Rule #1: InstTest1
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::COPY),
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
index 365d0c9fbff494..562c56086a63df 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
@@ -41,7 +41,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: // MIs[0] Operand 2
// CHECK-NEXT: GIM_CheckConstantInt8, /*MI*/0, /*Op*/2, 0,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: // Combiner Rule #0: IntrinTest0
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_INTRINSIC),
// CHECK-NEXT: GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
@@ -62,7 +62,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: // MIs[0] b
// CHECK-NEXT: // No operand predicates
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
// CHECK-NEXT: // Combiner Rule #1: SpecialIntrins
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_INTRINSIC_CONVERGENT),
// CHECK-NEXT: GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td
index a23b54afb51252..f2ba0929d01488 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td
@@ -33,7 +33,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // MIs[1] b
// CHECK-NEXT: GIM_CheckIsSameOperandIgnoreCopies, /*MI*/1, /*OpIdx*/1, /*OtherMI*/0, /*OtherOpIdx*/1,
// CHECK-NEXT: GIM_CheckIsSafeToFold, /*NumInsns*/1,
-// CHECK-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s64,
+// CHECK-NEXT: GIR_MakeGenericTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s64,
// CHECK-NEXT: // Combiner Rule #0: InstTest0
// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_ADD),
// CHECK-NEXT: ...
[truncated]
|
I tried to justify the change, but I'm not entirely sure I'm doing the right thing. |
@@ -679,35 +679,35 @@ body: | | |||
; SI-NEXT: {{ $}} | |||
; SI-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 | |||
; SI-NEXT: [[FABS:%[0-9]+]]:vgpr_32(s32) = G_FABS [[COPY]] | |||
; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32(s16) = S_MOV_B32 2147483648 | |||
; SI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32(s32) = V_XOR_B32_e64 [[S_MOV_B32_]](s16), [[FABS]](s32), implicit $exec | |||
; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 2147483648 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is somehow related to G_FABS
not being selected. I didn't dig deep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output is the same, the type is just dropped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output is the same, the type is just dropped?
Yes. I'll analyze it further.
Temporary registers are used for linking instructions in the destination DAG of a pattern, and also for discarded defs. Previously, temporary registers were created without a register class/bank, but with a type. This patch removes the type as well. The type shouldn't matter for GlobalISel; registers created during instruction selection should be virtual (as opposed to generic). Virtual registers must have a register class, it will be inferred when constraining operands of selected instructions. GIR_MakeTempReg action was split into two: GIR_MakeGenericTempReg for use in generic instruction combining, and GIR_MakeVirtualTempReg for use in instruction selection. The latter creates an "incomplete" virtual register (one without a type / regclass / regbank); further actions such as GIR_ConstrainSelectedInstOperands should make sure a register gets a register class. `TargetRegisterInfo.cpp` was changed to allow printing such "incomplete" registers. `RegisterBankInfo.cpp` and `SIRegisterInfo.cpp` were changed to support `RegClassOrRegBank` with active `TargetRegisterClass` member that has null value. This change allows importing patterns that have types in the destination DAG that cannot be converted to LLT (such as `MVT::Untyped`) and removes the restriction that interior instructions must have one explicit def. The number of skipped patterns reduces as follows: ``` AArch64 8574 -> 8505 (-69) Mips 1212 -> 1211 (-1) ```
1376e99
to
59cbe27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type shouldn't matter for GlobalISel; registers created during instruction selection should be virtual (as opposed to generic).
Generic virtual registers are still virtual registers.
I'm not sure the type is entirely useless during selection. computeKnownBits / simplifyDemandedBits will be able to understand a virtual register that still has a type but will give up on a value without one in a partially selected function.
case GIR_MakeVirtualTempReg: { | ||
uint64_t TempRegID = readULEB(); | ||
|
||
Register Reg = MRI.createIncompleteVirtualRegister(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the class get set here? I'd expect any create action to carry the class at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class is set by constrainOperandRegClass()
called when executing a subsequent GIR_ConstrainOperandRC
or GIR_[Root]ConstrainSelectedInstOperands
action. This function doesn't seem to query a virtual register's type, but it calls a few TII / TRI hooks (e.g. TRI.getConstrainedRegClassForOperand()
) that might in theory do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general that would need the type. I think it's a bad idea to let this be so far removed, and tablegen needs to supply the class to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tablegen needs to supply the class to use
This doesn't seem possible ATM. Producer and consumer of a virtual register might not agree on register class, in which case constrainOperandRegClass
would split it into two virtual registers and insert a cross-class copy. We would need to lift this machinery to tablegen level. I'm not sure this is what we want or if it's even possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to provide the register class for temporaries right when creating them, it will need larger refactoring.
Specifically, we will need to provide the desired register class(-es) to node importing functions, or create temporaries in advance and pass them to these functions. This is necessary for leaf nodes that don't know the resulting register class, and may also be useful for instruction nodes so that we can generate a cross-class copy(-es) if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused.
An additional copy is only needed if getCommonSubClass for the producer and consumer can't find anything. I guess we could handle that in the selector implementation
If we delay emitting this copy, then the register's class would be invalid for the consumer and we shouldn't make decisions based on invalid register class either.
this doesn't work for instructions that do not have concrete register class constraints
This is new to me. How can instruction's register operand not have a concrete register class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear I am 100% against passing around incomplete virtual registers. Incomplete virtual registers are an implementation detail of virtual registers that are not part of the contract of valid MIR that code should have to reason about.
This is new to me. How can instruction's register operand not have a concrete register class?
This is all generic instructions like COPY and IMPLICIT_DEF, but targets can do the same with their own pseudos. This is part of why the tablegen forms of EXTRACT_SUBREG need an explicit register class operand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear I am 100% against passing around incomplete virtual registers. Incomplete virtual registers are an implementation detail of virtual registers that are not part of the contract of valid MIR that code should have to reason about.
I see your point and I can agree with it.
If we delay emitting this copy, then the register's class would be invalid for the consumer and we shouldn't make decisions based on invalid register class either.
I'm not comfortable with creating registers with a class that is only valid for the producer. Why do you think fixing the register class by emitting a COPY is better left to instruction selector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you think fixing the register class by emitting a COPY is better left to instruction selector?
I don't, ideally tablegen would fully understand the set of instructions that need to be emitted here and explicitly handle it. But as far as a hack to make progress goes, I think that's better than using incomplete registers if it's hard to get tablegen into a state where that is possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to do that at tablegen level. The changes are going to be non-local and I'll probably need to split them into several smaller patches; I'll keep this one as a draft.
@@ -3708,10 +3708,10 @@ const TargetRegisterClass * | |||
SIRegisterInfo::getConstrainedRegClassForOperand(const MachineOperand &MO, | |||
const MachineRegisterInfo &MRI) const { | |||
const RegClassOrRegBank &RCOrRB = MRI.getRegClassOrRegBank(MO.getReg()); | |||
if (const RegisterBank *RB = dyn_cast<const RegisterBank *>(RCOrRB)) | |||
if (const auto *RB = dyn_cast_or_null<const RegisterBank *>(RCOrRB)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look like the consequences of not immediately setting the register class, and now other code has to be more tolerant of incomplete virtual registers. This is bad, the register should be created with a class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is due to a peculiar property (bug?) of PointerUnion and not strictly related to this change.
When creating a generic virtual register without bank/class using createGenericVirtualRegister()
(this is what instruction selector currently does), the union's active member is set to RegisterBank
for some reason. The first dyn_cast
returns nullptr even though its argument is effectively a nullptr. The second dyn_cast
also returns nullptr because the union doesn't hold TargetRegisterClass.
When using createIncompleteVirtualRegister()
, the active member is set to RegisterClass
(the union is default constructed). The first dyn_cast
returns nullptr because the union doesn't hold RegisterBank, as expected. The second dyn_cast
, however, asserts, because the argument is a null pointer.
That is, depending on which union member is active, dyn_cast
either allows or disallows null pointer argument.
I wanted to make a distinction between virtual registers with and without a type. Probably not the best wording.
This is a good point, I didn't consider these functions.
|
Non-generic virtual register? Selected virtual register?
Yes, @topperc encountered such a case recently. Although if you actually hit the selected instruction, we probably can't analyze it in any useful way. You could, but it's unlikely it will be handled by the implementation.
Untyped is fine, but it should be a complete register with a register class. |
# Conflicts: # llvm/lib/CodeGen/RegisterBankInfo.cpp # llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
Temporary registers are used for linking instructions in the destination DAG of a pattern, and also for discarded defs. Previously, temporary registers were created without a register class/bank, but with a type. This patch removes the type as well.
The type shouldn't matter for GlobalISel; registers created during instruction selection should be virtual (as opposed to generic). Virtual registers must have a register class, it will be inferred when constraining operands of selected instructions.
GIR_MakeTempReg action was split into two: GIR_MakeGenericTempReg for use in generic instruction combining, and GIR_MakeVirtualTempReg for use in instruction selection. The latter creates an "incomplete" virtual register (one without a type / regclass / regbank); further actions such as GIR_ConstrainSelectedInstOperands should make sure a register gets a register class.
TargetRegisterInfo.cpp
was changed to allow printing such "incomplete" registers.RegisterBankInfo.cpp
andSIRegisterInfo.cpp
were changed to supportRegClassOrRegBank
with activeTargetRegisterClass
member that has null value.This change allows importing patterns that have types in the destination DAG that cannot be converted to LLT (such as
MVT::Untyped
) and removes the restriction that interior instructions must have one explicit def.The number of skipped patterns reduces as follows: