[VPlan] Add opcode to create step for wide inductions. #119284
Conversation
@llvm/pr-subscribers-backend-powerpc @llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

This patch adds a WideIVStep opcode that can be used to create a vector with the steps to increment a wide induction. The opcode has 3 operands:
* the vector step
* the scale of the vector step
* a constant indicating the target type of the VPInstruction (this is working around having explicit types for VPInstructions; we could also introduce a dedicated recipe, at the cost of a lot more scaffolding)

The opcode is later converted into a sequence of recipes that convert the scale and step to the target type, if needed, and then multiply the vector step by the scale. This simplifies code that needs to materialize step vectors, e.g. replacing wide IVs as a follow-up to #108378 with an increment of the wide IV step.

Full diff: https://github.com/llvm/llvm-project/pull/119284.diff

4 Files Affected:
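As a rough illustration of the lowering described above, here is a toy scalar model (plain C++, not the LLVM implementation; the helper name and recipe strings are invented for illustration) of what gets emitted for a WideIVStep: a cast when the step type differs from the induction type, and a multiply unless the scale is the constant 1:

```cpp
#include <string>
#include <vector>

// Toy model of the WideIVStep lowering sketched in the description.
// Not LLVM code: it just records which recipes would be emitted and the
// resulting per-lane step value.
struct LoweredStep {
  std::vector<std::string> Recipes; // opcodes emitted, in order
  double Value;                     // resulting per-lane step value
};

LoweredStep lowerWideIVStep(double Step, bool StepTyMatchesIVTy,
                            bool IVTyIsFloat, double Scale) {
  LoweredStep Out{{}, Step};
  // Cast the step to the induction type if needed: UIToFP for float
  // inductions, Trunc otherwise (mirroring the diff below).
  if (!StepTyMatchesIVTy)
    Out.Recipes.push_back(IVTyIsFloat ? "uitofp" : "trunc");
  // A constant scale of 1 needs no multiply at all.
  if (Scale != 1) {
    Out.Recipes.push_back(IVTyIsFloat ? "fmul" : "mul");
    Out.Value = Step * Scale;
  }
  return Out;
}
```

For example, a step whose type differs from an integer induction type with scale 4 would yield a trunc followed by a mul, while matching types with scale 1 emit nothing.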
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index e1d828f038f9a2..d3cd1a9b128048 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1220,6 +1220,7 @@ class VPInstruction : public VPRecipeWithIRFlags,
CalculateTripCountMinusVF,
// Increment the canonical IV separately for each unrolled part.
CanonicalIVIncrementForPart,
+ WideIVStep,
BranchOnCount,
BranchOnCond,
ComputeReductionResult,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 5903ad29af7602..8bf9b5194932b4 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -661,7 +661,8 @@ bool VPInstruction::isFPMathOp() const {
return Opcode == Instruction::FAdd || Opcode == Instruction::FMul ||
Opcode == Instruction::FNeg || Opcode == Instruction::FSub ||
Opcode == Instruction::FDiv || Opcode == Instruction::FRem ||
- Opcode == Instruction::FCmp || Opcode == Instruction::Select;
+ Opcode == Instruction::FCmp || Opcode == Instruction::Select ||
+ Opcode == VPInstruction::WideIVStep;
}
#endif
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 6d77173735c9b8..41ed8b65b00fd6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -1821,20 +1821,61 @@ void VPlanTransforms::createInterleaveGroups(
}
void VPlanTransforms::convertToConcreteRecipes(VPlan &Plan) {
+ Type *CanonicalIVType = Plan.getCanonicalIV()->getScalarType();
+ VPTypeAnalysis TypeInfo(CanonicalIVType);
+
for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
vp_depth_first_deep(Plan.getEntry()))) {
- for (VPRecipeBase &R : make_early_inc_range(VPBB->phis())) {
- if (!isa<VPCanonicalIVPHIRecipe, VPEVLBasedIVPHIRecipe>(&R))
+ for (VPRecipeBase &R : make_early_inc_range(*VPBB)) {
+ if (isa<VPCanonicalIVPHIRecipe, VPEVLBasedIVPHIRecipe>(&R)) {
+ auto *PhiR = cast<VPHeaderPHIRecipe>(&R);
+ StringRef Name =
+ isa<VPCanonicalIVPHIRecipe>(PhiR) ? "index" : "evl.based.iv";
+ auto *ScalarR = new VPScalarPHIRecipe(PhiR->getStartValue(),
+ PhiR->getBackedgeValue(),
+ PhiR->getDebugLoc(), Name);
+ ScalarR->insertBefore(PhiR);
+ PhiR->replaceAllUsesWith(ScalarR);
+ PhiR->eraseFromParent();
continue;
- auto *PhiR = cast<VPHeaderPHIRecipe>(&R);
- StringRef Name =
- isa<VPCanonicalIVPHIRecipe>(PhiR) ? "index" : "evl.based.iv";
- auto *ScalarR =
- new VPScalarPHIRecipe(PhiR->getStartValue(), PhiR->getBackedgeValue(),
- PhiR->getDebugLoc(), Name);
- ScalarR->insertBefore(PhiR);
- PhiR->replaceAllUsesWith(ScalarR);
- PhiR->eraseFromParent();
+ }
+
+ auto *VPI = dyn_cast<VPInstruction>(&R);
+ if (VPI && VPI->getOpcode() == VPInstruction::WideIVStep) {
+ VPBuilder Builder(VPI->getParent(), VPI->getIterator());
+ VPValue *VectorStep = VPI->getOperand(0);
+ Type *IVTy = TypeInfo.inferScalarType(VPI->getOperand(2));
+ if (TypeInfo.inferScalarType(VectorStep) != IVTy) {
+ Instruction::CastOps CastOp = IVTy->isFloatingPointTy()
+ ? Instruction::UIToFP
+ : Instruction::Trunc;
+ VectorStep = Builder.createWidenCast(CastOp, VectorStep, IVTy);
+ }
+
+ VPValue *ScalarStep = VPI->getOperand(1);
+ auto *ConstStep =
+ ScalarStep->isLiveIn()
+ ? dyn_cast<ConstantInt>(ScalarStep->getLiveInIRValue())
+ : nullptr;
+ if (!ConstStep || ConstStep->getValue() != 1) {
+ if (TypeInfo.inferScalarType(ScalarStep) != IVTy) {
+ ScalarStep =
+ Builder.createWidenCast(Instruction::Trunc, ScalarStep, IVTy);
+ }
+
+ std::optional<FastMathFlags> FMFs;
+ if (IVTy->isFloatingPointTy())
+ FMFs = VPI->getFastMathFlags();
+
+ unsigned MulOpc =
+ IVTy->isFloatingPointTy() ? Instruction::FMul : Instruction::Mul;
+ VPInstruction *Mul = Builder.createNaryOp(
+ MulOpc, {VectorStep, ScalarStep}, FMFs, R.getDebugLoc());
+ VectorStep = Mul;
+ }
+ VPI->replaceAllUsesWith(VectorStep);
+ VPI->eraseFromParent();
+ }
}
}
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
index ff6c9295ee2057..7c1bb98c1a021f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
@@ -155,33 +155,15 @@ void UnrollState::unrollWidenInductionByUF(
if (isa_and_present<FPMathOperator>(ID.getInductionBinOp()))
FMFs = ID.getInductionBinOp()->getFastMathFlags();
- VPValue *VectorStep = &Plan.getVF();
- VPBuilder Builder(PH);
- if (TypeInfo.inferScalarType(VectorStep) != IVTy) {
- Instruction::CastOps CastOp =
- IVTy->isFloatingPointTy() ? Instruction::UIToFP : Instruction::Trunc;
- VectorStep = Builder.createWidenCast(CastOp, VectorStep, IVTy);
- ToSkip.insert(VectorStep->getDefiningRecipe());
- }
-
VPValue *ScalarStep = IV->getStepValue();
- auto *ConstStep = ScalarStep->isLiveIn()
- ? dyn_cast<ConstantInt>(ScalarStep->getLiveInIRValue())
- : nullptr;
- if (!ConstStep || ConstStep->getValue() != 1) {
- if (TypeInfo.inferScalarType(ScalarStep) != IVTy) {
- ScalarStep =
- Builder.createWidenCast(Instruction::Trunc, ScalarStep, IVTy);
- ToSkip.insert(ScalarStep->getDefiningRecipe());
- }
+ VPBuilder Builder(PH);
+ VPInstruction *VectorStep =
+ Builder.createNaryOp(VPInstruction::WideIVStep,
+ {&Plan.getVF(), ScalarStep,
+ Plan.getOrAddLiveIn(Constant::getNullValue(IVTy))},
+ FMFs, IV->getDebugLoc());
- unsigned MulOpc =
- IVTy->isFloatingPointTy() ? Instruction::FMul : Instruction::Mul;
- VPInstruction *Mul = Builder.createNaryOp(MulOpc, {VectorStep, ScalarStep},
- FMFs, IV->getDebugLoc());
- VectorStep = Mul;
- ToSkip.insert(Mul);
- }
+ ToSkip.insert(VectorStep);
// Now create recipes to compute the induction steps for part 1 .. UF. Part 0
// remains the header phi. Parts > 0 are computed by adding Step to the
Do we need to return true for WideIVStep in VPInstruction::onlyFirstLaneUsed if both inputs are scalar?
Hi, I was able to use this in #118638 but it looks like there's a difference in how this widens the VF + step before multiplying, as opposed to after, e.g.:
--- a/llvm/test/Transforms/LoopVectorize/AArch64/clamped-trip-count.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/clamped-trip-count.ll
@@ -20,9 +20,9 @@ define void @clamped_tc_8(ptr nocapture %dst, i32 %n, i64 %val) vscale_range(1,1
; CHECK-NEXT: [[TMP8:%.*]] = call <vscale x 8 x i64> @llvm.stepvector.nxv8i64()
; CHECK-NEXT: [[TMP7:%.*]] = mul <vscale x 8 x i64> [[TMP8]], splat (i64 1)
; CHECK-NEXT: [[INDUCTION:%.*]] = add <vscale x 8 x i64> zeroinitializer, [[TMP7]]
-; CHECK-NEXT: [[TMP12:%.*]] = mul i64 1, [[TMP6]]
-; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x i64> poison, i64 [[TMP12]], i64 0
+; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x i64> poison, i64 [[TMP6]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <vscale x 8 x i64> [[DOTSPLATINSERT]], <vscale x 8 x i64> poison, <vscale x 8 x i32> zeroinitializer
+; CHECK-NEXT: [[TMP9:%.*]] = mul <vscale x 8 x i64> [[DOTSPLAT]], splat (i64 1)
; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <vscale x 8 x i64> poison, i64 [[VAL]], i64 0
; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <vscale x 8 x i64> [[BROADCAST_SPLATINSERT]], <vscale x 8 x i64> poison, <vscale x 8 x i32> zeroinitializer
; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
Does this still work if the createWidenCasts are replaced with createScalarCast and then splatted afterwards?
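For what it's worth, the two schedules in the diff above should agree lane-wise; a toy C++ check (illustrative only, not LLVM code) of scalar-multiply-then-splat vs splat-then-vector-multiply:

```cpp
#include <array>

// Toy lane-wise comparison of the two orderings in the test diff above.
// Illustrative only; a fixed 4-lane "vector" stands in for <vscale x 8>.
constexpr int Lanes = 4;

std::array<long, Lanes> mulThenSplat(long VF, long Step) {
  long S = Step * VF;            // mul i64 Step, VF (old schedule)
  return {S, S, S, S};           // then splat the product
}

std::array<long, Lanes> splatThenMul(long VF, long Step) {
  std::array<long, Lanes> V{VF, VF, VF, VF}; // splat VF first
  for (long &L : V)
    L *= Step;                   // then vector-multiply by splat(Step)
  return V;
}
```

Both orderings produce the same lanes, so the question is really about recipe placement and casting, not the computed value.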
Tests or make it NFC
auto *VPI = dyn_cast<VPInstruction>(&R);
if (VPI && VPI->getOpcode() == VPInstruction::WideIVStep) {
Suggested change: replace
auto *VPI = dyn_cast<VPInstruction>(&R);
if (VPI && VPI->getOpcode() == VPInstruction::WideIVStep) {
with
if (auto *VPI = dyn_cast<VPInstruction>(&R); VPI && VPI->getOpcode() == VPInstruction::WideIVStep) {
Suggested change: replace
auto *VPI = dyn_cast<VPInstruction>(&R);
if (VPI && VPI->getOpcode() == VPInstruction::WideIVStep) {
with
auto *VPI = dyn_cast<VPInstruction>(&R);
if (!VPI || VPI->getOpcode() != VPInstruction::WideIVStep)
  continue;
sinking the original early continue?
Updated to use pattern matching and have early continue, thanks!
Do we need to return true for WideIVStep in VPInstruction::onlyFirstLaneUsed if both inputs are scalar?
I haven't checked if there are any such cases; are there any from the uses in the follow-up PR?
In the follow-up it truncates the Step itself with a VPScalarCastRecipe, and when the scalar cast was used by WideIVStep it caused this assertion to trigger: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp, lines 2328 to 2331 at e2a94a9.
I guess WidenIntOrFpInduction does its own truncation of the step because it wants to reuse the truncation when generating the widened start value too, not just the widened step. But I was able to work around it by just passing the non-truncated step to WideIVStep, and that seemed to work fine. So not a blocking comment! Here's the commit where I plugged it in for reference: lukel97@6f85b48
ScalarStep->isLiveIn()
    ? dyn_cast<ConstantInt>(ScalarStep->getLiveInIRValue())
    : nullptr;
if (!ConstStep || ConstStep->getValue() != 1) {
Should this redundant case be folded away by some simplifyRecipes before reaching convertToConcrete?
Updated the PR to do so; this requires running simplification after unrolling (WideIVStep is introduced there), which can also help remove redundant SCALAR-STEPS and possibly others. Put up #123655 to do that separately (and included it in this PR for now).
VPInstruction *Mul = Builder.createNaryOp(
    MulOpc, {VectorStep, ScalarStep}, FMFs, R.getDebugLoc());
This recipe essentially represents a multiplication producing a product of the desired type, with potential type conversions of its operands introduced later when lowered (currently only UIToFP and truncation).
What are the assumptions about supported operand and result types, e.g., can both operands be FPs but desired product be integer.
Is this folding of casts into "generalized" operations desired in general, beyond multiplication? In the long-run should VPlan represent casts consistently, either (all) explicit or (all) delayed, at suitable VPlan transformation times?
At the moment VPWidenIntOrFpInductionRecipe already takes care of the casts early on, and there are other cases where modeling the casts implicitly early on helps to allow more accurate cost-modeling (e.g. #113903).
ScalarStep =
    Builder.createWidenCast(Instruction::Trunc, ScalarStep, IVTy);
Should UIToFP be considered here too instead of Trunc if IVTy is float, or is ScalarStep expected to be float whenever IVTy is?
The step type can only be float if the induction itself is float
Type *IVTy = TypeInfo.inferScalarType(VPI->getOperand(2));
if (TypeInfo.inferScalarType(VectorStep) != IVTy) {
  Instruction::CastOps CastOp = IVTy->isFloatingPointTy()
                                    ? Instruction::UIToFP
VectorStep assumed unsigned, if integer?
I don't think so, but trunc should preserve the sign
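A quick sanity check of that claim (plain C++, with a hypothetical helper name): two's-complement truncation keeps the value, negative or not, as long as it fits the narrower induction type:

```cpp
#include <cstdint>

// Truncation of the step to a narrower induction type preserves the
// value (including its sign) whenever the step fits the narrow type,
// so no signedness distinction is needed for the Trunc case.
int32_t truncStep(int64_t Step) { return static_cast<int32_t>(Step); }
```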
if (TypeInfo.inferScalarType(VectorStep) != IVTy) {
  Instruction::CastOps CastOp = IVTy->isFloatingPointTy()
                                    ? Instruction::UIToFP
                                    : Instruction::Trunc;
VectorStep assumed integer, if IVTy is? I.e., FPToUI not considered.
Yep
After unrolling, there may be additional simplifications that can be applied. One example is removing SCALAR-STEPS for the first part where only the first lane is demanded. This removes redundant adds of 0 from a large number of tests (~200), many of which I am still working on updating. In preparation for removing redundant WideIV steps added in llvm#119284.
This patch adds a WideIVStep opcode that can be used to create a vector with the steps to increment a wide induction. The opcode has 3 operands:
* the vector step
* the scale of the vector step
* a constant indicating the target type of the VPInstruction (this is working around having explicit types for VPInstructions, we could also introduce a dedicated recipe, at the cost of a lot more scaffolding)

The opcode is later converted into a sequence of recipes that convert the scale and step to the target type, if needed, and then multiply the vector step by the scale. This simplifies code that needs to materialize step vectors, e.g. replacing wide IVs as a follow-up to llvm#108378 with an increment of the wide IV step.
✅ With the latest revision this PR passed the C/C++ code formatter.
I think I forgot to mention this earlier: I split off the change to run recipe simplification late to #123655. The commit is included here.
…uctionRecipe Split off from llvm#118638, this adds a new VPInstruction for integer step vectors (0,1,2,...), so that we can eventually model all the separate parts of VPWidenIntOrFpInductionRecipe in VPlan. The type of the element is specified through a sentinel value as is done in llvm#119284. This is then used by VPWidenIntOrFpInductionRecipe, where we add it just before execution in convertToConcreteRecipes. We need a dummy placeholder operand so we have somewhere to pass it, but this should go away when llvm#118638 lands.
…(NFC) There are some opcodes that currently require specialized recipes, due to their result type not being implied by their operands, including casts. This leads to duplication from defining multiple full recipes. This patch introduces a new VPInstructionWithType subclass that also stores the result type. The general idea is to have opcodes needing to specify a result type use this general recipe. The current patch replaces VPScalarCastRecipe with VPInstructionWithType; a similar patch for VPWidenCastRecipe will follow soon. There are a few proposed opcodes that should also benefit, without the need of workarounds:
* llvm#129508
* llvm#119284
After unrolling, there may be additional simplifications that can be applied. One example is removing SCALAR-STEPS for the first part where only the first lane is demanded. This removes redundant adds of 0 from a large number of tests (~200), many of which I am still working on updating. In preparation for removing redundant WideIV steps added in #119284. PR: #123655
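The folds that this late simplification run enables can be sketched as a tiny peephole (illustrative C++, not the LLVM implementation):

```cpp
#include <optional>

// Toy identity folds of the kind the post-unrolling simplification can
// apply: "x + 0" (a redundant first-part SCALAR-STEPS increment) and
// "x * 1" (a WideIVStep whose scale is the constant 1).
std::optional<long> foldIdentity(char Op, long X, long C) {
  if (Op == '+' && C == 0)
    return X;          // x + 0 -> x
  if (Op == '*' && C == 1)
    return X;          // x * 1 -> x
  return std::nullopt; // not an identity; keep the operation
}
```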
…NFC) (#129706) There are some opcodes that currently require specialized recipes, due to their result type not being implied by their operands, including casts. This leads to duplication from defining multiple full recipes. This patch introduces a new VPInstructionWithType subclass that also stores the result type. The general idea is to have opcodes needing to specify a result type use this general recipe. The current patch replaces VPScalarCastRecipe with VPInstructionWithType; a similar patch for VPWidenCastRecipe will follow soon. There are a few proposed opcodes that should also benefit, without the need of workarounds:
* #129508
* #119284

PR: #129706
Ping :)
LGTM, just some nits
@@ -2381,6 +2379,7 @@ void VPlanTransforms::convertToConcreteRecipes(VPlan &Plan,
                                               VPTypeAnalysis &TypeInfo) {
Should we just pass in the canonical IV and construct the TypeInfo in convertToConcreteRecipes, since I think it'll be invalid once convertToConcreteRecipes returns
In general, I think it would be clearer to pass the type analysis across the interface, but I've undone it for now, until your patch adding support for invalidating entries lands.
Agreed, I can break that off into a separate patch if you'd like
…19284) This patch adds a WideIVStep opcode that can be used to create a vector with the steps to increment a wide induction. The opcode has 2 operands:
* the vector step
* the scale of the vector step

The opcode is later converted into a sequence of recipes that convert the scale and step to the target type, if needed, and then multiply the vector step by the scale. This simplifies code that needs to materialize step vectors, e.g. replacing wide IVs as a follow-up to llvm/llvm-project#108378 with an increment of the wide IV step. PR: llvm/llvm-project#119284
I've landed 888b3ed to fix a warning from this PR. Thanks!