-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[SLP] Pass operand info to getCmpSelInstrInfo #109998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Depending on the constant, selects with constant arms can have highly varying cost. This adjusts SLP to use the new API introduced in d288574. Fixes llvm#109466.
@llvm/pr-subscribers-llvm-transforms Author: Philip Reames (preames) ChangesDepending on the constant, selects with constant arms can have highly varying cost. This adjusts SLP to use the new API introduced in d288574. Fixes #109466. Full diff: https://github.com/llvm/llvm-project/pull/109998.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 1dc749f2f13c72..c6f35c700b2e04 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10625,8 +10625,8 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
InstructionCost ScalarCost = TTI->getCmpSelInstrCost(
E->getOpcode(), OrigScalarTy, Builder.getInt1Ty(), CurrentPred,
- CostKind, {TTI::OK_AnyValue, TTI::OP_None},
- {TTI::OK_AnyValue, TTI::OP_None}, VI);
+ CostKind, getOperandInfo(VI->getOperand(0)),
+ getOperandInfo(VI->getOperand(1)), VI);
InstructionCost IntrinsicCost = GetMinMaxCost(OrigScalarTy, VI);
if (IntrinsicCost.isValid())
ScalarCost = IntrinsicCost;
@@ -10638,8 +10638,8 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
InstructionCost VecCost =
TTI->getCmpSelInstrCost(E->getOpcode(), VecTy, MaskTy, VecPred,
- CostKind, {TTI::OK_AnyValue, TTI::OP_None},
- {TTI::OK_AnyValue, TTI::OP_None}, VL0);
+ CostKind, getOperandInfo(E->getOperand(0)),
+ getOperandInfo(E->getOperand(1)), VL0);
if (auto *SI = dyn_cast<SelectInst>(VL0)) {
auto *CondType =
getWidenedType(SI->getCondition()->getType(), VL.size());
diff --git a/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll b/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll
index 4496b19fa200c5..9cfc5f86cb014a 100644
--- a/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll
+++ b/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll
@@ -31,13 +31,17 @@ define i32 @pow2_zero_constant_shift(i16 zeroext %a, i16 zeroext %b, i16 zeroext
define i32 @pow2_zero_variable_shift(i16 zeroext %a, i16 zeroext %b, i16 zeroext %c, i16 zeroext %d) {
; CHECK-LABEL: define i32 @pow2_zero_variable_shift(
; CHECK-SAME: i16 zeroext [[A:%.*]], i16 zeroext [[B:%.*]], i16 zeroext [[C:%.*]], i16 zeroext [[D:%.*]]) #[[ATTR0]] {
-; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x i16> poison, i16 [[A]], i32 0
-; CHECK-NEXT: [[TMP2:%.*]] = insertelement <4 x i16> [[TMP1]], i16 [[B]], i32 1
-; CHECK-NEXT: [[TMP3:%.*]] = insertelement <4 x i16> [[TMP2]], i16 [[C]], i32 2
-; CHECK-NEXT: [[TMP4:%.*]] = insertelement <4 x i16> [[TMP3]], i16 [[D]], i32 3
-; CHECK-NEXT: [[TMP5:%.*]] = icmp eq <4 x i16> [[TMP4]], <i16 1, i16 1, i16 1, i16 1>
-; CHECK-NEXT: [[TMP6:%.*]] = select <4 x i1> [[TMP5]], <4 x i32> <i32 524288, i32 262144, i32 131072, i32 65536>, <4 x i32> zeroinitializer
-; CHECK-NEXT: [[OR_RDX2:%.*]] = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> [[TMP6]])
+; CHECK-NEXT: [[T39_I0:%.*]] = icmp eq i16 [[A]], 1
+; CHECK-NEXT: [[T39_I1:%.*]] = icmp eq i16 [[B]], 1
+; CHECK-NEXT: [[T39_I2:%.*]] = icmp eq i16 [[C]], 1
+; CHECK-NEXT: [[T39_I3:%.*]] = icmp eq i16 [[D]], 1
+; CHECK-NEXT: [[T40_I0:%.*]] = select i1 [[T39_I0]], i32 524288, i32 0
+; CHECK-NEXT: [[T40_I1:%.*]] = select i1 [[T39_I1]], i32 262144, i32 0
+; CHECK-NEXT: [[T40_I2:%.*]] = select i1 [[T39_I2]], i32 131072, i32 0
+; CHECK-NEXT: [[T40_I3:%.*]] = select i1 [[T39_I3]], i32 65536, i32 0
+; CHECK-NEXT: [[OR_RDX0:%.*]] = or i32 [[T40_I0]], [[T40_I1]]
+; CHECK-NEXT: [[OR_RDX1:%.*]] = or i32 [[T40_I2]], [[T40_I3]]
+; CHECK-NEXT: [[OR_RDX2:%.*]] = or i32 [[OR_RDX0]], [[OR_RDX1]]
; CHECK-NEXT: ret i32 [[OR_RDX2]]
;
%t39.i0 = icmp eq i16 %a, 1
|
FYI, this is the same change as originally posted under d288574 before it was requested to be split. I looked at the other call sites in SLP, and couldn't easily identify how to test them. I'd like to move forward with only these. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
Depending on the constant, selects with constant arms can have highly varying cost. This adjusts SLP to use the new API introduced in d288574.
Fixes #109466.