Skip to content

[SLP] Pass operand info to getCmpSelInstrInfo #109998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 25, 2024

Conversation

preames
Copy link
Collaborator

@preames preames commented Sep 25, 2024

Depending on the constant, selects with constant arms can have highly varying cost. This adjusts SLP to use the new API introduced in d288574.

Fixes #109466.

Depending on the constant, selects with constant arms can have
highly varying cost.  This adjusts SLP to use the new API introduced
in d288574.

Fixes llvm#109466.
@llvmbot
Copy link
Member

llvmbot commented Sep 25, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Philip Reames (preames)

Changes

Depending on the constant, selects with constant arms can have highly varying cost. This adjusts SLP to use the new API introduced in d288574.

Fixes #109466.


Full diff: https://github.com/llvm/llvm-project/pull/109998.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+4-4)
  • (modified) llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll (+11-7)
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 1dc749f2f13c72..c6f35c700b2e04 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10625,8 +10625,8 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
 
       InstructionCost ScalarCost = TTI->getCmpSelInstrCost(
           E->getOpcode(), OrigScalarTy, Builder.getInt1Ty(), CurrentPred,
-          CostKind, {TTI::OK_AnyValue, TTI::OP_None},
-          {TTI::OK_AnyValue, TTI::OP_None}, VI);
+          CostKind, getOperandInfo(VI->getOperand(0)),
+          getOperandInfo(VI->getOperand(1)), VI);
       InstructionCost IntrinsicCost = GetMinMaxCost(OrigScalarTy, VI);
       if (IntrinsicCost.isValid())
         ScalarCost = IntrinsicCost;
@@ -10638,8 +10638,8 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
 
       InstructionCost VecCost =
           TTI->getCmpSelInstrCost(E->getOpcode(), VecTy, MaskTy, VecPred,
-                                  CostKind, {TTI::OK_AnyValue, TTI::OP_None},
-                                  {TTI::OK_AnyValue, TTI::OP_None}, VL0);
+                                  CostKind, getOperandInfo(E->getOperand(0)),
+                                  getOperandInfo(E->getOperand(1)), VL0);
       if (auto *SI = dyn_cast<SelectInst>(VL0)) {
         auto *CondType =
             getWidenedType(SI->getCondition()->getType(), VL.size());
diff --git a/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll b/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll
index 4496b19fa200c5..9cfc5f86cb014a 100644
--- a/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll
+++ b/llvm/test/Transforms/SLPVectorizer/RISCV/select-profitability.ll
@@ -31,13 +31,17 @@ define i32 @pow2_zero_constant_shift(i16 zeroext %a, i16 zeroext %b, i16 zeroext
 define i32 @pow2_zero_variable_shift(i16 zeroext %a, i16 zeroext %b, i16 zeroext %c, i16 zeroext %d) {
 ; CHECK-LABEL: define i32 @pow2_zero_variable_shift(
 ; CHECK-SAME: i16 zeroext [[A:%.*]], i16 zeroext [[B:%.*]], i16 zeroext [[C:%.*]], i16 zeroext [[D:%.*]]) #[[ATTR0]] {
-; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <4 x i16> poison, i16 [[A]], i32 0
-; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <4 x i16> [[TMP1]], i16 [[B]], i32 1
-; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <4 x i16> [[TMP2]], i16 [[C]], i32 2
-; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <4 x i16> [[TMP3]], i16 [[D]], i32 3
-; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq <4 x i16> [[TMP4]], <i16 1, i16 1, i16 1, i16 1>
-; CHECK-NEXT:    [[TMP6:%.*]] = select <4 x i1> [[TMP5]], <4 x i32> <i32 524288, i32 262144, i32 131072, i32 65536>, <4 x i32> zeroinitializer
-; CHECK-NEXT:    [[OR_RDX2:%.*]] = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> [[TMP6]])
+; CHECK-NEXT:    [[T39_I0:%.*]] = icmp eq i16 [[A]], 1
+; CHECK-NEXT:    [[T39_I1:%.*]] = icmp eq i16 [[B]], 1
+; CHECK-NEXT:    [[T39_I2:%.*]] = icmp eq i16 [[C]], 1
+; CHECK-NEXT:    [[T39_I3:%.*]] = icmp eq i16 [[D]], 1
+; CHECK-NEXT:    [[T40_I0:%.*]] = select i1 [[T39_I0]], i32 524288, i32 0
+; CHECK-NEXT:    [[T40_I1:%.*]] = select i1 [[T39_I1]], i32 262144, i32 0
+; CHECK-NEXT:    [[T40_I2:%.*]] = select i1 [[T39_I2]], i32 131072, i32 0
+; CHECK-NEXT:    [[T40_I3:%.*]] = select i1 [[T39_I3]], i32 65536, i32 0
+; CHECK-NEXT:    [[OR_RDX0:%.*]] = or i32 [[T40_I0]], [[T40_I1]]
+; CHECK-NEXT:    [[OR_RDX1:%.*]] = or i32 [[T40_I2]], [[T40_I3]]
+; CHECK-NEXT:    [[OR_RDX2:%.*]] = or i32 [[OR_RDX0]], [[OR_RDX1]]
 ; CHECK-NEXT:    ret i32 [[OR_RDX2]]
 ;
   %t39.i0 = icmp eq i16 %a, 1

@preames
Copy link
Collaborator Author

preames commented Sep 25, 2024

FYI, this is the same change as originally posted under d288574 before it was requested to be split. I looked at the other call sites in SLP, and couldn't easily identify how to test them. I'd like to move forward with only these.

Copy link
Member

@alexey-bataev alexey-bataev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

@preames preames merged commit 556ec4a into llvm:main Sep 25, 2024
7 of 10 checks passed
@preames preames deleted the pr-slp-select-constant-mat-cost branch September 25, 2024 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RISCV] Unprofitable select vectorization/lowering
3 participants