Commit 27394ae

[RISCV][CostModel] Calculate cost of Extract/InsertElement with non-constant index when vector instructions are not available
When vector instructions are not available, Extract/InsertElement operations on N-element vector types with a non-constant index are lowered as follows:

ExtractElement: N stores (one per element) to the stack, then 1 load of the required element.
InsertElement: N stores (one per element) to the stack, 1 store to overwrite the required element, then N loads to read every element back.

This patch implements the cost calculation for these operations, which fixes the compilation time of the matrix-types-spec test.
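The store/load counts described above can be turned into a small stand-alone calculation. This is only a sketch of the arithmetic, not the code from the patch: `VecOp` and `scalarizedVarIndexCost` are hypothetical names, and the per-element load and store costs are passed in as plain integers, whereas the patch obtains them from `getMemoryOpCost`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Extract/Insert opcode distinction.
enum class VecOp { ExtractElement, InsertElement };

// Estimated cost of a variable-index extract/insert on an N-element vector
// that has been scalarized and must go through the stack.
// loadCost and storeCost are per-element memory-op costs (assumed inputs).
std::uint64_t scalarizedVarIndexCost(VecOp op, std::uint64_t numElems,
                                     std::uint64_t loadCost,
                                     std::uint64_t storeCost) {
  assert(numElems > 0 && "fixed vectors have at least one element");
  if (op == VecOp::ExtractElement) {
    // N stores to spill the vector, then 1 load of the selected element.
    return storeCost * numElems + loadCost;
  }
  // N stores to spill the vector, 1 store to overwrite the selected element,
  // then N loads to read the whole vector back.
  return (storeCost + loadCost) * numElems + storeCost;
}
```

With unit load and store costs, a 16-element extract comes out as 16 + 1 = 17, which agrees with the updated expectations in extractelement.ll. The value 6 expected for `<2 x i64>` on RV32 is consistent with a per-element memory cost of 2 (2 * 2 + 2), although that per-element cost is inferred from the test output rather than stated in the commit.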
1 parent bf83e01 commit 27394ae

5 files changed: +192 -174 lines changed
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

Lines changed: 20 additions & 2 deletions
@@ -1463,8 +1463,26 @@ InstructionCost RISCVTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
   std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(Val);
 
   // This type is legalized to a scalar type.
-  if (!LT.second.isVector())
-    return 0;
+  if (!LT.second.isVector()) {
+    auto *FixedVecTy = cast<FixedVectorType>(Val);
+    // If Index is a known constant, cost is zero.
+    if (Index != -1U)
+      return 0;
+    // Extract/InsertElement with non-constant index is very costly when
+    // scalarized; estimate cost of loads/stores sequence via the stack:
+    // ExtractElement cost: store vector to stack, load scalar;
+    // InsertElement cost: store vector to stack, store scalar, load vector.
+    Type *ElemTy = FixedVecTy->getElementType();
+    auto NumElems = FixedVecTy->getNumElements();
+    auto Align = DL.getPrefTypeAlign(ElemTy);
+    InstructionCost LoadCost =
+        getMemoryOpCost(Instruction::Load, ElemTy, Align, 0, CostKind);
+    InstructionCost StoreCost =
+        getMemoryOpCost(Instruction::Store, ElemTy, Align, 0, CostKind);
+    return Opcode == Instruction::ExtractElement
+               ? StoreCost * NumElems + LoadCost
+               : (StoreCost + LoadCost) * NumElems + StoreCost;
+  }
 
   // For unsupported scalable vector.
   if (LT.second.isScalableVector() && !LT.first.isValid())

llvm/test/Analysis/CostModel/RISCV/extractelement.ll

Lines changed: 56 additions & 56 deletions
@@ -4,48 +4,48 @@
 
 define void @extractelement_int(i32 %x) {
 ; RV32-LABEL: 'extractelement_int'
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
 ; RV32-NEXT: Cost Model: Invalid cost for instruction: %nxv16i8 = extractelement <vscale x 16 x i8> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
 ; RV32-NEXT: Cost Model: Invalid cost for instruction: %nxv16i16 = extractelement <vscale x 16 x i16> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
 ; RV32-NEXT: Cost Model: Invalid cost for instruction: %nxv16i32 = extractelement <vscale x 16 x i32> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 34 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
 ; RV32-NEXT: Cost Model: Invalid cost for instruction: %nxv16i64 = extractelement <vscale x 16 x i64> undef, i32 %x
 ; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; RV64-LABEL: 'extractelement_int'
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
 ; RV64-NEXT: Cost Model: Invalid cost for instruction: %nxv16i8 = extractelement <vscale x 16 x i8> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
 ; RV64-NEXT: Cost Model: Invalid cost for instruction: %nxv16i16 = extractelement <vscale x 16 x i16> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
 ; RV64-NEXT: Cost Model: Invalid cost for instruction: %nxv16i32 = extractelement <vscale x 16 x i32> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
 ; RV64-NEXT: Cost Model: Invalid cost for instruction: %nxv16i64 = extractelement <vscale x 16 x i64> undef, i32 %x
 ; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
@@ -78,38 +78,38 @@ define void @extractelement_int(i32 %x) {
 
 define void @extractelement_fp(i32 %x) {
 ; RV32-LABEL: 'extractelement_fp'
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
 ; RV32-NEXT: Cost Model: Invalid cost for instruction: %nxv16f16 = extractelement <vscale x 16 x half> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
 ; RV32-NEXT: Cost Model: Invalid cost for instruction: %nxv16f32 = extractelement <vscale x 16 x float> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
-; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
+; RV32-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
 ; RV32-NEXT: Cost Model: Invalid cost for instruction: %nxv16f64 = extractelement <vscale x 16 x double> undef, i32 %x
 ; RV32-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; RV64-LABEL: 'extractelement_fp'
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
 ; RV64-NEXT: Cost Model: Invalid cost for instruction: %nxv16f16 = extractelement <vscale x 16 x half> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
 ; RV64-NEXT: Cost Model: Invalid cost for instruction: %nxv16f32 = extractelement <vscale x 16 x float> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
-; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
+; RV64-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
 ; RV64-NEXT: Cost Model: Invalid cost for instruction: %nxv16f64 = extractelement <vscale x 16 x double> undef, i32 %x
 ; RV64-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
