Commit fb58d0b

[RISCV][TTI] Adjust cost for extract/insert element when VLEN is known
If we know an exact VLEN, then the index is effectively modulo the number of elements in a single vector register. Our lowering performs this subvector optimization.

A bit of context: this change may look a bit strange on its own, given we are currently *not* scaling insert/extract cost by LMUL. That costing decision needs to change, but it is very intertwined with SLP profitability and is thus a bit hard to adjust. I'm hoping that llvm#108419 will let me start to untangle this. This change is basically a case of finding a subset I can tackle before other dependencies are in place, and which does no real harm in the meantime.
1 parent ee40ffd commit fb58d0b

3 files changed: +112 -0 lines changed

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

Lines changed: 8 additions & 0 deletions
@@ -1756,6 +1756,14 @@ InstructionCost RISCVTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
       Index = Index % Width;
     }
 
+    // If exact VLEN is known, we will insert/extract into the appropriate
+    // subvector with no additional subvector insert/extract cost.
+    if (auto VLEN = ST->getRealVLen()) {
+      unsigned EltSize = LT.second.getScalarSizeInBits();
+      unsigned M1Max = *VLEN / EltSize;
+      Index = Index % M1Max;
+    }
+
     // We could extract/insert the first element without vslidedown/vslideup.
     if (Index == 0)
       SlideCost = 0;
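The effect of the hunk above can be sketched in isolation. The following is a minimal standalone illustration, not the LLVM implementation: `foldIndexToM1` and `sketchCost` are hypothetical names, and the cost of one base unit plus one slide unit is an assumption chosen to match the `i32` expectations in the tests of this commit.

```cpp
#include <optional>

// Hypothetical helper (not an LLVM API): when the exact VLEN is known, reduce
// an element index to its position within a single vector register (LMUL=1).
constexpr unsigned foldIndexToM1(unsigned Index, std::optional<unsigned> VLen,
                                 unsigned EltSizeInBits) {
  if (!VLen)
    return Index; // VLEN unknown: leave the index unchanged.
  unsigned M1Max = *VLen / EltSizeInBits; // Elements per vector register.
  return Index % M1Max;
}

// Simplified cost (an assumption for illustration): one unit for the scalar
// move, plus one unit for the slide unless the folded index is 0.
constexpr unsigned sketchCost(unsigned Index, std::optional<unsigned> VLen,
                              unsigned EltSizeInBits) {
  unsigned BaseCost = 1;
  unsigned SlideCost = foldIndexToM1(Index, VLen, EltSizeInBits) == 0 ? 0 : 1;
  return BaseCost + SlideCost;
}

// With VLEN=128 and i32 elements, each register holds 4 elements.
static_assert(sketchCost(0, 128u, 32) == 1);
static_assert(sketchCost(4, 128u, 32) == 1);  // 4 % 4 == 0, no slide.
static_assert(sketchCost(5, 128u, 32) == 2);  // 5 % 4 == 1, slide needed.
static_assert(sketchCost(12, 128u, 32) == 1); // 12 % 4 == 0, no slide.
// Without a known VLEN the sketch leaves the index as is.
static_assert(foldIndexToM1(4, std::nullopt, 32) == 4);
```

In this sketch, indices that are a multiple of the per-register element count are priced like index 0, which is the pattern the new tests check for indices 0, 4, 8 and 12.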

llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll

Lines changed: 52 additions & 0 deletions
@@ -1501,3 +1501,55 @@ define void @extractelement_int_nonpoweroftwo(i32 %x) {
 
   ret void
 }
+
+define void @extractelement_vls(i32 %x) vscale_range(2,2) {
+; RV32V-LABEL: 'extractelement_vls'
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV32V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64V-LABEL: 'extractelement_vls'
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV64V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV32ZVE64X-LABEL: 'extractelement_vls'
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64ZVE64X-LABEL: 'extractelement_vls'
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v32i32_0 = extractelement <32 x i32> undef, i32 0
+  %v32i32_4 = extractelement <32 x i32> undef, i32 4
+  %v32i32_5 = extractelement <32 x i32> undef, i32 5
+  %v32i32_8 = extractelement <32 x i32> undef, i32 8
+  %v32i32_9 = extractelement <32 x i32> undef, i32 9
+  %v32i32_11 = extractelement <32 x i32> undef, i32 11
+  %v32i32_12 = extractelement <32 x i32> undef, i32 12
+
+  ret void
+}
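To read the expectations in `extractelement_vls`, it may help to see where each tested index lands. The sketch below assumes that `vscale_range(2,2)` pins VLEN to 128 bits (vscale times 64 bits per block, which is how the RISC-V backend defines vscale as far as I know); `vlenFromVScale`, `Lane` and `locate` are hypothetical names introduced only for this illustration.

```cpp
// Assumed to mirror the RISC-V backend's bits-per-vscale-block constant.
constexpr unsigned RVVBitsPerBlock = 64;

// Hypothetical helper: exact VLEN implied by a fixed vscale.
constexpr unsigned vlenFromVScale(unsigned VScale) {
  return VScale * RVVBitsPerBlock;
}

// Position of an element inside a register group.
struct Lane {
  unsigned Reg;    // Which LMUL=1 register of the group holds the element.
  unsigned Offset; // Element position within that register.
};

// Hypothetical helper: split a flat element index into (register, offset).
constexpr Lane locate(unsigned Index, unsigned VLen, unsigned EltBits) {
  unsigned PerReg = VLen / EltBits;
  return {Index / PerReg, Index % PerReg};
}

constexpr unsigned VLen = vlenFromVScale(2);
static_assert(VLen == 128);

// <32 x i32> is 1024 bits, i.e. 8 registers of 4 elements each at VLEN=128.
static_assert(locate(4, VLen, 32).Reg == 1 && locate(4, VLen, 32).Offset == 0);
static_assert(locate(5, VLen, 32).Reg == 1 && locate(5, VLen, 32).Offset == 1);
static_assert(locate(11, VLen, 32).Reg == 2 && locate(11, VLen, 32).Offset == 3);
static_assert(locate(12, VLen, 32).Reg == 3 && locate(12, VLen, 32).Offset == 0);
```

Under these assumptions, the indices the test prices at 1 (0, 4, 8, 12) are exactly those with offset 0 in their register, while 5, 9 and 11 have a nonzero offset and are priced at 2.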

llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll

Lines changed: 52 additions & 0 deletions
@@ -1491,3 +1491,55 @@ define void @insertelement_int_nonpoweroftwo(i32 %x) {
 
   ret void
 }
+
+define void @insertelement_vls(i32 %x) vscale_range(2,2) {
+; RV32V-LABEL: 'insertelement_vls'
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV32V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64V-LABEL: 'insertelement_vls'
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV64V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV32ZVE64X-LABEL: 'insertelement_vls'
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64ZVE64X-LABEL: 'insertelement_vls'
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+  %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+  %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+  %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+  %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+  %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+  %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+
+  ret void
+}
