Commit 5c65a32

NexMing and yanming authored
[RISCV] Vectorize phi for loop carried @llvm.vp.reduce.* (#131974)
LLVM vector predication (VP) reduction intrinsics return a scalar result, but RISC-V vector reduction instructions write their result to the first element of a vector register. So when a reduction in a loop is carried by a scalar phi, we end up with unnecessary scalar moves:

```asm
loop:
    vmv.s.x v8, zero
    vredsum.vs v8, v10, v8
    vmv.x.s a0, v8
```

This mainly affects VP reductions. This patch tries to vectorize any scalar phi that feeds into a VP reduction in RISCVCodeGenPrepare, converting:

```llvm
vector.body:
  %red.phi = phi i32 [ ..., %entry ], [ %red, %vector.body ]
  %red = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %red.phi, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
```

to

```llvm
vector.body:
  %red.phi = phi <vscale x 2 x i32> [ ..., %entry ], [ %acc.vec, %vector.body ]
  %phi.scalar = extractelement <vscale x 2 x i32> %red.phi, i64 0
  %acc = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %phi.scalar, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
  %acc.vec = insertelement <vscale x 2 x i32> poison, i32 %acc, i64 0
```

This eliminates the scalar -> vector -> scalar crossing during instruction selection.

---------

Co-authored-by: yanming <[email protected]>
1 parent 842b57b commit 5c65a32
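To make the cost described in the commit message concrete, here is a minimal toy model in C++. It is not LLVM or RVV code: `ToyVectorUnit`, `sumWithScalarPhi`, `sumWithVectorPhi`, and `kVL` are hypothetical names invented for illustration. It counts how often the accumulator moves between the scalar and vector register files when the loop-carried value is a scalar, versus when it stays in element 0 of a vector register. The model assumes the `extractelement`/`insertelement` pair in the transformed IR is folded away during instruction selection, as the commit message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (hypothetical, not LLVM or RVV code): acc0 plays the role of
// element 0 of the vector register that vredsum.vs reads and writes.
struct ToyVectorUnit {
  int32_t acc0 = 0;   // element 0 of the accumulator vector register
  int crossings = 0;  // number of scalar<->vector register moves

  void moveScalarToVector(int32_t s) { acc0 = s; ++crossings; }  // like vmv.s.x
  int32_t moveVectorToScalar() { ++crossings; return acc0; }     // like vmv.x.s

  // Like vredsum.vs: acc0 += sum of the first `evl` elements at `src`.
  void reduceAdd(const int32_t *src, std::size_t evl) {
    for (std::size_t i = 0; i < evl; ++i) acc0 += src[i];
  }
};

constexpr std::size_t kVL = 4;  // assumed number of elements per iteration

// Scalar phi: the accumulator crosses register files twice per iteration.
int32_t sumWithScalarPhi(const std::vector<int32_t> &data, ToyVectorUnit &vu) {
  int32_t red = 0;
  for (std::size_t i = 0; i < data.size(); i += kVL) {
    std::size_t evl = std::min(kVL, data.size() - i);
    vu.moveScalarToVector(red);
    vu.reduceAdd(data.data() + i, evl);
    red = vu.moveVectorToScalar();
  }
  return red;
}

// Vector phi: the accumulator stays in the vector register across iterations.
int32_t sumWithVectorPhi(const std::vector<int32_t> &data, ToyVectorUnit &vu) {
  vu.moveScalarToVector(0);  // start value inserted once, before the loop
  for (std::size_t i = 0; i < data.size(); i += kVL) {
    std::size_t evl = std::min(kVL, data.size() - i);
    vu.reduceAdd(data.data() + i, evl);
  }
  return vu.moveVectorToScalar();  // result extracted once, after the loop
}
```

Under these assumptions, both functions compute the same sum, but the scalar-phi version performs two crossings per loop iteration while the vector-phi version performs two in total, regardless of trip count.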

3 files changed, +946 -4 lines changed

llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp

Lines changed: 6 additions & 4 deletions
@@ -113,9 +113,10 @@ bool RISCVCodeGenPrepare::visitAnd(BinaryOperator &BO) {
 // vfredosum.vs v8, v8, v10
 // vfmv.f.s fa0, v8
 //
-// This mainly affects ordered fadd reductions, since other types of reduction
-// typically use element-wise vectorisation in the loop body. This tries to
-// vectorize any scalar phis that feed into a fadd reduction:
+// This mainly affects ordered fadd reductions and VP reductions that have a
+// scalar start value, since other types of reduction typically use element-wise
+// vectorisation in the loop body. This tries to vectorize any scalar phis that
+// feed into these reductions:
 //
 // loop:
 // %phi = phi <float> [ ..., %entry ], [ %acc, %loop ]
@@ -137,7 +138,8 @@ bool RISCVCodeGenPrepare::visitIntrinsicInst(IntrinsicInst &I) {
   if (expandVPStrideLoad(I))
     return true;

-  if (I.getIntrinsicID() != Intrinsic::vector_reduce_fadd)
+  if (I.getIntrinsicID() != Intrinsic::vector_reduce_fadd &&
+      !isa<VPReductionIntrinsic>(&I))
     return false;

   auto *PHI = dyn_cast<PHINode>(I.getOperand(0));
