Closed
Description
The add of two splats is successfully scalarized here when the element type is i64 on RV64:
define <vscale x 1 x i64> @f_nxv1i64(<vscale x 1 x i64> %x, i64 %y) {
  %1 = insertelement <vscale x 1 x i64> poison, i64 %y, i32 0
  %2 = shufflevector <vscale x 1 x i64> %1, <vscale x 1 x i64> poison, <vscale x 1 x i32> zeroinitializer
  %3 = add <vscale x 1 x i64> %2, shufflevector(<vscale x 1 x i64> insertelement(<vscale x 1 x i64> poison, i64 42, i32 0), <vscale x 1 x i64> poison, <vscale x 1 x i32> zeroinitializer)
  %4 = mul <vscale x 1 x i64> %x, %3
  ret <vscale x 1 x i64> %4
}
With llc -mtriple=riscv64 -mattr=+v, this compiles to:
f_nxv1i64:
    addi a0, a0, 42
    vsetvli a1, zero, e64, m1, ta, ma
    vmul.vx v8, v8, a0
    ret
If the element type is not a legal scalar type though, e.g. i8, it doesn't get scalarized:
define <vscale x 8 x i8> @f_nxv8i8(<vscale x 8 x i8> %x, i8 %y) {
  %1 = insertelement <vscale x 8 x i8> poison, i8 %y, i32 0
  %2 = shufflevector <vscale x 8 x i8> %1, <vscale x 8 x i8> poison, <vscale x 8 x i32> zeroinitializer
  %3 = add <vscale x 8 x i8> %2, shufflevector(<vscale x 8 x i8> insertelement(<vscale x 8 x i8> poison, i8 42, i32 0), <vscale x 8 x i8> poison, <vscale x 8 x i32> zeroinitializer)
  %4 = mul <vscale x 8 x i8> %x, %3
  ret <vscale x 8 x i8> %4
}
f_nxv8i8:
    vsetvli a1, zero, e8, m1, ta, ma
    vmv.v.x v9, a0
    li a0, 42
    vadd.vx v9, v9, a0
    vmul.vv v8, v8, v9
    ret
For certain operations like add, it should be OK to scalarize into an i64 (or an i32 on RV32), since the low bits of the result depend only on the low bits of the operands. It's not safe for all operations though: e.g. it wouldn't be correct to move an ISD::UADDSAT from a v4i8 into an i64 (at least not without properly promoting it first).
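To make the distinction concrete, here is a small standalone C++ sketch (not LLVM code) that models the element arithmetic. The helper names (add_i8, add_via_i64, uaddsat_i8, uaddsat_via_i64) are hypothetical and exist only for this illustration. It assumes "scalarizing into i64" means doing the operation in a 64-bit register and keeping only the low 8 bits of the result:

```cpp
#include <cstdint>

// Wrapping add computed directly on 8-bit elements.
constexpr uint8_t add_i8(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(a + b);
}

// The same add done in a 64-bit scalar, then truncated back to 8 bits.
constexpr uint8_t add_via_i64(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(static_cast<uint64_t>(a) + static_cast<uint64_t>(b));
}

// Unsigned saturating add on 8-bit elements: clamps at 255.
constexpr uint8_t uaddsat_i8(uint8_t a, uint8_t b) {
  return (unsigned(a) + unsigned(b)) > 0xFFu
             ? uint8_t(0xFF)
             : static_cast<uint8_t>(a + b);
}

// Naive widening: a 64-bit saturating add followed by truncation.
// The 64-bit add never saturates for 8-bit inputs, so the clamp is lost.
constexpr uint8_t uaddsat_via_i64(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(
      (static_cast<uint64_t>(a) + b) < a ? UINT64_MAX
                                         : static_cast<uint64_t>(a) + b);
}

// add: widening then truncating gives the same low bits (300 mod 256 == 44).
static_assert(add_i8(200, 100) == add_via_i64(200, 100), "add is width-agnostic");
// uaddsat: the i8 result saturates, the naively widened one wraps.
static_assert(uaddsat_i8(200, 100) == 255, "saturates at i8 max");
static_assert(uaddsat_via_i64(200, 100) == 44, "naive widening loses saturation");
```

In this model, a correct promotion of the saturating add would have to zero-extend the operands and clamp the wide result to 255 before truncating, which is what uaddsat_i8 does internally.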