Description
The current version of LLVM (June 4th 2024, 5ae5774) asserts when an operand of a phi is a constant value of type <1 x bfloat> and the target uses TypeSoftPromoteHalf as the legalization strategy for bfloat.
- To reproduce:
Input IR:
define void @phi_broken(ptr %dst) #0 {
bb:
br label %bb19
bb19:
%i20 = phi <1 x bfloat> [ zeroinitializer, %bb ]
%i25 = shufflevector <1 x bfloat> poison, <1 x bfloat> %i20, <2 x i32> <i32 0, i32 1>
store <2 x bfloat> %i25, ptr %dst
ret void
}
attributes #0 = { noinline optnone }
Run command:
llc -march x86-64 input.ll
- Result
Assertion failed: (VT.isInteger() && N1.getValueType().isInteger() && "Invalid ANY_EXTEND!"), function getNode, file SelectionDAG.cpp, line 6013.
Note: It blows up in the same fashion with amdgcn, for instance. This is a generic issue. See the details.
Details
What happens is that, with TypeSoftPromoteHalf, bfloat values are mapped to f32 registers. This means that when <1 x bfloat> gets lowered, it first gets scalarized to this type through:
%out(f32) = extract_vector_elt <1 x bfloat> %in, i32 0
This is invalid SDISel IR because EXTRACT_VECTOR_ELT is only supposed to allow type extensions for integer types (extract_vector_elt to i16 followed by bf16_to_fp).
However, when the <1 x bfloat> is a constant, the SelectionDAGBuilder tries to constant fold the extract_vector_elt directly with a getAnyExtOrTrunc, and this blows up since we are dealing with a floating-point value, not an integer one.
I believe it would be safer to issue something like:
t6: bf16 = extract_vector_elt t5, Constant:i64<0>
t9: f32 = fp_extend t6
Which is what we produce for (non-constant) <2 x bfloat>.
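The distinction between the two widening paths can be sketched as a small standalone C++ model. This is not LLVM code and does not use any LLVM API: the ToyVT struct and the pickExtension helper are hypothetical names made up for illustration. It only encodes the rule stated by the assertion (any-extend is integer-only) and the alternative suggested above (fp_extend for floating-point values).

```cpp
#include <string>

// Toy stand-in for a value type. NOT llvm::EVT; hypothetical, for illustration only.
struct ToyVT {
  bool IsInteger;  // true for iN types, false for floating-point types
  unsigned Bits;   // scalar width in bits
};

// Hypothetical helper: name the node kind that converts `From` to `To`.
// Integer-to-integer widening may use any_extend; floating-point widening
// needs fp_extend. Mixing the two classes is reported as "invalid", which
// corresponds to the situation the assertion rejects.
std::string pickExtension(ToyVT From, ToyVT To) {
  if (From.IsInteger == To.IsInteger && From.Bits == To.Bits)
    return "none";
  if (From.IsInteger && To.IsInteger)
    return From.Bits < To.Bits ? "any_extend" : "truncate";
  if (!From.IsInteger && !To.IsInteger)
    return From.Bits < To.Bits ? "fp_extend" : "fp_round";
  return "invalid";
}
```

In this model, widening a 16-bit floating-point type to a 32-bit one (the bf16 to f32 case) selects fp_extend, while any_extend is only ever chosen when both types are integers.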
In other words, we are probably too smart for our own good.