Description
The following IR is a minimal repro when run with opt -passes=instcombine
:
define i8 @src(<2 x i8> %x, <2 x i8> %y) {
%xsplat = shufflevector <2 x i8> %x, <2 x i8> poison, <2 x i32> <i32 1, i32 0>
%vv = mul <2 x i8> %xsplat, %y
%m = mul <2 x i8> %x, %y ; <<<< Note that %m is later replaced with %vv in the output
%msplat = shufflevector <2 x i8> %m, <2 x i8> poison, <2 x i32> <i32 0, i32 0>
%res = add <2 x i8> %vv, %msplat
%vget_lane = extractelement <2 x i8> %res, i64 1
ret i8 %vget_lane
}
Which on 17.x or trunk will give:
define i8 @src(<2 x i8> %x, <2 x i8> %y) {
%xsplat = shufflevector <2 x i8> %x, <2 x i8> poison, <2 x i32> <i32 poison, i32 0>
%vv = mul <2 x i8> %xsplat, %y
%msplat = shufflevector <2 x i8> %vv, <2 x i8> poison, <2 x i32> <i32 poison, i32 0>
%res = add <2 x i8> %vv, %msplat
%vget_lane = extractelement <2 x i8> %res, i64 1
ret i8 %vget_lane
}
However, this is not correct, because the added poisoned values end up observed in the final output. First, in InstCombine, the first element in each mask for the shuffles are first replaced with poison
due to not being observed in the final extractelement
. After that we run into this code in InstCombineSimplifyDemanded.cpp
:
llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
Lines 1866 to 1880 in d8278b6
This is what replaces the mul <2 x i8> %x, %y
with mul <2 x i8> %xsplat, %y
(m_ZeroMask
also matches poison). However in doing so, it introduces the extra poison from %xsplat
, meaning that %msplat
has both its elements poisoned, which ends up propagating to %vget_lane
.
Here is a Godbolt link showing the test case on 16.x, 17.x, and trunk:
https://godbolt.org/z/5eeT7x4s3
And also an Alive2 example showing the bad transformation with a counter-example:
https://alive2.llvm.org/ce/z/RpBJEN
The code was first introduced in 3b090ff, which a bisect shows is when this first started repro'ing
I have verified that this still repros on latest trunk, 2de269a
For priority/triage: this bug was found by a fuzzer meant to test SIMD codegen. It was not from manually written code