Open
Description
As discussed on https://reviews.llvm.org/D151358, the VE backend generates inefficient code for something like the following:
target triple = "ve"
define i64 @f(i64 %x, i64 %y, i128 %a, i128 %b) {
  %c = icmp ugt i128 %a, %b
  %d = select i1 %c, i64 %y, i64 %x
  ret i64 %d
}
This currently generates 8 instructions, but it can be done in 4. The 4 extra instructions are spent translating the result of cmpu into a boolean.
With D151358, this also impacts the efficiency of i128 smin/smax/umin/umax.
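For reference, here is a minimal sketch of an i128 min/max input that might hit the same pattern. It assumes the operation is written with the `llvm.umax` intrinsic; the function name `@g` is hypothetical, and the exact code the VE backend emits for it is not shown here.

target triple = "ve"

; Standard LLVM unsigned-max intrinsic, instantiated for i128.
declare i128 @llvm.umax.i128(i128, i128)

; Hypothetical reproducer: unsigned max of two i128 values.
define i128 @g(i128 %a, i128 %b) {
  %m = call i128 @llvm.umax.i128(i128 %a, i128 %b)
  ret i128 %m
}

The smin, smax and umin cases can be written the same way with `llvm.smin.i128`, `llvm.smax.i128` and `llvm.umin.i128`.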