Closed
Description
Overview: When the multiplication is nuw
and B
& C
are constants, transform
x * B < C
→a < ⌈C / B⌉
x * B ≤ C
→a ≤ ⌊C / B⌋
Validity proof for a specific example, as a demonstration: https://alive2.llvm.org/ce/z/H-W9Zb
define i64 @src(i64 %x) denormal-fp-math=ieee,ieee {
%start:
%_4.i = zext i64 %x to i128
%p.i = mul nuw i128 %_4.i, 5
%_8.i = icmp ult i128 %p.i, 9223372036854775808
%0 = trunc i128 %p.i to i64
%.0.i = select i1 %_8.i, i64 %0, i64 -1
ret i64 %.0.i
}
=>
define i64 @tgt(i64 %x) denormal-fp-math=ieee,ieee {
%start:
%_4.i = zext i64 %x to i128
%_8.i = icmp ult i128 %_4.i, 1844674407370955162
%p.i = mul nuw i128 %_4.i, 5
%0 = trunc i128 %p.i to i64
%.0.i = select i1 %_8.i, i64 %0, i64 -1
ret i64 %.0.i
}
Transformation seems to be correct!
This is particularly worth doing in that example, since other existing transformations can then remove the zext
+trunc
:
define i64 @tgt(i64 %x) unnamed_addr #0 {
%_8.i = icmp ult i64 %x, 1844674407370955162
%p.i = mul i64 %x, 5
%.0.i = select i1 %_8.i, i64 %p.i, i64 -1
ret i64 %.0.i
}
And that means much tidier output overall:
src: # @src
mov rax, rdi
mov ecx, 5
mul rcx
movabs rcx, -9223372036854775808
cmp rax, rcx
sbb rdx, 0
mov rcx, -1
cmovae rax, rcx
ret
tgt: # @tgt
movabs rax, 1844674407370955162
cmp rdi, rax
lea rcx, [rdi + 4*rdi]
mov rax, -1
cmovb rax, rcx
ret
(This is a sibling issue to #56562, which discusses a similar thing for umul.with.overflow
)
EDIT: Updated to not have this need to care about the zext
-- other existing things will handle that already.