Closed
Description
If a signed integer is divided by a power of two using Euclidean division, it is equivalent to an arithmetic shift, but this is not caught. For example
pub fn foo(a: i32) -> i32 {
a.div_euclid(4)
}
produces the assembly code
example::foo:
mov eax, edi
sar eax, 31
shr eax, 30
add eax, edi
mov ecx, eax
sar ecx, 2
and eax, -4
sub edi, eax
sar edi, 31
lea eax, [rdi + rcx]
ret
while it could simply be
example::foo:
mov eax, edi
sar eax, 2
ret
The LLVM IR is
define i32 @_ZN7example3foo17h28d3a223d67ddccdE(i32 %a) unnamed_addr #0 !dbg !6 {
start:
%q.i = sdiv i32 %a, 4, !dbg !9
%_11.i = srem i32 %a, 4, !dbg !16
%_11.lobit.i = ashr i32 %_11.i, 31, !dbg !18
%.0.i = add nsw i32 %_11.lobit.i, %q.i, !dbg !18
ret i32 %.0.i, !dbg !19
}
attributes #0 = { norecurse nounwind nonlazybind readnone uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
which I interpret to be something like
pub fn foo_ir(a: i32) -> i32 {
let q = a / 4;
let r = a % 4;
let sign_r = if r < 0 { -1 } else { 0 };
sign_r + q
}
(I did check that this function produces the same LLVM IR.)
I don't know LLVM enough to know whether there is some kind of flooring division intrinsic that could be used to optimize this, or whether this a missed optimization in the LLVM side and LLVM can recognize that pattern into an arithmetic shift.
Metadata
Metadata
Assignees
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Area: Code generationCategory: An issue proposing an enhancement or a PR with one.Category: An issue highlighting optimization opportunities or PRs implementing suchCall for participation: An issue has been fixed and does not reproduce, but no test has been added.Issue: Problems and improvements with respect to performance of generated code.Relevant to the compiler team, which will review and decide on the PR/issue.