Skip to content

Missed optimization in i.div_euclid(power_of_two) #71096

Closed
@tspiteri

Description

@tspiteri

If a signed integer is divided by a power of two using Euclidean division, it is equivalent to an arithmetic shift, but this is not caught. For example

pub fn foo(a: i32) -> i32 {
    a.div_euclid(4)
}

produces the assembly code

example::foo:
        mov     eax, edi
        sar     eax, 31
        shr     eax, 30
        add     eax, edi
        mov     ecx, eax
        sar     ecx, 2
        and     eax, -4
        sub     edi, eax
        sar     edi, 31
        lea     eax, [rdi + rcx]
        ret

while it could simply be

example::foo:
        mov     eax, edi
        sar     eax, 2
        ret

The LLVM IR is

define i32 @_ZN7example3foo17h28d3a223d67ddccdE(i32 %a) unnamed_addr #0 !dbg !6 {
start:
  %q.i = sdiv i32 %a, 4, !dbg !9
  %_11.i = srem i32 %a, 4, !dbg !16
  %_11.lobit.i = ashr i32 %_11.i, 31, !dbg !18
  %.0.i = add nsw i32 %_11.lobit.i, %q.i, !dbg !18
  ret i32 %.0.i, !dbg !19
}

attributes #0 = { norecurse nounwind nonlazybind readnone uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

which I interpret to be something like

pub fn foo_ir(a: i32) -> i32 {
    let q = a / 4;
    let r = a % 4;
    let sign_r = if r < 0 { -1 } else { 0 };
    sign_r + q
}

(I did check that this function produces the same LLVM IR.)

I don't know LLVM enough to know whether there is some kind of flooring division intrinsic that could be used to optimize this, or whether this a missed optimization in the LLVM side and LLVM can recognize that pattern into an arithmetic shift.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.A-codegenArea: Code generationC-enhancementCategory: An issue proposing an enhancement or a PR with one.C-optimizationCategory: An issue highlighting optimization opportunities or PRs implementing suchE-needs-testCall for participation: An issue has been fixed and does not reproduce, but no test has been added.I-slowIssue: Problems and improvements with respect to performance of generated code.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions