Skip to content

Inefficient code generation for u8 -> boolean conversion if bounds check already applied #121673

Open
@orlp

Description

@orlp

This occurs both on the latest stable and nightly with both -C opt-level=2 and -C opt-level=3, and looking back some versions this has been going on for quite a while, so this is not a regression AFAIK. Consider the following piece of code:

#[no_mangle]
pub fn test(x: &u8) -> bool {
    let load = *x;
    if load <= 1 {
        return load == 1;
    }

    // Just some nonsense to keep the above branch.
    something_dynamic()
}

#[inline(never)]
#[cold]
fn something_dynamic() -> bool {
    std::env::var("dynamic").is_ok()
}

I would expect that in a release build, the above code loads the u8, checks if it is a legal value for a bool, and if so return it, otherwise going to something_dynamic(). The motivation for this piece of code is something similar to a OnceLock<bool> in an incredibly hot piece of code. Ideally the hot path only consists of a single mov, cmp and jmp. Compiled we see the following:

test:
        movzx   eax, byte ptr [rdi]
        cmp     al, 2
        jae     example::something_dynamic
        cmp     al, 1
        sete    al
        ret

example::something_dynamic:
        <omitted>

It appears that Rust always emits code for turning an u8 into a bool (cmp al, 1 followed by sete al), regardless of whether this is necessary. In this case, since we checked that the u8 is <= 1, this is not necessary at all. The same problem occurs on ARM:

_test:
        ldrb    w8, [x0]
        cmp     w8, #2
        b.hs    LBB0_2
        cmp     w8, #1
        cset    w0, eq
        ret
       
LBB0_2:
        b       example::something_dynamic

The problem even persists if we try to use transmute, or pointer casts to bypass the problem. All the following variants still cause a superfluous value test:

    if load <= 1 {
        return load == 1;
        // return unsafe { *(&load as *const u8 as *const bool) };
        // return unsafe { std::mem::transmute(load) };
        // return unsafe { std::mem::transmute_copy(&load) };
    }

The weirdest thing is that this pessimization only occurs when there is a bounds check already applied. The following function avoids a test, directly loading the byte:

#[no_mangle]
unsafe fn maybe_load(x: &u8, should_load: bool) -> bool {
    if should_load {
        return unsafe { std::mem::transmute_copy(x) };
    }

    something_dynamic()
}
maybe_load:
        test    esi, esi
        je      example::something_dynamic
        movzx   eax, byte ptr [rdi]
        ret

However, if you try to implement test in terms of maybe_load, it gets inlined and the useless cmp, sete is back.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.C-bugCategory: This is a bug.C-optimizationCategory: An issue highlighting optimization opportunities or PRs implementing suchS-has-mcveStatus: A Minimal Complete and Verifiable Example has been found for this issueT-compilerRelevant to the compiler team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions