Description
This occurs both on the latest stable and nightly with both -C opt-level=2
and -C opt-level=3
, and looking back some versions this has been going on for quite a while, so this is not a regression AFAIK. Consider the following piece of code:
#[no_mangle]
pub fn test(x: &u8) -> bool {
let load = *x;
if load <= 1 {
return load == 1;
}
// Just some nonsense to keep the above branch.
something_dynamic()
}
#[inline(never)]
#[cold]
fn something_dynamic() -> bool {
std::env::var("dynamic").is_ok()
}
I would expect that in a release build, the above code loads the u8, checks if it is a legal value for a bool, and if so return it, otherwise going to something_dynamic()
. The motivation for this piece of code is something similar to a OnceLock<bool>
in an incredibly hot piece of code. Ideally the hot path only consists of a single mov
, cmp
and jmp
. Compiled we see the following:
test:
movzx eax, byte ptr [rdi]
cmp al, 2
jae example::something_dynamic
cmp al, 1
sete al
ret
example::something_dynamic:
<omitted>
It appears that Rust always emits code for turning an u8
into a bool
(cmp al, 1
followed by sete al
), regardless of whether this is necessary. In this case, since we checked that the u8
is <= 1
, this is not necessary at all. The same problem occurs on ARM:
_test:
ldrb w8, [x0]
cmp w8, #2
b.hs LBB0_2
cmp w8, #1
cset w0, eq
ret
LBB0_2:
b example::something_dynamic
The problem even persists if we try to use transmute, or pointer casts to bypass the problem. All the following variants still cause a superfluous value test:
if load <= 1 {
return load == 1;
// return unsafe { *(&load as *const u8 as *const bool) };
// return unsafe { std::mem::transmute(load) };
// return unsafe { std::mem::transmute_copy(&load) };
}
The weirdest thing is that this pessimization only occurs when there is a bounds check already applied. The following function avoids a test, directly loading the byte:
#[no_mangle]
unsafe fn maybe_load(x: &u8, should_load: bool) -> bool {
if should_load {
return unsafe { std::mem::transmute_copy(x) };
}
something_dynamic()
}
maybe_load:
test esi, esi
je example::something_dynamic
movzx eax, byte ptr [rdi]
ret
However, if you try to implement test
in terms of maybe_load
, it gets inlined and the useless cmp, sete
is back.