Description
some example code:
const LUT: [u8; 2] = [
1,
1,
];
pub fn decode(i: u8) -> u8 {
if i < 2 {
LUT[i as usize]
} else {
2
}
}
compiles to (via 1.66.1, or 1.68.0-nightly from 2023-01-21):
example::decode:
mov al, 2
cmp dil, 1
ja .LBB0_2
movzx eax, dil
lea rcx, [rip + .L__unnamed_1]
movzx eax, byte ptr [rax + rcx]
.LBB0_2:
ret
.L__unnamed_1:
.zero 2,1
rustc faithfully produces an array of 0x01, 0x01
in the resulting program and loads from it. it seems to me rustc ought to be able to tell that all entries in the array are identical, use that value, and avoid loading from the array in the first place.
i'm not entirely sure if this should be an issue against rust-lang/rust
or bug against LLVM, but i tried looking through I-slow
for similar reports. so i think starting here is the right call :D thanks!
edit: it does seem remarkable that the load is eliminated if there is only one item in the array. that looks to be handled somewhere before generating LLVM IR, so i think this is the right place.
i happened to notice this with LUT being an array of function pointers, included for completeness
const LUT: [fn() -> u8; 2] = [
|| { 1 },
|| { 1 },
];
pub fn decode(i: u8) -> u8 {
if i < 2 {
LUT[i as usize]()
} else {
2
}
}
compiles to (via 1.66.1, or 1.68.0-nightly from 2023-01-21):
core::ops::function::FnOnce::call_once:
mov al, 1
ret
playground::decode:
cmp dil, 1
ja .LBB1_1
movzx eax, dil
lea rcx, [rip + .L__unnamed_1]
jmp qword ptr [rcx + 8*rax]
.LBB1_1:
mov al, 2
ret
.L__unnamed_1:
.quad core::ops::function::FnOnce::call_once
.quad core::ops::function::FnOnce::call_once
similarly, the array is indexed regardless of the fact that the array's items are all identical. this, in turn, is a minimized form of the original code with dozens of entries in the array. in that case, LUT
is an associated item on a trait where one implementation happens to result in all function pointers being the same nearly-no-op function.