Open
Description
I tried this code, on riscv32i-unknown-none-elf
(this code is originally from #136216):
#![no_std]
pub enum Idx { _0, _1, _2, _3 }
#[no_mangle]
pub unsafe fn lookup_local(x: Idx) -> u8 {
[0xEF, 0x0E, 0x0D, 0xBD][x as usize]
}
#[no_mangle]
pub unsafe fn lookup_const(x: Idx) -> u8 {
(const { [0xEF, 0x0E, 0x0D, 0xBD] })[x as usize]
}
I expected to see this happen: compiler outputs identical asm for both functions.
Instead, this happened: the asm for lookup_local
is suboptimal (godbolt):
lookup_local:
addi sp, sp, -16
li a1, -17
sb a1, 12(sp)
li a1, 14
sb a1, 13(sp)
li a1, 13
sb a1, 14(sp)
li a1, 189
sb a1, 15(sp)
addi a1, sp, 12
add a0, a1, a0
lbu a0, 0(a0)
addi sp, sp, 16
ret
lookup_const:
lui a1, %hi(.L__unnamed_1)
addi a1, a1, %lo(.L__unnamed_1)
add a0, a1, a0
lbu a0, 0(a0)
ret
.L__unnamed_1:
.ascii "\357\016\r\275"
Instead of a using constant, lookup_local
constructs the array on the stack.
This might be a missed optimization in the LLVM, but we could also potentially handle it in rustc (and personally I feel like the latter might be easier).
Meta
rustc --version --verbose
:
rustc 1.86.0-nightly (2f348cb7c 2025-01-27)
binary: rustc
commit-hash: 2f348cb7ce4063fa4eb40038e6ada3c5214717bd
commit-date: 2025-01-27
host: x86_64-unknown-linux-gnu
release: 1.86.0-nightly
LLVM version: 19.1.7
above is the code from godbolt, locally I get slightly different output
rustc +nightly ./t.rs --target riscv32i-unknown-none-elf -Copt-level=3 -Zexport-executable-symbols
, objdump --disassemble t
#![no_std]
#![no_main]
pub enum Idx {
_0,
_1,
_2,
_3,
}
#[no_mangle]
#[inline(never)]
pub unsafe fn lookup_local(x: Idx) -> u8 {
[0xEF, 0x0E, 0x0D, 0xBD][x as usize]
}
#[inline(never)]
#[no_mangle]
pub unsafe fn lookup_const(x: Idx) -> u8 {
(const { [0xEF, 0x0E, 0x0D, 0xBD] })[x as usize]
}
#[panic_handler]
fn disco_loop(_: &core::panic::PanicInfo) -> ! {
loop {}
}
000110d8 <lookup_local>:
110d8: ff010113 addi sp, sp, -0x10
110dc: fef00593 li a1, -0x11
110e0: 00b10623 sb a1, 0xc(sp)
110e4: 00e00593 li a1, 0xe
110e8: 00b106a3 sb a1, 0xd(sp)
110ec: 00d00593 li a1, 0xd
110f0: 00b10723 sb a1, 0xe(sp)
110f4: 0bd00593 li a1, 0xbd
110f8: 00b107a3 sb a1, 0xf(sp)
110fc: 00c10593 addi a1, sp, 0xc
11100: 00a58533 add a0, a1, a0
11104: 00054503 lbu a0, 0x0(a0)
11108: 01010113 addi sp, sp, 0x10
1110c: 00008067 ret
00011110 <lookup_const>:
11110: 000105b7 lui a1, 0x10
11114: 0d458593 addi a1, a1, 0xd4
11118: 00a58533 add a0, a1, a0
1111c: 00054503 lbu a0, 0x0(a0)
11120: 00008067 ret
Metadata
Metadata
Assignees
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Area: Code generationCategory: An issue highlighting optimization opportunities or PRs implementing suchIssue: Problems and improvements with respect to performance of generated code.Relevant to the compiler team, which will review and decide on the PR/issue.