Description
The code generated for this function seems quite suboptimal:
pub const fn f(n: u8) -> [u8; 4] {
    match n % 4 {
        0 => [0x0, 0x1, 0x2, 0x3],
        1 => [0x4, 0x5, 0x6, 0x7],
        2 => [0x8, 0x9, 0xA, 0xB],
        3 => [0xC, 0xD, 0xE, 0xF],
        _ => unsafe { std::hint::unreachable_unchecked() }
    }
}
On every target I tried, the function as written above compiles to a switch table that is read from memory.
On x86-64, if the inner arrays are moved into constants, the switch table disappears and the lookup is replaced with arithmetic.
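For reference, here is a minimal sketch of what "moving the inner arrays into constants" could look like. The names `A0` to `A3` and `f_consts` are made up for illustration, and the exact variant I compiled is the one in the Godbolt link, so treat this as an approximation rather than the precise source:

```rust
// Hypothetical constant names; only the array values come from the original function.
const A0: [u8; 4] = [0x0, 0x1, 0x2, 0x3];
const A1: [u8; 4] = [0x4, 0x5, 0x6, 0x7];
const A2: [u8; 4] = [0x8, 0x9, 0xA, 0xB];
const A3: [u8; 4] = [0xC, 0xD, 0xE, 0xF];

pub const fn f_consts(n: u8) -> [u8; 4] {
    match n % 4 {
        0 => A0,
        1 => A1,
        2 => A2,
        3 => A3,
        // SAFETY: `n % 4` is always in 0..=3, so this arm can never be reached.
        _ => unsafe { std::hint::unreachable_unchecked() },
    }
}
```

The behaviour is identical to `f`; only the way the arrays are spelled in the source changes.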
Side-by-side comparison of the x86-64 and armv7-linux-androideabi codegen:
https://godbolt.org/z/ehxabaq38
Here, I manually rewrote the expression into the equivalent of what LLVM emits for x86-64 above:
https://godbolt.org/z/qhfaqEcsf
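To illustrate the kind of arithmetic form that is possible, here is one way the table can be computed without touching memory. This is a sketch under my own assumptions, not necessarily the rewrite in the link above nor the exact sequence LLVM produces; `f_arith` is a hypothetical name. It relies on row `k` being `[4k, 4k+1, 4k+2, 4k+3]`, which packed as a little-endian `u32` is `0x03020100 + k * 0x04040404` (no byte exceeds `0x0F`, so no carries cross byte boundaries):

```rust
pub const fn f_arith(n: u8) -> [u8; 4] {
    // Row index, always in 0..=3.
    let k = (n % 4) as u32;
    // Base row [0, 1, 2, 3] packed little-endian, plus 4*k added to every byte.
    (0x0302_0100u32 + k * 0x0404_0404).to_le_bytes()
}
```

For example, `f_arith(3)` computes `0x0F0E0D0C`, whose little-endian bytes are `[0xC, 0xD, 0xE, 0xF]`, matching the last arm of `f`.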
Nothing else I tried made the compiler emit this arithmetic codegen.
I don't know whether this applies to other output targets.
@rustbot label A-LLVM I-slow