Skip to content

missed(?) optimization with a const array of same item #107208

Closed
@iximeow

Description

@iximeow

some example code:

const LUT: [u8; 2] = [
    1,
    1,
];

pub fn decode(i: u8) -> u8 {
    if i < 2 {
        LUT[i as usize]
    } else {
        2
    }
}

compiles to (via 1.66.1, or 1.68.0-nightly from 2023-01-21):

example::decode:
        mov     al, 2
        cmp     dil, 1
        ja      .LBB0_2
        movzx   eax, dil
        lea     rcx, [rip + .L__unnamed_1]
        movzx   eax, byte ptr [rax + rcx]
.LBB0_2:
        ret

.L__unnamed_1:
        .zero   2,1

rustc faithfully produces an array of 0x01, 0x01 in the resulting program and loads from it. it seems to me rustc ought to be able to tell that all entries in the array are identical, use that value, and avoid loading from the array in the first place.

i'm not entirely sure if this should be an issue against rust-lang/rust or bug against LLVM, but i tried looking through I-slow for similar reports. so i think starting here is the right call :D thanks!

edit: it does seem remarkable that the load is eliminated if there is only one item in the array. that looks to be handled somewhere before generating LLVM IR, so i think this is the right place.

i happened to notice this with LUT being an array of function pointers, included for completeness
const LUT: [fn() -> u8; 2] = [
    || { 1 },
    || { 1 },
];

pub fn decode(i: u8) -> u8 {
    if i < 2 {
        LUT[i as usize]()
    } else {
        2
    }
}

compiles to (via 1.66.1, or 1.68.0-nightly from 2023-01-21):

core::ops::function::FnOnce::call_once:
	mov	al, 1
	ret

playground::decode:
	cmp	dil, 1
	ja	.LBB1_1
	movzx	eax, dil
	lea	rcx, [rip + .L__unnamed_1]
	jmp	qword ptr [rcx + 8*rax]

.LBB1_1:
	mov	al, 2
	ret

.L__unnamed_1:
	.quad	core::ops::function::FnOnce::call_once
	.quad	core::ops::function::FnOnce::call_once

similarly, the array is indexed regardless of the fact that the array's items are all identical. this, in turn, is a minimized form of the original code with dozens of entries in the array. in that case, LUT is an associated item on a trait where one implementation happens to result in all function pointers being the same nearly-no-op function.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.A-arrayArea: `[T; N]`E-needs-testCall for participation: An issue has been fixed and does not reproduce, but no test has been added.I-slowIssue: Problems and improvements with respect to performance of generated code.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions