Missed match optimization of riscv #136216

Open
@WaffleLapkin

I was code golfing different implementations for a function that "inverts" (rotates by 2) an enum with discriminants 1, 2, 4, 8 (i.e. f(1) = 4), the obvious implementation being this:

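// `Copy` is needed so that `table_lookup` below can return an element of the static table by value.
#[derive(Clone, Copy)]
pub enum Side {
    Top = 1 << 0,
    Right = 1 << 1,
    Bottom = 1 << 2,
    Left = 1 << 3,
}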

pub fn opposite_match(x: Side) -> Side {
    use Side::*;
    match x {
        Top => Bottom,
        Right => Left,
        Bottom => Top,
        Left => Right,
    }
}
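
For reference, the "rotate by 2" reading of this mapping can be sketched directly on the raw discriminants. This is only an illustration: `rotate4_by_2` is a hypothetical helper that does not appear in the issue, and it treats the discriminant as a 4-bit field.

```rust
/// Hypothetical helper (not from the issue): rotates the low 4 bits of `d`
/// by 2 positions. In a 4-bit field, rotating left or right by 2 gives the
/// same result, so the direction does not matter.
pub fn rotate4_by_2(d: u8) -> u8 {
    // Bits shifted above bit 3 are discarded by the mask.
    ((d << 2) | (d >> 2)) & 0b1111
}
```

With the discriminants above, `rotate4_by_2(1)` is `4` (Top to Bottom) and `rotate4_by_2(8)` is `2` (Left to Right), matching the `match` arms.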

I expected to see this happen: on riscv opposite_match produces the optimal code, or at least a reasonable one.

Instead, this happened: the compiler generates the following asm:

opposite_match:
        addi    a0, a0, -1
        slli    a0, a0, 24
        srai    a0, a0, 24
        lui     a1, %hi(.Lswitch.table.opposite_match)
        addi    a1, a1, %lo(.Lswitch.table.opposite_match)
        add     a0, a1, a0
        lbu     a0, 0(a0)
        ret

.Lswitch.table.opposite_match:
        .ascii  "\004\b\004\001\004\004\004\002"

The problem is the slli (shift left logical immediate) and srai (shift right arithmetic immediate) instructions: together they are a no-op, since a0 <= 7 at that point and 7 << 24 < 1.rotate_right(1), so the sign bit is never set by the left shift.
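
The no-op claim can be checked with a small sketch that models the two shifts on a 32-bit register. The 32-bit width is an assumption based on the shift amount of 24, and `slli_srai_24` is a hypothetical name used only for this illustration.

```rust
/// Hypothetical model of `slli a0, a0, 24` followed by `srai a0, a0, 24`
/// on a 32-bit register: it sign-extends the low 8 bits of `a0`.
pub fn slli_srai_24(a0: i32) -> i32 {
    // `<<` drops the bits shifted out; `>>` on `i32` is an arithmetic shift.
    (a0 << 24) >> 24
}

/// Returns true if the shift pair leaves every value in `0..=7` unchanged,
/// which covers every `discriminant - 1` value (0, 1, 3, 7).
pub fn shifts_are_noop_for_indices() -> bool {
    (0..=7).all(|v| slli_srai_24(v) == v)
}
```

The shift pair only changes a value when bit 7 is set (for example, `0x80` becomes `-128`), which cannot happen for an index of at most 7.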

The underlying issue is that LLVM at some point replaces indexing by u8 with indexing by u32, inserting a sext in the process; the sext is not optimized out afterwards and is later lowered to slli+srai. LLVM issue: llvm/llvm-project#124841.

If you write the lookup table by hand, the compiler generates better asm:

pub fn table_lookup(x: Side) -> Side {
    static T: [Side; 8] = [
        Side::Bottom, // <--
        Side::Left, // <--
        Side::Bottom,
        Side::Top, // <--
        Side::Bottom,
        Side::Bottom,
        Side::Bottom,
        Side::Right, // <--
    ];
    T[x as usize - 1]
}

example::table_lookup::h8d56712109e87652:
        lui     a1, %hi(example::table_lookup::T::h07c70895e308d45b)
        addi    a1, a1, %lo(example::table_lookup::T::h07c70895e308d45b)
        add     a0, a1, a0
        lbu     a0, -1(a0)
        ret

example::table_lookup::T::h07c70895e308d45b:
        .ascii  "\004\b\004\001\004\004\004\002"

There are no shifts, and the -1 is merged into the lbu (load byte unsigned) as its immediate offset.
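
As a sanity check that the hand-written table encodes the same mapping as the `match`, the two functions can be compared over all four variants. This is a minimal sketch: the extra `Debug` and `PartialEq` derives and the `lookups_agree` helper are assumptions added for the comparison and are not part of the issue.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Side {
    Top = 1 << 0,
    Right = 1 << 1,
    Bottom = 1 << 2,
    Left = 1 << 3,
}

pub fn opposite_match(x: Side) -> Side {
    use Side::*;
    match x {
        Top => Bottom,
        Right => Left,
        Bottom => Top,
        Left => Right,
    }
}

pub fn table_lookup(x: Side) -> Side {
    // Only indices 0, 1, 3 and 7 (discriminant - 1) are ever read;
    // the remaining slots are filler.
    static T: [Side; 8] = [
        Side::Bottom,
        Side::Left,
        Side::Bottom,
        Side::Top,
        Side::Bottom,
        Side::Bottom,
        Side::Bottom,
        Side::Right,
    ];
    T[x as usize - 1]
}

/// Hypothetical helper: true if both implementations agree on every variant.
pub fn lookups_agree() -> bool {
    [Side::Top, Side::Right, Side::Bottom, Side::Left]
        .into_iter()
        .all(|s| opposite_match(s) == table_lookup(s))
}
```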

Godbolt link.

Meta

rustc version:

1.86.0-nightly (2025-01-27 2f348cb7ce4063fa4eb4)

Labels

A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-codegen: Area: Code generation
C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
I-slow: Issue: Problems and improvements with respect to performance of generated code.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.