
Missing MIR optimization: Replace matches with loads if possible #88793

Open
@pcwalton


People often write matches on enums that are really just loads from a table. It'd be nice if we could codegen them as such.

Here's an example, at https://godbolt.org/z/4P9v61an7:

pub enum Char {
    A,
    B,
    C,
    D,
    E,
    F,
    G,
    H,
    I,
    J,
    K,
    L,
}

pub fn to_str(val: Char) -> &'static str {
    match val {
        Char::A => "A",
        Char::B => "B",
        Char::C => "C",
        Char::D => "D",
        Char::E => "E",
        Char::F => "F",
        Char::G => "G",
        Char::H => "H",
        Char::I => "I",
        Char::J => "J",
        Char::K => "K",
        Char::L => "L",
    }
}

The codegen here leaves a lot to be desired:

example::to_str:
        lea     rax, [rip + .L__unnamed_1]
        movzx   ecx, dil
        lea     rdx, [rip + .LJTI0_0]
        movsxd  rcx, dword ptr [rdx + 4*rcx]
        add     rcx, rdx
        jmp     rcx
.LBB0_1:
        lea     rax, [rip + .L__unnamed_2]
        mov     edx, 1
        ret
.LBB0_2:
        lea     rax, [rip + .L__unnamed_3]
        mov     edx, 1
        ret
.LBB0_3:
        lea     rax, [rip + .L__unnamed_4]
        mov     edx, 1
        ret
.LBB0_4:
        lea     rax, [rip + .L__unnamed_5]
        mov     edx, 1
        ret
.LBB0_5:
        lea     rax, [rip + .L__unnamed_6]
        mov     edx, 1
        ret
.LBB0_6:
        lea     rax, [rip + .L__unnamed_7]
        mov     edx, 1
        ret
.LBB0_7:
        lea     rax, [rip + .L__unnamed_8]
        mov     edx, 1
        ret
.LBB0_8:
        lea     rax, [rip + .L__unnamed_9]
        mov     edx, 1
        ret
.LBB0_9:
        lea     rax, [rip + .L__unnamed_10]
        mov     edx, 1
        ret
.LBB0_10:
        lea     rax, [rip + .L__unnamed_11]
        mov     edx, 1
        ret
.LBB0_11:
        lea     rax, [rip + .L__unnamed_12]
.LBB0_12:
        mov     edx, 1
        ret
.LJTI0_0:
        .long   .LBB0_12-.LJTI0_0
        .long   .LBB0_1-.LJTI0_0
        .long   .LBB0_2-.LJTI0_0
        .long   .LBB0_3-.LJTI0_0
        .long   .LBB0_4-.LJTI0_0
        .long   .LBB0_5-.LJTI0_0
        .long   .LBB0_6-.LJTI0_0
        .long   .LBB0_7-.LJTI0_0
        .long   .LBB0_8-.LJTI0_0
        .long   .LBB0_9-.LJTI0_0
        .long   .LBB0_10-.LJTI0_0
        .long   .LBB0_11-.LJTI0_0

.L__unnamed_12:
        .byte   76

.L__unnamed_11:
        .byte   75

.L__unnamed_10:
        .byte   74

.L__unnamed_9:
        .byte   73

.L__unnamed_8:
        .byte   72

.L__unnamed_7:
        .byte   71

.L__unnamed_6:
        .byte   70

.L__unnamed_5:
        .byte   69

.L__unnamed_4:
        .byte   68

.L__unnamed_3:
        .byte   67

.L__unnamed_2:
        .byte   66

.L__unnamed_1:
        .byte   65
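
For comparison, the table load being asked for can be written by hand. This is only a sketch of the source-level shape, not what rustc or LLVM emits: `NAMES` and `to_str_table` are hypothetical names that do not appear in the original example, and whether the bounds check on the index is elided is left to the optimizer.

```rust
pub enum Char {
    A, B, C, D, E, F, G, H, I, J, K, L,
}

// Hypothetical table (not part of the original example), indexed by discriminant.
// Variants without explicit discriminants are numbered 0, 1, 2, ... in declaration order.
static NAMES: [&str; 12] = [
    "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
];

pub fn to_str_table(val: Char) -> &'static str {
    // One indexed read from the table instead of a twelve-arm match.
    NAMES[val as usize]
}
```

Calling `to_str_table(Char::C)` returns `"C"`, the same result as the `match` version. The point of the issue is that the compiler could derive this form itself rather than requiring it to be written out.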

Deriving Debug can cause poor codegen too: https://godbolt.org/z/xnexGxo8e
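
The same idea applies to `Debug` on a fieldless enum. Below is a minimal hand-written sketch, assuming a hypothetical `VARIANT_NAMES` table; it is not what `#[derive(Debug)]` expands to, only an illustration of the table-driven form such an impl could take.

```rust
use std::fmt;

// `Copy` is derived so that `*self` can be cast to an integer below.
#[derive(Clone, Copy)]
pub enum Char {
    A, B, C, D, E, F, G, H, I, J, K, L,
}

// Hypothetical table of variant names, in declaration order.
static VARIANT_NAMES: [&str; 12] = [
    "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
];

impl fmt::Debug for Char {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Look the name up by discriminant instead of matching on every variant.
        f.write_str(VARIANT_NAMES[*self as usize])
    }
}
```

With this impl, `format!("{:?}", Char::C)` produces `"C"`.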

There's some discussion on Twitter from LLVM folks that suggests this would be best done as a MIR optimization: https://twitter.com/pcwalton/status/1436036809603960835

Metadata

Labels

A-mir-opt (Area: MIR optimizations)
C-enhancement (Category: An issue proposing an enhancement or a PR with one.)
C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
