[CodeGen] Poor code size for switch-based tail call dispatch #141542

Open
@resistor

Description

In a reduced example that uses a switch statement to dispatch to a set of tail calls, LLVM inserts identical frame teardown code into every case of the switch. Since every non-default case here is nothing but a tail call, the teardown could and should be done once in a shared block instead of being duplicated per case. This happens at all optimization levels, including -Oz, though the details differ a bit. It affects at least RISC-V and AArch64, and at a glance it appears to affect X86-64 as well.

Godbolt example demonstrating the issue on RISCV32 and AArch64: https://godbolt.org/z/sdWreMGxn

Source code, reduced from a real example:

struct bar {};
struct foo {
    unsigned char a;
    bar *b;
};

int f1(foo*, bar*);
int f2(foo*, bar*);
int f3(foo*, bar*);
int f4(foo*, bar*);
int f5(foo*, bar*);
int f6(foo*, bar*);
bar* unseal(bar*);

int func(foo *f) {
    bar* b = unseal(f->b);
    switch (f->a) {
        case 0:
          return f1(f, b);
        case 1:
          return f2(f, b);
        case 2:
          return f3(f, b);
        case 3:
          return f4(f, b);
        case 4:
          return f5(f, b);
        case 5:
          return f6(f, b);
        default:
          return -1;
    }
}

Snippet from RISCV32 showing redundant frame teardown:

...
.L5:
        ld      s0,0(sp)
        ld      ra,8(sp)
        addi    sp,sp,16
        tail    _Z2f5P3fooP3bar
.L3:
        ld      s0,0(sp)
        ld      ra,8(sp)
        addi    sp,sp,16
        tail    _Z2f6P3fooP3bar
.L9:
        ld      s0,0(sp)
        ld      ra,8(sp)
        addi    sp,sp,16
        tail    _Z2f1P3fooP3bar
...
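For comparison, here is a minimal source-level sketch of the same dispatch written with a single call site, so that there is only one place where an epilogue plus tail call can be emitted. This is not part of the reduced example and is not a proposed fix for the codegen issue. The name `func_table`, the `handlers` table, and the stub bodies for `f1`..`f6` and `unseal` are assumptions added so the snippet compiles on its own. Whether a given compiler actually emits a single teardown sequence for it has not been checked here.

```cpp
#include <cstddef>

struct bar {};
struct foo {
    unsigned char a;
    bar *b;
};

// Hypothetical stand-in bodies; in the report these are external declarations.
int f1(foo*, bar*) { return 1; }
int f2(foo*, bar*) { return 2; }
int f3(foo*, bar*) { return 3; }
int f4(foo*, bar*) { return 4; }
int f5(foo*, bar*) { return 5; }
int f6(foo*, bar*) { return 6; }
bar* unseal(bar* b) { return b; }

using handler = int (*)(foo*, bar*);

// Table indexed by f->a; the order matches cases 0..5 of the original switch.
static constexpr handler handlers[] = {f1, f2, f3, f4, f5, f6};

int func_table(foo *f) {
    bar* b = unseal(f->b);
    constexpr std::size_t n = sizeof(handlers) / sizeof(handlers[0]);
    std::size_t i = f->a;
    if (i >= n)
        return -1;              // same result as the default case
    return handlers[i](f, b);   // single call in tail position
}
```

The trade-off is that the direct tail calls become one indirect call through a table, which may or may not be acceptable for the real code this was reduced from.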
