Inefficient codegen for checking top bits of a 64-bit integer #62145

Open
@dzaima

The code

#include<stdint.h>
#include<stdbool.h>
void ext1(void);
void ext2(void);
static bool check(uint64_t x) {
    return (x>>48) == 0xfff7;
}
void f(uint64_t a, uint64_t b) {
    if (check(a)) ext1();
    ext2();
    if (check(b)) ext1();
}

compiles (x86-64, -O3; compiler explorer) to

f:                                      # @f
        push    r15
        push    r14
        push    rbx
        mov     rbx, rsi
        movabs  r14, -281474976710656
        movabs  r15, -2533274790395904
        shr     rdi, 48
        cmp     edi, 65527
        jne     .LBB0_2
        call    ext1@PLT
.LBB0_2:
        call    ext2@PLT
        and     rbx, r14
        cmp     rbx, r15
        jne     .LBB0_3
        pop     rbx
        pop     r14
        pop     r15
        jmp     ext1@PLT                        # TAILCALL
.LBB0_3:
        pop     rbx
        pop     r14
        pop     r15
        ret

which uses (x>>48) == 0xfff7 for the first check call and (x&0xffff000000000000) == 0xfff7000000000000 for the second. The masked form needs two 64-bit constants, which are loaded at the start of the function into r14 and r15; that in turn lengthens the prologue and epilogue, because those callee-saved registers have to be pushed and popped. The constants are only used by the second check, after the call to ext2, so they don't need to be loaded at the start of the function at all.
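For reference, here is a small sketch of the two forms side by side, to make it easier to see that they test the same thing and how the decimal movabs immediates map onto the hex masks. The names TOP16_MASK, TOP16_TARGET, check_shift, check_mask and forms_agree are made up for illustration and are not part of the original reproducer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names. These are the two movabs immediates from the
   assembly, written in hex instead of signed decimal. */
#define TOP16_MASK   UINT64_C(0xffff000000000000) /* -281474976710656  as int64 */
#define TOP16_TARGET UINT64_C(0xfff7000000000000) /* -2533274790395904 as int64 */

/* Form emitted for the first call: shift the top 16 bits down, then compare
   against a small immediate. */
static bool check_shift(uint64_t x) {
    return (x >> 48) == 0xfff7;
}

/* Form emitted for the second call: mask the top 16 bits in place, then
   compare against a 64-bit constant. */
static bool check_mask(uint64_t x) {
    return (x & TOP16_MASK) == TOP16_TARGET;
}

/* Hypothetical helper: true when both forms give the same answer for x. */
static bool forms_agree(uint64_t x) {
    return check_shift(x) == check_mask(x);
}
```

Both forms depend only on the top 16 bits of x, so they agree for every input; the difference is only that the shift form fits its constant in a cmp immediate, while the mask form needs two registers to hold its constants.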

Simpler example, but without the prologue issue: compiler explorer

void f(uint64_t a, uint64_t b) {
    if (check(a) || check(b)) ext1();
}
