u32::from_be_bytes(*bytes) generates suboptimal code on riscv #88852

Closed
@xobs

Description

I tried this code:

pub fn from_be_slice_manual(bytes: &[u8; 4]) -> u32 {
    (bytes[0] as u32) << 24
    | ((bytes[1] as u32) << 16)
    | ((bytes[2] as u32) << 8)
    | (bytes[3] as u32)
}

pub fn from_be_slice_intrinsic(bytes: &[u8; 4]) -> u32 {
    u32::from_be_bytes(*bytes)
}
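As a sanity check that the two functions are meant to be interchangeable, here is a minimal sketch that compares them on a given input. The `agree_on` helper is hypothetical and not part of the original report; it only illustrates that the two functions compute the same value, so any difference is purely in the generated code.

```rust
/// Manual big-endian decode, as in the report.
pub fn from_be_slice_manual(bytes: &[u8; 4]) -> u32 {
    (bytes[0] as u32) << 24
        | ((bytes[1] as u32) << 16)
        | ((bytes[2] as u32) << 8)
        | (bytes[3] as u32)
}

/// Standard-library big-endian decode, as in the report.
pub fn from_be_slice_intrinsic(bytes: &[u8; 4]) -> u32 {
    u32::from_be_bytes(*bytes)
}

/// Hypothetical helper: true when both versions return the same value.
pub fn agree_on(bytes: [u8; 4]) -> bool {
    from_be_slice_manual(&bytes) == from_be_slice_intrinsic(&bytes)
}
```

For example, `agree_on([0x12, 0x34, 0x56, 0x78])` should hold, with both functions returning `0x12345678`.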

I expected to see this happen: Both functions should produce equivalent assembly, or at the very least the one using the intrinsic should be acceptable. On x86 they compile to the same output, and on arm-unknown-linux-gnueabi the output differs but is not terrible. On riscv64gc-unknown-linux-gnu, however, the intrinsic version compiles to 26 instructions, compared with 11 for the manual version.

Instead, this happened:

example::from_be_slice_manual:
        lb      a1, 0(a0)
        lbu     a2, 1(a0)
        slli    a1, a1, 24
        lbu     a3, 2(a0)
        slli    a2, a2, 16
        lbu     a0, 3(a0)
        or      a1, a1, a2
        slli    a2, a3, 8
        or      a1, a1, a2
        or      a0, a0, a1
        ret

example::from_be_slice_intrinsic:
        lbu     a1, 1(a0)
        lbu     a2, 0(a0)
        lb      a3, 3(a0)
        lbu     a0, 2(a0)
        slli    a1, a1, 8
        or      a1, a1, a2
        slli    a2, a3, 8
        or      a0, a0, a2
        slli    a0, a0, 16
        or      a0, a0, a1
        slli    a1, a0, 8
        addi    a2, zero, 255
        slli    a3, a2, 32
        and     a1, a1, a3
        slli    a3, a0, 24
        slli    a4, a2, 40
        and     a3, a3, a4
        or      a1, a1, a3
        slli    a3, a0, 40
        slli    a2, a2, 48
        and     a2, a2, a3
        slli    a0, a0, 56
        or      a0, a0, a2
        or      a0, a0, a1
        srli    a0, a0, 32
        ret
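
Reading the listing above, the intrinsic version appears to assemble the four bytes into a little-endian word first, then byte-swap that word inside a 64-bit register using shifts and masks, and finally shift the result right by 32. The sketch below models that sequence in Rust. It is my interpretation of the assembly, not something confirmed by the report, and the function names are hypothetical.

```rust
/// Models the shift-and-mask sequence from the listing: each byte of the
/// low 32 bits of `w` is moved into the upper half of a 64-bit value in
/// reversed order, then the result is shifted back down by 32.
pub fn bswap32_in_64bit_register(w: u64) -> u64 {
    let b3 = (w << 8) & (0xff_u64 << 32); // bits 24..31 -> bits 32..39
    let b2 = (w << 24) & (0xff_u64 << 40); // bits 16..23 -> bits 40..47
    let b1 = (w << 40) & (0xff_u64 << 48); // bits 8..15  -> bits 48..55
    let b0 = w << 56; // bits 0..7   -> bits 56..63
    (b0 | b1 | b2 | b3) >> 32
}

/// Models the whole function: a little-endian load (sign-extended, since
/// the listing loads the top byte with `lb`), followed by the swap above.
pub fn from_be_model(bytes: &[u8; 4]) -> u32 {
    let word = u32::from_le_bytes(*bytes) as i32 as i64 as u64;
    bswap32_in_64bit_register(word) as u32
}
```

Under this reading, the sign-extension bits never reach the result, because every mask and shift selects only bits 0..31 of the loaded word. The extra instructions come from the byte swap itself, which the manual version avoids by placing each byte directly at its final position.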

Meta

rustc --version --verbose:

rustc 1.55.0 (c8dfcfe04 2021-09-06)
binary: rustc
commit-hash: c8dfcfe046a7680554bf4eb612bad840e7631c4b
commit-date: 2021-09-06
host: x86_64-unknown-linux-gnu
release: 1.55.0
LLVM version: 12.0.1

Godbolt link: https://godbolt.org/z/aPPdnond5

Metadata

    Labels

    A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
    A-codegen: Area: Code generation
    C-bug: Category: This is a bug.
    I-heavy: Issue: Problems and improvements with respect to binary size of generated code.
    O-riscv: Target: RISC-V architecture
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
