Extra null check on as_bytes() iteration with break #25306

Closed
@raphlinus

Code generation is less than ideal for a function that iterates over a string's as_bytes() and contains a break: on every iteration, the loop null-checks the pointer to the byte before loading it.

Here's the function and the asm:

pub fn find_nl(s: &str) -> usize {
    let mut i = 0;
    for &c in s.as_bytes() {
        if c == b'\n' { break; }
        i += 1;
    }
    i
}

.LBB0_4:
    movq    %rdx, %rsi
    addq    %rax, %rsi
    je  .LBB0_7
    movzbl  (%rdx,%rax), %esi
    cmpl    $10, %esi
    je  .LBB0_7
    incq    %rax
    cmpq    %rax, %rcx
    jne .LBB0_4
.LBB0_7:

The function and two other versions (which generate better code) are at http://is.gd/E4i7z7. A version that uses enumerate doesn't have the null check, but has an extra movq. A version that uses position instead of a hand-written loop generates perfect code. For context, this issue arose while writing a Markdown parser. The current code uses position(), and the speedup was significant.
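
For readers who cannot open the link, here is a rough sketch of what the enumerate and position variants might look like. These are reconstructions, not the linked code: the function names `find_nl_enumerate` and `find_nl_position` are made up for illustration, and the codegen they produce may differ from what is described above depending on the compiler version.

```rust
// Hypothetical reconstructions of the two alternative versions; the actual
// code behind the link may differ.

/// Variant using `enumerate`: the index is supplied by the iterator instead
/// of a hand-maintained counter.
pub fn find_nl_enumerate(s: &str) -> usize {
    for (i, &c) in s.as_bytes().iter().enumerate() {
        if c == b'\n' {
            return i;
        }
    }
    // No newline found: same result as the original loop running to the end.
    s.len()
}

/// Variant using `position`: the search loop is left to the iterator adaptor.
pub fn find_nl_position(s: &str) -> usize {
    s.as_bytes()
        .iter()
        .position(|&c| c == b'\n')
        .unwrap_or(s.len())
}
```

Both return the byte index of the first newline, or the length of the string when there is none, which matches the behavior of find_nl above.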

Labels: A-codegen (Area: Code generation)
