Unnecessary on-stack copy when calling Box<FnOnce()> closure #61042

Closed
@pftbest

When a boxed closure is called, its captured data is copied from the heap to the stack before the call. This happens both with a trait object and with a concrete type inside the box.

#[inline(never)]
pub fn do_call<F>(b: Box<F>) where F: FnOnce() {
//pub fn do_call(b: Box<dyn FnOnce()>) {
    b();
}

pub unsafe fn foo(large: [u32; 1000]) {
    do_call(
        Box::new(move || {
            dummy(&large);
        })
    );
}

extern "C" {
    fn dummy(data: &[u32; 1000]);
}

example::do_call:
        push    r14
        push    rbx
        sub     rsp, 4008
        mov     rbx, rdi
        lea     r14, [rsp + 8]
        mov     edx, 4000
        mov     rdi, r14
        mov     rsi, rbx
        call    qword ptr [rip + memcpy@GOTPCREL]
        mov     rdi, r14
        call    qword ptr [rip + dummy@GOTPCREL]
        mov     esi, 4000
        mov     edx, 4
        mov     rdi, rbx
        call    qword ptr [rip + __rust_dealloc@GOTPCREL]
        add     rsp, 4008
        pop     rbx
        pop     r14
        ret

Godbolt link: https://godbolt.org/z/srCEBf

I think this extra memcpy is unnecessary, and it should be possible to call boxed closures without moving them out of the heap.

Similar C++ code doesn't make any extra copies in do_call (even when compiled without optimizations, at -O0), godbolt link: https://godbolt.org/z/VY8OLn
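
For comparison, here is a minimal sketch of a possible workaround, assuming the closure only needs to read its captures and can therefore be bounded by `FnMut` instead of `FnOnce`. `FnOnce::call_once` takes `self` by value, which is why the closure is moved out of the box in the code above, while `FnMut::call_mut` takes `&mut self`, so the call can go through a reference to the heap allocation. The names `do_call_mut` and `sum_boxed` are hypothetical, and the closure sums the array rather than calling the extern `dummy` so that the example runs on its own. Whether the memcpy actually disappears depends on the compiler version and optimization level, so the generated assembly should be checked.

```rust
// Hypothetical variant of `do_call`: the bound is `FnMut` instead of `FnOnce`,
// so the call goes through `&mut F` and does not consume the closure.
#[inline(never)]
pub fn do_call_mut<F>(mut b: Box<F>) -> u64
where
    F: FnMut() -> u64,
{
    // `FnMut::call_mut` takes `&mut self`, so the closure is borrowed in place
    // rather than moved out of the box.
    (*b)()
}

// Hypothetical stand-in for `foo`: sums the captured array instead of calling
// the extern `dummy`, so no C code needs to be linked.
pub fn sum_boxed(large: [u32; 1000]) -> u64 {
    do_call_mut(Box::new(move || {
        large.iter().map(|&x| u64::from(x)).sum::<u64>()
    }))
}
```

This does not help when the closure really has to consume its captures, which is the case `FnOnce` exists for, so it is a workaround and not a fix for the issue itself.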

Labels

A-closures: Area: Closures (`|…| { … }`)
C-enhancement: Category: An issue proposing an enhancement or a PR with one.
F-unsized_locals: `#![feature(unsized_locals)]`
I-slow: Issue: Problems and improvements with respect to performance of generated code.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
