In-place iteration results in too big allocations #120091

Closed
@TethysSvensson

Description

(This bug report was inspired by this blog post https://blog.polybdenum.com/2024/01/17/identifying-the-collect-vec-memory-leak-footgun.html)

After #110353 landed, in-place iteration can reuse allocations in many more places. While this is a good thing, in some cases it can leave the destination vector with an overly large capacity.

Additionally, in some cases this gives the destination vector a very large capacity even when its allocation is not shared with the source. In my opinion, this should never happen.

For an example see this code:

fn dbg_vec<T>(v: &Vec<T>) {
    println!(
        "vec data ptr={:?} len={} cap={}",
        v.as_ptr(),
        v.len(),
        v.capacity()
    );
}

fn main() {
    let v1 = (0u16..128).map(|i| [i; 1024]).collect::<Vec<_>>();
    dbg_vec(&v1);
    let v2 = v1.into_iter().map(|x| x[0] as u8).collect::<Vec<_>>();
    dbg_vec(&v2);
}

On stable this code works as expected, i.e. it produces two different vectors with reasonable lengths and capacities.

On beta however, v2 will have a capacity of 262144, even though it does not share an allocation with v1. If you remove the as u8 part, then the allocations will be shared, but the capacity is still overly large.
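The figure 262144 matches the byte size of v1's buffer: 128 elements of [u16; 1024] at 2048 bytes each. The sketch below only reproduces that arithmetic. `reused_capacity` is a hypothetical helper, not a standard library function, and it assumes that v1's capacity is exactly 128 and that the destination type is not zero-sized.

```rust
use std::mem::size_of;

/// Hypothetical helper: the capacity, counted in `Dst` elements, that results
/// when a buffer sized for `src_cap` elements of `Src` is kept as-is and
/// reinterpreted as a buffer of `Dst`. Assumes `Dst` is not zero-sized.
fn reused_capacity<Src, Dst>(src_cap: usize) -> usize {
    src_cap * size_of::<Src>() / size_of::<Dst>()
}

fn main() {
    // v1 holds 128 elements of [u16; 1024], i.e. 128 * 2048 bytes.
    // Collecting into Vec<u8> while keeping that byte size:
    println!("{}", reused_capacity::<[u16; 1024], u8>(128)); // 262144

    // Without the `as u8` cast the destination is Vec<u16>:
    println!("{}", reused_capacity::<[u16; 1024], u16>(128)); // 131072
}
```

In both cases the length of v2 is only 128, so the capacity exceeds the length by a factor of 2048 or 1024 respectively.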

My suggested fix is:

  • Do not attempt to use in-place collection if the source and destination alignments do not match, as reallocation will always be necessary. On my machine, these reallocations do not appear to return the same allocation, so we might as well allocate up front to avoid a memcpy.
  • The behavior regarding capacities in the remaining cases needs to either change or be documented somewhere.
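The alignment condition in the first bullet could be expressed roughly as follows. This is a minimal sketch of the proposed check only; `alignments_match` is a hypothetical name, and the code does not reflect how the standard library's in-place collection is actually implemented.

```rust
use std::mem::align_of;

/// Hypothetical predicate for the suggested rule: only consider reusing the
/// source allocation when source and destination element types have the same
/// alignment. This is an illustration, not std's real specialization logic.
fn alignments_match<Src, Dst>() -> bool {
    align_of::<Src>() == align_of::<Dst>()
}

fn main() {
    // [u16; 1024] has the alignment of u16 (2); u8 has alignment 1,
    // so under the suggested rule this case would allocate up front.
    println!("{}", alignments_match::<[u16; 1024], u8>()); // false

    // Same alignment: in-place collection would remain a candidate, and
    // the capacity question from the second bullet still applies.
    println!("{}", alignments_match::<[u16; 1024], u16>()); // true
}
```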

Meta

I am running NixOS with rustup.

$ rustup --version --verbose
rustup 1.26.0 (1980-01-01)
info: This is the version for the rustup toolchain manager, not the rustc compiler.
info: The currently active `rustc` version is `rustc 1.77.0-nightly (6ae4cfbbb 2024-01-17)`
$ rustc +stable --version --verbose
rustc 1.75.0 (82e1608df 2023-12-21)
binary: rustc
commit-hash: 82e1608dfa6e0b5569232559e3d385fea5a93112
commit-date: 2023-12-21
host: x86_64-unknown-linux-gnu
release: 1.75.0
LLVM version: 17.0.6
$ rustc +beta --version --verbose
rustc 1.76.0-beta.5 (f732c37b4 2024-01-12)
binary: rustc
commit-hash: f732c37b4175158d3af9e2e156142ffb0bff8969
commit-date: 2024-01-12
host: x86_64-unknown-linux-gnu
release: 1.76.0-beta.5
LLVM version: 17.0.6

Metadata

Labels

  • A-collections (Area: `std::collections`)
  • C-discussion (Category: Discussion or questions that doesn't represent real issues.)
  • T-libs (Relevant to the library team, which will review and decide on the PR/issue.)
