Description
(This bug report was inspired by this blog post https://blog.polybdenum.com/2024/01/17/identifying-the-collect-vec-memory-leak-footgun.html)
After #110353 landed, in-place iteration can reuse allocations in many more places. While this is a good thing, in some cases it can leave the destination vector with an overly large capacity.
Additionally, in some cases this will cause the destination vector to have a very large capacity even for non-shared allocations. In my opinion, this should never happen.
For an example, see this code:
fn dbg_vec<T>(v: &Vec<T>) {
    println!(
        "vec data ptr={:?} len={} cap={}",
        v.as_ptr(),
        v.len(),
        v.capacity()
    );
}

fn main() {
    let v1 = (0u16..128).map(|i| [i; 1024]).collect::<Vec<_>>();
    dbg_vec(&v1);
    let v2 = v1.into_iter().map(|x| x[0] as u8).collect::<Vec<_>>();
    dbg_vec(&v2);
}
On stable, this code works as expected, i.e. it produces two different vectors with reasonable lengths and capacities.
On beta however, `v2` will have a capacity of 262144, even though it does not share an allocation with `v1`. If you remove the `as u8` part, then the allocations will be shared, but the capacity is still overly large.
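The reported capacity of 262144 matches the byte size of the `v1` allocation divided by the size of a `u8`, which suggests the capacity is derived from the source buffer. Here is a minimal sketch of that arithmetic; `reused_capacity` is a hypothetical helper written for illustration, not a function from the standard library, and it assumes the destination type is not zero-sized:

```rust
use std::mem::size_of;

/// Hypothetical helper (not part of std): the capacity, counted in `Dst`
/// elements, of a buffer that originally held `src_cap` elements of `Src`.
/// Assumes `Dst` is not a zero-sized type.
fn reused_capacity<Src, Dst>(src_cap: usize) -> usize {
    src_cap * size_of::<Src>() / size_of::<Dst>()
}

fn main() {
    // v1 holds 128 elements of [u16; 1024], i.e. 128 * 2048 bytes.
    let cap_u8 = reused_capacity::<[u16; 1024], u8>(128);
    println!("cap as u8  = {cap_u8}"); // prints "cap as u8  = 262144"

    // Without the `as u8` cast the destination element type is u16.
    let cap_u16 = reused_capacity::<[u16; 1024], u16>(128);
    println!("cap as u16 = {cap_u16}"); // prints "cap as u16 = 131072"
}
```

Either way the destination only needs room for 128 elements, so the capacity is three orders of magnitude larger than the length.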
My suggested fix is:
- Do not attempt to use in-place collection if the source and destination alignments do not match, as reallocation will always be necessary. On my machine, these reallocations do not appear to return the same allocation, so we might as well allocate up front to avoid a memcpy.
- The behavior regarding capacities in the remaining cases needs to either change or be documented somewhere.
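The first suggestion could be expressed as a compile-time alignment comparison. The following is only a sketch of the idea: `alignments_match` is a hypothetical name, and this is not how the standard library's in-place collection specialization is actually written.

```rust
use std::mem::align_of;

/// Hypothetical guard (not the std implementation): in-place collection
/// would only be attempted when source and destination element types
/// share the same alignment.
fn alignments_match<Src, Dst>() -> bool {
    align_of::<Src>() == align_of::<Dst>()
}

fn main() {
    // [u16; 1024] has the alignment of u16, which differs from u8,
    // so this case would fall back to a fresh allocation.
    println!("{}", alignments_match::<[u16; 1024], u8>());

    // [u16; 1024] -> u16 keeps the same alignment, so the allocation
    // could still be reused (leaving the capacity question open).
    println!("{}", alignments_match::<[u16; 1024], u16>());
}
```

Under this check, the example from the description would take the regular allocation path, while the variant without `as u8` would still be covered by the second suggestion.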
Meta
I am running NixOS with rustup.
$ rustup --version --verbose
rustup 1.26.0 (1980-01-01)
info: This is the version for the rustup toolchain manager, not the rustc compiler.
info: The currently active `rustc` version is `rustc 1.77.0-nightly (6ae4cfbbb 2024-01-17)`
$ rustc +stable --version --verbose
rustc 1.75.0 (82e1608df 2023-12-21)
binary: rustc
commit-hash: 82e1608dfa6e0b5569232559e3d385fea5a93112
commit-date: 2023-12-21
host: x86_64-unknown-linux-gnu
release: 1.75.0
LLVM version: 17.0.6
$ rustc +beta --version --verbose
rustc 1.76.0-beta.5 (f732c37b4 2024-01-12)
binary: rustc
commit-hash: f732c37b4175158d3af9e2e156142ffb0bff8969
commit-date: 2024-01-12
host: x86_64-unknown-linux-gnu
release: 1.76.0-beta.5
LLVM version: 17.0.6