Description
There are two "capacity"s for any successful allocation: the requested capacity and the returned capacity. The Allocator
trait explicitly allows these to not be the same. However, it appears that RawVec
assumes they are the same.
Everywhere RawVec
request an allocation, it then sets its internal cap
field for the capacity to be equal to the requested capacity, not the returned capacity The canonical example of this is here, when a new RawVec
is constructed with a given capacity (though the same behaviour also applies when growing allocation). The comment even states
// Allocators currently return a
NonNull<[u8]>
whose length
// matches the size requested. If that ever changes, the capacity
// here should change toptr.len() / mem::size_of::<T>()
.
which directly contradicts the docs for the Allocator
trait.
In principle, we could design RawVec
to only store and report the requested capacity and that could be an entirely sound design. However, I worry that the assumption embodied by this comment has wormed its way throughout the standard library and probably the ecosystem. For example, if breaking apart a Vec
-based structure to send for FFI, the docs until recently strongly suggested that Vec::capacity
would return the allocated capacity (which is only true if the allocator always returns the requested capacity). In #99790 the docs are currently being updated to reflect that this isn't the case.
Another example of where this is problematic is Vec::to_raw_parts
which clearly assumes that RawVec::cap
is the allocator returned capacity. I worry that this assumption is implicitly present through a lot of code.
I suggest therefore
- Fix
RawVec
s allocation methods to store the capacity returned from the allocator, unfortunately requiring an extra division to do so - Review the documentation of
RawVec
and structures that use it to ensure that it is clear thatcapacity()
may return more than what was requested, and associated methods - Looking forward: discuss whether we should guarantee that
capacity()
returns exactly the allocator returned capacity, and for which datastructures.
The alternative that I see is to instead commit to RawVec::cap
being the requested capacity. In which case, it would be good to review where this may have been assumed elsewhere and what might need to be fixed. Happy to hear views on whether people think this would be a better path.