Skip to content

RawVec sets capacity incorrectly, causing potentially many bugs #116580

Open
@jmaargh

Description

@jmaargh

There are two "capacity"s for any successful allocation: the requested capacity and the returned capacity. The Allocator trait explicitly allows these to not be the same. However, it appears that RawVec assumes they are the same.

Everywhere RawVec request an allocation, it then sets its internal cap field for the capacity to be equal to the requested capacity, not the returned capacity The canonical example of this is here, when a new RawVec is constructed with a given capacity (though the same behaviour also applies when growing allocation). The comment even states

// Allocators currently return a NonNull<[u8]> whose length
// matches the size requested. If that ever changes, the capacity
// here should change to ptr.len() / mem::size_of::<T>().

which directly contradicts the docs for the Allocator trait.

In principle, we could design RawVec to only store and report the requested capacity and that could be an entirely sound design. However, I worry that the assumption embodied by this comment has wormed its way throughout the standard library and probably the ecosystem. For example, if breaking apart a Vec-based structure to send for FFI, the docs until recently strongly suggested that Vec::capacity would return the allocated capacity (which is only true if the allocator always returns the requested capacity). In #99790 the docs are currently being updated to reflect that this isn't the case.

Another example of where this is problematic is Vec::to_raw_parts which clearly assumes that RawVec::cap is the allocator returned capacity. I worry that this assumption is implicitly present through a lot of code.

I suggest therefore

  1. Fix RawVecs allocation methods to store the capacity returned from the allocator, unfortunately requiring an extra division to do so
  2. Review the documentation of RawVec and structures that use it to ensure that it is clear that capacity() may return more than what was requested, and associated methods
  3. Looking forward: discuss whether we should guarantee that capacity() returns exactly the allocator returned capacity, and for which datastructures.

The alternative that I see is to instead commit to RawVec::cap being the requested capacity. In which case, it would be good to review where this may have been assumed elsewhere and what might need to be fixed. Happy to hear views on whether people think this would be a better path.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-allocatorsArea: Custom and system allocatorsA-collectionsArea: `std::collections`A-sliceArea: `[T]`T-libsRelevant to the library team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions