Skip to content

ACP: PtrRange<T> type #423

Closed as not planned
Closed as not planned
@clarfonthey

Description

@clarfonthey

Proposal

Problem statement

Pointer ranges are a very useful way of working with slices, and they're currently how the standard library implements many of the slice iterator types.

However, these ranges have a problem that is only made more clear with the move toward strict provenance: there is no guarantee that the two pointers in the range are actually related to each other, or that they refer to the same buffer. While there are some efforts that have been made to rectify this within the standard library itself, it would be nice to have a dedicated pointer range type whose internals can be modified based upon whatever is most easily optimisable by the compiler.

Motivating examples or use cases

  • Pointer ranges allow efficiently indexing from both the start and the end of a slice, which is what makes them ideal for implementing iterators.
  • Code that uses this type can be certain that the two pointers are related and share the same provenance, instead of making this a precondition for every function that uses them.
  • The optimisations from refining this type can not only be reaped by the standard library, but the entire Rust ecosystem

Solution sketch

The precedent from NonNull implies that the pointer range type should serves the purpose of both *const T and *mut T ranges. Additionally, NonNull should be implied, since Rust requires that allocations be non-null; this would enable niche optimisation for things like Option<PtrRange<T>>.

The internal layout for this type can initially just be the naïve:

struct PtrRange<T> {
    start: NonNull<T>,
    end: *const T,
}

(Note: The use of *const T for the second argument preserves covariance while also allowing working as you'd expect for ZSTs, which use the offset to encode the length of the range.)

Although further refinements can be made to, for example, make end just a usize whose provenance gets taken from start. To allow these kinds of optimisations, these fields should not be made public, and instead have methods:

impl<T> PtrRange<T> {
    // SAFETY: User must guarantee pointers have the same provenance, which also implicitly guarantees they are non-null. They should also be a multiple of `size_of::<T>()` apart.
    unsafe fn from_ptr_range(range: Range<*const T>) -> PtrRange<T>;
    unsafe fn from_mut_ptr_range(range: Range<*mut T>) -> PtrRange<T>;
    unsafe fn from_non_null_range(range: Range<NonNull<T>>) -> PtrRange<T>;

    fn start(self) -> NonNull<T>;
    fn end(self) -> NonNull<T>;
    fn start_ptr(self) -> *const T;
    fn end_ptr(self) -> *const T;
    fn start_mut_ptr(self) -> *mut T;
    fn end_mut_ptr(self) -> *mut T;
}

Presumably, there could be a few methods on these that are similar to slices, with the addition of back-indexing.

impl<T> PtrRange<T> {
    fn len(self) -> usize;
    fn is_empty(self) -> bool;

    fn get(self, idx: usize) -> Option<NonNull<T>>;
    fn get_back(self, idx: usize) -> Option<NonNull<T>>;

    unsafe fn get_unchecked(self, idx: usize) -> NonNull<T>;
    unsafe fn get_unchecked_back(self, idx: usize) -> NonNull<T>;

    fn index(self, idx: usize) -> NonNull<T>;
    fn index_back(self, idx: usize) -> NonNull<T>;
}

Presumably, over time, more methods could be added. I don't want to prescribe too much at this stage since I'm not sure what people would use them for.

Alternatives

There are a few options for this:

  1. Make the start and end fields public. This feels like a bad option: if we're going to optimise the provenance here, then why do so in a struct which is clearly identical to Range? We could just optimise Range instead. Plus, this would also require something like unsafe fields, since modifying the fields is unsafe.
  2. Optimise Range, somehow. This would probably involve adding some weird trickery to the compiler to allow relating the provenance of fields on one struct, and keeping that tagged throughout the entire flow of the program. We've done similar things with the borrow checker, but that feels like a lot of work and unlikely to get done any time soon.
  3. Don't optimise Range, but add methods on it similar to the slice operations. We already have len and is_empty for them (technically, via ExactSizeIterator), so we're already partially there.
  4. Do nothing.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions