
## Pre-Pre-RFC: core::arch::{load, store} and stricter volatile semantics #321

Opened by @DemiMarie

### core::arch::{load, store}

This proposes new load and store functions in core::arch for raw hardware loads and stores, along with the concept of an always-valid type: one that can safely be cast to and from a byte array. It also defines volatile accesses in terms of these functions.

The functions proposed here have the same semantics as raw machine load and store instructions. The compiler is not permitted to assume that the values loaded or stored are initialized, or even that they point to valid memory. However, it is permitted to assume that load and store do not violate Rust’s mutability rules.

In particular, it is valid to use these functions to manipulate memory that is being concurrently accessed or modified by any means whatsoever. Therefore, they can be used to access memory that is shared with untrusted code. For example, a kernel could use them to access userspace memory, and a user-mode server could use them to access memory shared with a less-privileged user-mode process. It is also safe to use these functions to manipulate memory that is being concurrently accessed via DMA, or that corresponds to a memory-mapped hardware register.
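
As a minimal sketch of the shared-memory use case (using the proposed functions, which do not yet exist; the buffer layout is hypothetical), a kernel-style component might copy data out of a buffer shared with untrusted code like this:

```rust
use core::arch::load; // proposed API, not yet in core

/// Copy bytes out of a buffer shared with untrusted code.
///
/// Each byte is read with exactly one real load, so a concurrent writer on
/// the other side of the shared mapping cannot cause undefined behavior
/// here; at worst this observes an arbitrary mix of old and new bytes.
unsafe fn copy_from_shared(src: *const u8, dst: &mut [u8]) {
    for (i, slot) in dst.iter_mut().enumerate() {
        *slot = load(src.add(i));
    }
}
```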

The core guarantee that makes load and store useful is this: a call to load or store is guaranteed to result in exactly one non-tearing, non-interlocked load from or store to the exact address passed to the function, no matter what that address happens to be. To ensure this, load and store are considered partially opaque to the optimizer. The optimizer must treat them as calls to functions that may or may not dereference their arguments. It is even possible that the operation triggers a hardware fault that some other code catches and recovers from. Hence, the compiler can never prove that a given call to core::arch::load or core::arch::store will have undefined behavior. In other respects, a call to load or store does not disable any optimizations that a call to an unknown function with the same arguments would not also disable. In short: garbage in, garbage out.
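
For example (a sketch using the proposed load, whose signature is given below; the status-bit layout is invented for illustration), this guarantee is what makes a polling loop over device memory sound: each iteration performs exactly one real load, which the compiler can neither hoist out of the loop nor assume to return the same value twice.

```rust
use core::arch::load; // proposed API, not yet in core
use core::hint::spin_loop;

const DMA_DONE: u32 = 1 << 0; // hypothetical status bit, for illustration

/// Spin until a DMA engine sets the "done" bit in its status register.
unsafe fn wait_for_dma(status: *const u32) {
    // Each call is exactly one non-tearing 32-bit load from `status`,
    // so the loop actually observes updates made by the device.
    while load(status) & DMA_DONE == 0 {
        spin_loop();
    }
}
```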

The actual functions are as follows:

```rust
unsafe fn load<T>(ptr: *const T) -> T;
unsafe fn store<T>(ptr: *mut T, arg: T);
```

Each call performs a single memory access of size size_of::<T>() on ptr. The compiler must compile each of these function calls into exactly one machine instruction. If this is not possible, it is a compile-time error. The set of types T for which the compiler can generate such code depends on the target architecture. Using a T that cannot safely be transmuted to or from a byte array is not forbidden, but it is often erroneous and therefore triggers a lint (see below). Provided that ptr is properly aligned, these functions are guaranteed not to cause tearing. If ptr is not properly aligned, the results are architecture-dependent.
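
As a sketch of what this means in practice (the register address is invented for illustration), writing a u32 to a 32-bit memory-mapped register must compile to exactly one 32-bit store:

```rust
use core::arch::store; // proposed API, not yet in core

const UART_TX: *mut u32 = 0x1000_0000 as *mut u32; // hypothetical MMIO address

/// Write one byte to a hypothetical UART transmit register.
unsafe fn uart_send(byte: u8) {
    // Exactly one 32-bit store instruction; the compiler may not split,
    // widen, or elide it, nor assume UART_TX is dereferenceable.
    store(UART_TX, byte as u32);
}
```

By contrast, store with a type such as [u8; 64] would be a compile-time error on any target that has no single instruction capable of storing 64 bytes.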

The optimizer is not permitted to assume that ptr is dereferenceable or that it is properly aligned. This allows these functions to be used for in-process debuggers, crash dumpers, and other applications that may need to access memory at addresses obtained from some external source, such as a debug console or /proc/self/maps. If load is used to violate the aliasing rules (by accessing memory the compiler thinks cannot be accessed), the value returned may be non-deterministic and may contain sensitive data. If store is used to overwrite memory the compiler can assume will not be modified, subsequent execution (after the call to store returns) has undefined behavior.
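
A sketch of the debugger-style use case (the address source and fault-recovery machinery are assumptions, not part of this proposal): probing an address parsed from /proc/self/maps or a debug console. The optimizer must emit the load even though it knows nothing about the address; actually surviving a fault on an unmapped page still requires a signal or exception handler installed elsewhere.

```rust
use core::arch::load; // proposed API, not yet in core

/// Probe one byte at an externally supplied address.
///
/// The compiler may not assume `addr` is dereferenceable or aligned, so the
/// load is always emitted; if the page is unmapped, the resulting hardware
/// fault must be caught by a handler the caller has set up.
unsafe fn probe_byte(addr: usize) -> u8 {
    load(addr as *const u8)
}
```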

### The semantics of volatile

A call to ptr::read_volatile desugars to one or more calls to load, and a call to ptr::write_volatile desugars to one or more calls to store. The compiler is required to minimize tearing to the extent possible, provided that doing so does not require interlocked or otherwise synchronized instructions. A new const fn core::arch::volatile_non_tearing::<T>() -> bool returns true if T is such that tearing cannot occur for naturally-aligned accesses. Tearing may still occur for unaligned accesses (see below).
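
As a non-normative illustration of the desugaring, on a hypothetical 32-bit, little-endian target with no single 64-bit load instruction, volatile_non_tearing::<u64>() would return false and ptr::read_volatile::<u64> would lower to two calls to load:

```rust
use core::arch::load; // proposed API, not yet in core

/// What ptr::read_volatile::<u64>(ptr) might desugar to on a 32-bit,
/// little-endian target; the access tears into two 32-bit halves.
unsafe fn read_volatile_u64_desugared(ptr: *const u64) -> u64 {
    let p = ptr as *const u32;
    let lo = load(p) as u64;
    let hi = load(p.add(1)) as u64;
    (hi << 32) | lo
}
```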

### Unaligned volatile access

The compiler is not allowed to assume that the arguments of core::{ptr::{read_volatile, write_volatile}, arch::{load, store}} are aligned. However, it is also not required to generate code to handle unaligned access, if doing so would cause a performance penalty for the aligned case. In particular, whether the no-tearing guarantee applies to unaligned access is architecture dependent. On some architectures, it is even possible for unaligned access to cause a hardware trap.
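
For callers that need the no-tearing guarantee, one defensive pattern (a sketch, not part of the proposal) is to check alignment first and fall back to byte-wise loads, accepting that the reassembled value may mix old and new data:

```rust
use core::arch::load; // proposed API, not yet in core
use core::mem::align_of;

/// Read a u32 through a pointer that may or may not be naturally aligned.
unsafe fn read_u32_maybe_unaligned(ptr: *const u32) -> u32 {
    if ptr as usize % align_of::<u32>() == 0 {
        // Aligned: one non-tearing 32-bit load.
        load(ptr)
    } else {
        // Unaligned: four independent byte loads, reassembled in native order.
        let p = ptr as *const u8;
        u32::from_ne_bytes([load(p), load(p.add(1)), load(p.add(2)), load(p.add(3))])
    }
}
```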

### New lints

Use of core::ptr::{read_volatile, write_volatile} with a type that cannot be safely transmuted to and from a byte array will trigger a dubious_type_in_volatile lint. Use of core::arch::{load, store} with such types will trigger a dubious_type_in_load_or_store lint. Both are Warn by default. Thanks to @comex for the suggestion!
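
For example (types invented for illustration), a repr(C) struct of plain integers can be transmuted to and from a byte array, so it would not trigger either lint, whereas a type containing a bool or a reference has invalid bit patterns and would:

```rust
#[repr(C)]
struct DmaDescriptor {
    addr: u64,
    len: u32,
    flags: u32,
} // every bit pattern is a valid value: no lint

#[repr(C)]
struct Dubious {
    ready: bool,                 // only 0x00 and 0x01 are valid
    status: Option<&'static u8>, // must be null or a valid reference
}
// core::arch::load::<Dubious> would trigger dubious_type_in_load_or_store,
// and ptr::read_volatile::<Dubious> would trigger dubious_type_in_volatile.
```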

### Lowering

LLVM volatile semantics are still unclear, and may turn out to be weaker than necessary. It is also possible that LLVM volatile requires dereferenceable or otherwise interacts poorly with some of the permitted corner-cases. Therefore, I recommend lowering core::{arch::{load, store}, ptr::{read_volatile, write_volatile}} to LLVM inline assembly instead, which is at least guaranteed to work. This may change in the future.
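
To make the inline-assembly suggestion concrete, here is a sketch for one case (x86_64, 32-bit loads; purely illustrative, since the real lowering would be generated per target by the compiler) of expressing the access in asm! so that LLVM cannot reason about or remove it:

```rust
use core::arch::asm;

/// One 32-bit load expressed as inline assembly, opaque to LLVM.
/// x86_64 only; shown for illustration.
unsafe fn load_u32_x86_64(ptr: *const u32) -> u32 {
    let value: u32;
    asm!(
        "mov {val:e}, dword ptr [{addr}]",
        addr = in(reg) ptr,
        val = out(reg) value,
        options(nostack, preserves_flags),
    );
    value
}
```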
