Description
`core::arch::{load, store}`
This proposes new `load` and `store` functions (in `core::arch`) for raw hardware loads and stores, and a concept of an always-valid type that can safely be cast to a byte array. It also defines volatile accesses in terms of these functions.
The functions proposed here have the same semantics as raw machine load and store instructions. The compiler is not permitted to assume that the values loaded or stored are initialized, or even that they point to valid memory. However, it is permitted to assume that `load` and `store` do not violate Rust's mutability rules.
In particular, it is valid to use these functions to manipulate memory that is being concurrently accessed or modified by any means whatsoever. Therefore, they can be used to access memory that is shared with untrusted code. For example, a kernel could use them to access userspace memory, and a user-mode server could use them to access memory shared with a less-privileged user-mode process. It is also safe to use these functions to manipulate memory that is being concurrently accessed via DMA, or that corresponds to a memory-mapped hardware register.
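As a hedged illustration (the proposed functions do not exist yet, so this is a sketch against the signatures given below), a kernel might read a word of userspace memory like this:

```rust
// Sketch against the proposed API: reading a word that a less-privileged
// process may be modifying concurrently. The race means the value may be
// arbitrary garbage, but performing the read is not undefined behavior.
unsafe fn read_user_word(user_ptr: *const usize) -> usize {
    core::arch::load(user_ptr) // exactly one machine load from user_ptr
}
```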
The core guarantee that makes `load` and `store` useful is this: a call to `load` or `store` is guaranteed to result in exactly one non-tearing, non-interlocked load from or store to the exact address passed to the function, no matter what that address happens to be. To ensure this, `load` and `store` are considered partially opaque to the optimizer: it must treat them as calls to functions that may or may not dereference their arguments. It is even possible that the operation triggers a hardware fault that some other code catches and recovers from. Hence, the compiler can never prove that a given call to `core::arch::load` or `core::arch::store` will have undefined behavior. In other respects, a call to `load` or `store` does not disable any optimizations that a call to an unknown function with the same arguments would not also disable. In short: garbage in, garbage out.
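To make the opacity concrete, here is a sketch (again assuming the proposed API) of a pattern the optimizer must not collapse:

```rust
// Sketch: polling a memory-mapped hardware register. With ordinary reads
// the compiler could fold the two accesses into one; with the proposed
// `load`, each call must compile to its own machine load, so `a` and `b`
// may legitimately differ.
unsafe fn sample_twice(reg: *const u32) -> (u32, u32) {
    let a = core::arch::load(reg);
    let b = core::arch::load(reg);
    (a, b)
}
```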
The actual functions are as follows:
```rust
unsafe fn load<T>(ptr: *const T) -> T;
unsafe fn store<T>(ptr: *mut T, arg: T);
```
Each performs a single memory access (of size `size_of::<T>()`) on `ptr`. The compiler must compile each of these calls into exactly one machine instruction; if this is not possible, it is a compile-time error. The set of types `T` for which a compiler can successfully generate code for these calls depends on the target architecture. Using a `T` that cannot safely be transmuted to or from a byte array is not forbidden, but it is often erroneous and thus triggers a lint (see below). Provided that `ptr` is properly aligned, these functions are guaranteed not to tear. If `ptr` is not properly aligned, the results are architecture-dependent.
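For example (a sketch; which types compile is target-specific, and the 16-byte case below assumes a hypothetical target without a single 16-byte load instruction):

```rust
// Sketch: the `T`s accepted by the proposed functions depend on the target.
unsafe fn fits_in_one_instruction(p: *const u64) -> u64 {
    core::arch::load(p) // a single 8-byte load on typical 64-bit targets
}

// On a target with no single 16-byte load instruction, this call would be
// rejected at compile time under the proposal:
// unsafe fn too_wide(p: *const u128) -> u128 { core::arch::load(p) }
```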
The optimizer is not permitted to assume that `ptr` is dereferenceable or that it is properly aligned. This allows these functions to be used by in-process debuggers, crash dumpers, and other applications that may need to access memory at addresses obtained from an external source, such as a debug console or `/proc/self/maps`. If `load` is used to violate the aliasing rules (by accessing memory the compiler assumes cannot be accessed), the value returned may be non-deterministic and may contain sensitive data. If `store` is used to overwrite memory the compiler assumes will not be modified, subsequent execution (after the call to `store` returns) has undefined behavior.
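A sketch of the debugger use case (assuming the proposed API; the recovery mechanism is elided):

```rust
// Sketch: probing an address obtained externally, e.g. parsed from
// /proc/self/maps. The compiler may not assume `addr` is dereferenceable;
// if the page is unmapped, the load faults at the hardware level, and a
// real crash dumper would catch the resulting signal and recover.
unsafe fn probe_byte(addr: usize) -> u8 {
    core::arch::load(addr as *const u8)
}
```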
The semantics of volatile
A call to `ptr::read_volatile` desugars to one or more calls to `load`, and a call to `ptr::write_volatile` desugars to one or more calls to `store`. The compiler is required to minimize tearing to the extent possible, provided that doing so does not require the use of interlocked or otherwise synchronized instructions. A new `const fn core::arch::volatile_non_tearing::<T>() -> bool` returns `true` if `T` is such that tearing cannot occur for naturally aligned accesses. It may still occur for unaligned accesses (see below).
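One possible desugaring, sketched for a 64-bit target (the exact split is left to the compiler; `volatile_non_tearing` is the proposed predicate above):

```rust
// Sketch: how `ptr::read_volatile::<[u32; 4]>` might desugar on a target
// with no single 16-byte load: two 8-byte loads, so tearing is possible
// only at the 8-byte boundary.
unsafe fn desugared_read_volatile(ptr: *const [u32; 4]) -> [u32; 4] {
    let p = ptr.cast::<u64>();
    let lo = core::arch::load(p);        // one machine load
    let hi = core::arch::load(p.add(1)); // a second machine load
    core::mem::transmute([lo, hi])       // reassemble (layout-naive sketch)
}

// Callers can branch on the proposed predicate at compile time:
const U64_IS_TEAR_FREE: bool = core::arch::volatile_non_tearing::<u64>();
```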
Unaligned volatile access
The compiler is not allowed to assume that the arguments of `core::{ptr::{read_volatile, write_volatile}, arch::{load, store}}` are aligned. However, it is also not required to generate code to handle unaligned access if doing so would cause a performance penalty for the aligned case. In particular, whether the no-tearing guarantee applies to unaligned access is architecture-dependent. On some architectures, it is even possible for unaligned access to cause a hardware trap.
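A defensive caller can sidestep the architecture-dependence by checking alignment first (sketch, proposed API):

```rust
// Sketch: only perform the access when `ptr` is naturally aligned, so the
// no-tearing guarantee applies; otherwise report failure to the caller.
unsafe fn load_u32_if_aligned(ptr: *const u32) -> Option<u32> {
    if ptr as usize % core::mem::align_of::<u32>() == 0 {
        Some(core::arch::load(ptr))
    } else {
        None // unaligned: tearing or even a hardware trap is possible
    }
}
```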
New lints
Use of `core::ptr::{read_volatile, write_volatile}` with a type that cannot be safely transmuted to and from a byte slice will trigger a `dubious_type_in_volatile` lint. Use of `core::arch::{load, store}` with such types will trigger a `dubious_type_in_load_or_store` lint. Both are `Warn` by default. Thanks to @comex for the suggestion!
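A sketch of what the new lint would flag (hypothetical code, proposed API):

```rust
// Sketch: every 4-byte pattern is a valid `u32`, so it round-trips through
// raw bytes; `bool` has invalid bit patterns (only 0 and 1 are allowed),
// so loading one from arbitrary memory is dubious.
unsafe fn lint_examples(ok: *const u32, dubious: *const bool) {
    let _ = core::arch::load(ok);      // no warning
    let _ = core::arch::load(dubious); // warns: dubious_type_in_load_or_store
}
```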
Lowering
LLVM volatile semantics are still unclear, and may turn out to be weaker than necessary. It is also possible that LLVM volatile requires `dereferenceable` or otherwise interacts poorly with some of the permitted corner cases. Therefore, I recommend lowering `core::{arch::{load, store}, ptr::{read_volatile, write_volatile}}` to LLVM inline assembly instead, which is at least guaranteed to work. This may change in the future.
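For a flavor of what such a lowering could look like, here is a hand-written x86-64 analogue of `load::<u32>` using stable inline assembly (illustrative only; the real lowering would be generated by the compiler):

```rust
use core::arch::asm;

// Sketch: a single 4-byte load expressed as inline assembly. Because the
// asm block is opaque and not marked `pure` or `readonly`, the compiler
// can neither elide nor duplicate it, matching the proposed semantics.
#[cfg(target_arch = "x86_64")]
unsafe fn load_u32_via_asm(ptr: *const u32) -> u32 {
    let out: u32;
    asm!(
        "mov {out:e}, dword ptr [{ptr}]",
        ptr = in(reg) ptr,
        out = out(reg) out,
        options(nostack, preserves_flags)
    );
    out
}
```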