Description
From the docs:
> Safety
>
> Behavior is undefined if any of the following conditions are violated:
>
> - `src` must be valid for reads of `count * size_of::<T>()` bytes, and must remain valid even when `dst` is written for `count * size_of::<T>()` bytes. (This means if the memory ranges overlap, the two pointers must not be subject to aliasing restrictions relative to each other.)
> - `dst` must be valid for writes of `count * size_of::<T>()` bytes, and must remain valid even when `src` is read for `count * size_of::<T>()` bytes.
> - Both `src` and `dst` must be properly aligned.
I read this first requirement as saying that after writes to `dst`, reading from `src` must still be possible, even in the positions that have been written to… it says “when `dst` is written for `count * size_of::<T>()` bytes”, which should mean “when the whole range of `dst` has [possibly] been written to”, right?
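For contrast, here is a minimal sketch of a pointer pair that, as far as I understand the current aliasing models, does satisfy the requirement under that reading: both pointers come from the same raw pointer, so a write through one should not invalidate reads through the other. The function name is hypothetical and only there for illustration.

```rust
/// Hypothetical helper: write through one raw pointer, then read the same
/// position through a second raw pointer derived from the first.
fn read_after_write_same_origin() -> u8 {
    let mut data = [0_u8; 100];
    // Both pointers come from the same `*mut u8`; no shared reference is
    // created in between, so neither restricts the other (assumption: this
    // is how the current aliasing models treat raw-to-raw casts).
    let dst: *mut u8 = data.as_mut_ptr();
    let src: *const u8 = dst as *const u8;
    unsafe {
        dst.add(42).write(1);
        // Same position as the write above.
        src.add(42).read()
    }
}
```

Calling `read_after_write_same_origin()` returns the byte that was just written; I would expect miri to accept this, unlike the `&[u8]`-derived pointer in `f1` further down.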
Of course, the docs also say:
> Copying takes place as if the bytes were copied from `src` to a temporary array and then copied from the array to `dst`.
Taken literally, this could be interpreted to say, in particular, that each place is only ever written to via `dst` after it has been read via `src`. But I, as a reader, would actually understand this particular sentence to be only about the question of "which bytes are in what places at the end of this" (because I guess otherwise it's also reasonable to assume that a `copy` operation might only 'technically' allow overlap to reduce UB, but still mix up the order of the data/elements in that case).
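To make the "which bytes end up where" reading concrete, here is a small sketch that compares an overlapping `ptr::copy` with an explicit copy through a temporary buffer. The helper names are made up for this example, and both pointers are derived from a single `as_mut_ptr()` call so that the aliasing question above does not arise.

```rust
use std::ptr;

/// Hypothetical helper: overlapping copy inside one buffer via `ptr::copy`.
fn shift_with_ptr_copy(buf: &mut [u8], from: usize, to: usize, count: usize) {
    assert!(from + count <= buf.len() && to + count <= buf.len());
    let base = buf.as_mut_ptr();
    // Source and destination ranges may overlap; `ptr::copy` allows that.
    unsafe { ptr::copy(base.add(from), base.add(to), count) };
}

/// Hypothetical helper: the same copy spelled out with a temporary array.
fn shift_with_temp(buf: &mut [u8], from: usize, to: usize, count: usize) {
    let tmp: Vec<u8> = buf[from..from + count].to_vec();
    buf[to..to + count].copy_from_slice(&tmp);
}

/// Runs both variants on identical input and returns the two results.
fn compare_shifts() -> (Vec<u8>, Vec<u8>) {
    let mut a: Vec<u8> = (0..10).collect();
    let mut b = a.clone();
    // Bytes 0..6 are copied to positions 2..8, so the ranges overlap.
    shift_with_ptr_copy(&mut a, 0, 2, 6);
    shift_with_temp(&mut b, 0, 2, 6);
    (a, b)
}
```

Both variants end with the same bytes in the same places, which is all that the "temporary array" sentence needs to promise under this reading; it says nothing by itself about the order of the individual reads and writes.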
As far as I can (currently) observe in miri, the documented precondition is not checked.
```rust
fn f1() {
    let mut_data: *mut [u8] = &mut [0_u8; 100];
    let borrow: &[u8] = unsafe { &(*mut_data)[..] };
    let const_data: *const [u8] = borrow;
    let (mut_data_ptr, const_data_ptr) = (mut_data as *mut u8, const_data as *const u8);
    // now, writes to `mut_data_ptr` CAN make `const_data_ptr` invalid for reads:
    unsafe {
        mut_data_ptr.add(42).write(1);
        const_data_ptr.add(32).read(); // different position is okay,
        const_data_ptr.add(42).read(); // same position: miri says UB
    }
}

fn f2() {
    let mut_data: *mut [u8] = &mut [0_u8; 100];
    let borrow: &[u8] = unsafe { &(*mut_data)[..] };
    let const_data: *const [u8] = borrow;
    let (mut_data_ptr, const_data_ptr) = (mut_data as *mut u8, const_data as *const u8);
    // but if we call `ptr::copy`, there's no issues?
    unsafe {
        std::ptr::copy(const_data_ptr.add(10), mut_data_ptr, 30);
        std::ptr::copy(const_data_ptr.add(40), mut_data_ptr.add(50), 30);
    }
    // Note that `ptr::copy` documents the following:
    // > `src` must be valid for reads of `count * size_of::<T>()` bytes,
    // > and must remain valid even when `dst` is written
    // > for `count * size_of::<T>()` bytes
}

fn main() {
    f2();
    // f1();
}
```
So, either the precondition is wrong, or miri is wrong, or I'm misinterpreting something, or the behavior isn't fully defined yet and std docs make conservative restrictions for future compat, whilst miri conservatively allows stuff that isn't "actual" UB today.
Perhaps someone here knows more context?
AFAICT, given that the `copy` itself actually accesses all the memory/bytes in question anyways, this can't be a case where `miri` is simply unable to enforce the documented precondition efficiently. [Unlike many cases of documented library-UB for instance.]