resources/deliberate-ub.md

### Cases related to concurrency
* crossbeam's `AtomicCell` "fallback path" uses a SeqLock, which [is well-known to not be compatible with the C++ memory model](http://www.hpl.hp.com/techreports/2012/HPL-2012-68.pdf).
  Specifically the problem is [this non-atomic volatile read](https://github.com/crossbeam-rs/crossbeam/blob/5d07fe43540d7f21517a51813acd9332744e90cb/crossbeam-utils/src/atomic/atomic_cell.rs#L980), which can cause data races and hence UB (the first sketch below shows the pattern).
  This would be fine if we either (a) adopted LLVM's handling of memory races (then the problematic read would merely return `undef` instead of being UB due to a data race), or (b) added [bytewise atomic memcpy](https://github.com/rust-lang/rfcs/pull/3301) and used that instead of the non-atomic volatile load.
  This is currently *not* UB in the LLVM IR we generate, due to LLVM's different handling of read-write races and because the code carefully uses `MaybeUninit`.
* crossbeam's `AtomicCell` "fast path" uses the standard library `Atomic*` types to do atomic reads and writes of [*any type* that has the right size](https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-utils/src/atomic/atomic_cell.rs#L928-L932).
  However, doing an `AtomicU32` read (returning a `u32`) on a `(u16, u8)` is unsound because the padding byte can be uninitialized (see the second sketch below).
  (Also, pointer provenance is lost, so `AtomicCell<*const T>` does not behave the way people expect.)
  To fix this we need to be able to perform atomic loads at type `MaybeUninit<u32>`.
  The LLVM IR we generate here contains `noundef`, so this UB is not just a specification artifact.<br>
  Furthermore, `compare_exchange` will compare padding bytes, which leads to UB.
  It is not clear how best to specify a useful `compare_exchange` that can work on padding bytes;
  see the [discussion here](https://github.com/rust-lang/unsafe-code-guidelines/issues/449).
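
To make the fallback-path problem concrete, here is a minimal sketch of a SeqLock reader, using hypothetical names and a simplified layout (this is *not* crossbeam's actual code). The UB is the racing non-atomic load; the bytewise atomic memcpy of RFC 3301 would replace exactly that load:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical, simplified SeqLock: the writer bumps `seq` to an odd value
// before modifying `data` and to the next even value afterwards.
struct SeqLock {
    seq: AtomicUsize,
    data: UnsafeCell<[u8; 8]>, // modified non-atomically by the writer
}

unsafe impl Sync for SeqLock {}

impl SeqLock {
    fn try_read(&self) -> Option<[u8; 8]> {
        let s1 = self.seq.load(Ordering::Acquire);
        if s1 % 2 == 1 {
            return None; // a write is currently in progress
        }
        // The problematic step: a non-atomic volatile load that can race
        // with a concurrent writer. Discarding the value when `seq` has
        // changed does not help -- under the C++/Rust memory model, the
        // racing read is itself already a data race, and hence UB.
        let val = unsafe { std::ptr::read_volatile(self.data.get()) };
        let s2 = self.seq.load(Ordering::Acquire);
        if s1 == s2 { Some(val) } else { None }
    }
}
```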
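And a similar sketch for the fast path, again with hypothetical names (crossbeam's real code dispatches through a macro and also checks alignment): the load is fine for types like `u32`, but for a type with padding it asserts that an uninitialized byte is initialized.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical fast-path load: reinterpret the storage of a 4-byte `T`
/// as an `AtomicU32` (assuming `src` is suitably aligned for that).
unsafe fn atomic_load_4byte<T>(src: *const T) -> T {
    assert_eq!(std::mem::size_of::<T>(), 4);
    let a = &*(src as *const AtomicU32);
    // For T = (u16, u8), this load produces a value of type `u32`, which
    // claims that all four bytes are initialized -- but one byte is padding
    // and may be uninitialized. That claim is the UB; a load returning
    // `MaybeUninit<u32>` would avoid it. `compare_exchange` has the mirror
    // problem: it compares all four bytes, padding included.
    let raw: u32 = a.load(Ordering::SeqCst);
    std::mem::transmute_copy(&raw)
}
```

A `MaybeUninit`-based load would fix the plain read; what the right semantics for a padding-tolerant `compare_exchange` would be remains the open question linked above.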