resources/deliberate-ub.md

### Cases related to concurrency
* crossbeam's `AtomicCell` "fallback path" uses a SeqLock, which [is well-known to not be compatible with the C++ memory model](http://www.hpl.hp.com/techreports/2012/HPL-2012-68.pdf).
  Specifically the problem is [this non-atomic volatile read](https://github.com/crossbeam-rs/crossbeam/blob/5d07fe43540d7f21517a51813acd9332744e90cb/crossbeam-utils/src/atomic/atomic_cell.rs#L980), which can cause data races and hence UB (the first sketch below shows the pattern).
  This would be fine if we either (a) adopted LLVM's handling of memory races (then the problematic read would merely return `undef` instead of being UB due to a data race), or (b) added [bytewise atomic memcpy](https://github.com/rust-lang/rfcs/pull/3301) and used that instead of the non-atomic volatile load.
  This is currently *not* UB in the LLVM IR we generate, due to LLVM's different handling of read-write races and because the code carefully uses `MaybeUninit`.
* crossbeam's `AtomicCell` "fast path" uses the standard library `Atomic*` types to do atomic reads and writes of [*any type* that has the right size](https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-utils/src/atomic/atomic_cell.rs#L928-L932).
  However, doing an `AtomicU32` read (returning a `u32`) on a `(u16, u8)` is unsound because the padding byte can be uninitialized (see the second sketch below).
  (Also, pointer provenance is lost, so `AtomicCell<*const T>` does not behave the way people expect.)
  To fix this we need to be able to perform atomic loads at type `MaybeUninit<u32>`.
  The LLVM IR we generate here contains `noundef`, so this UB is not just a specification artifact.<br>
  Furthermore, `compare_exchange` will compare padding bytes, which leads to UB.
  It is not clear how best to specify a useful `compare_exchange` that can work on padding bytes;
  see the [discussion here](https://github.com/rust-lang/unsafe-code-guidelines/issues/449).
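
To make the fallback-path problem concrete, here is a minimal sketch of a SeqLock reader, using hypothetical names and a simplified layout (this is *not* crossbeam's actual code). The UB is the racing non-atomic load; the bytewise atomic memcpy of RFC 3301 would replace exactly that load:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical, simplified SeqLock: the writer bumps `seq` to an odd value
// before modifying `data` and to the next even value afterwards.
struct SeqLock {
    seq: AtomicUsize,
    data: UnsafeCell<[u8; 8]>, // modified non-atomically by the writer
}

unsafe impl Sync for SeqLock {}

impl SeqLock {
    fn try_read(&self) -> Option<[u8; 8]> {
        let s1 = self.seq.load(Ordering::Acquire);
        if s1 % 2 == 1 {
            return None; // a write is currently in progress
        }
        // The problematic step: a non-atomic volatile load that can race
        // with a concurrent writer. Discarding the value when `seq` has
        // changed does not help -- under the C++/Rust memory model, the
        // racing read is itself already a data race, and hence UB.
        let val = unsafe { std::ptr::read_volatile(self.data.get()) };
        let s2 = self.seq.load(Ordering::Acquire);
        if s1 == s2 { Some(val) } else { None }
    }
}
```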
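And a similar sketch for the fast path, again with hypothetical names (crossbeam's real code dispatches through a macro and also checks alignment): the load is fine for types like `u32`, but for a type with padding it asserts that an uninitialized byte is initialized.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical fast-path load: reinterpret the storage of a 4-byte `T`
/// as an `AtomicU32` (assuming `src` is suitably aligned for that).
unsafe fn atomic_load_4byte<T>(src: *const T) -> T {
    assert_eq!(std::mem::size_of::<T>(), 4);
    let a = &*(src as *const AtomicU32);
    // For T = (u16, u8), this load produces a value of type `u32`, which
    // claims that all four bytes are initialized -- but one byte is padding
    // and may be uninitialized. That claim is the UB; a load returning
    // `MaybeUninit<u32>` would avoid it. `compare_exchange` has the mirror
    // problem: it compares all four bytes, padding included.
    let raw: u32 = a.load(Ordering::SeqCst);
    std::mem::transmute_copy(&raw)
}
```

A `MaybeUninit`-based load would fix the plain read; what the right semantics for a padding-tolerant `compare_exchange` would be remains the open question linked above.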