|
| 1 | +When a crate explicitly acknowledges that what it does is UB, but prefers keeping that code over UB-free alternatives (or there are no UB-free alternatives), that is always a concerning sign. |
| 2 | +We should evaluate whether there truly is some use-case here that is not currently achievable in well-defined Rust, and work with crate authors on providing a UB-free solution. |
| 3 | + |
| 4 | +## Known cases of deliberate UB |
| 5 | + |
| 6 | +### Cases related to concurrency |
| 7 | + |
| 8 | +* crossbeam's `AtomicCell` implements a SeqLock, which [is well-known to not be compatible with the C++ memory model](http://www.hpl.hp.com/techreports/2012/HPL-2012-68.pdf). |
| 9 | + Specifically, the problem is [this non-atomic volatile read](https://github.com/crossbeam-rs/crossbeam/blob/5d07fe43540d7f21517a51813acd9332744e90cb/crossbeam-utils/src/atomic/atomic_cell.rs#L980), which can cause data races and hence UB. |
| 10 | + This would be fine if we either (a) adopted LLVM's handling of memory races (then the problematic read would merely return `undef` instead of UB due to a data race), or (b) added [bytewise atomic memcpy](https://github.com/rust-lang/rfcs/pull/3301) and used that instead of the non-atomic volatile load. |
| 11 | +* crossbeam's `AtomicCell` also uses the standard library `Atomic*` types to do atomic reads and writes of [*any type* that has the right size](https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-utils/src/atomic/atomic_cell.rs#L928-L932). |
| 12 | + However, doing an `AtomicU32` atomic read on a `(u16, u8)` is unsound because the padding byte can be uninitialized. |
| 13 | + (Also, pointer provenance is lost.) |
| 14 | + To fix this we need to be able to perform atomic loads at type `MaybeUninit<u32>`. |
| 15 | +* Similar to the previous case, the `atomic-memcpy` crate uses the [standard `Atomic*` types to load potential padding or pointer bytes](https://github.com/taiki-e/atomic-memcpy/blob/b7b82bface0f24c5a9171f2d0626871c61302152/src/lib.rs#L323). |
| 16 | + This is a user-space attempt to implement bytewise atomic memcpy, so [adding that as a native operation](https://github.com/rust-lang/rfcs/pull/3301) should fix this. |
| 17 | +* [`bytes`](https://github.com/tokio-rs/bytes/blob/dea868a4b0eec28877e9013702c0ae12dbc40c4b/src/bytes.rs#L2508) does a non-atomic plain load that races, |
| 18 | + because [relaxed loads are considered too costly for performance](https://bugs.llvm.org/show_bug.cgi?id=37716). |
| 19 | + (Note that LLVM's handling of data races is not enough here: racing reads still return garbage data. Also see [this thread](https://internals.rust-lang.org/t/unordered-as-a-solution-to-bit-wise-reasoning-for-atomic-accesses/11079) on using "unordered".) |
| 20 | + The best fix is probably to improve LLVM optimizations on relaxed loads, so that they are not considered too slow for this use-case. |
| 21 | +* `smol-rs/event-listener` needs an SC fence, but on x86 they prefer a `lock cmpxchg` over an `mfence`. |
| 22 | + To achieve this, they do an [SC `compare_exchange` on a local variable](https://github.com/smol-rs/event-listener/blob/0ea464102e74219aab2932f9eff14418a13268d4/src/notify.rs#L574-L577). |
| 23 | + That is UB since such an operation has no synchronization effect; the compiler could easily see that this variable is never accessed from outside this function and hence optimize it away entirely. |
| 24 | + The best fix is probably an inline assembly block. |
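The padding problem in the `(u16, u8)` bullet above can be made concrete with a small, non-atomic sketch. It assumes the usual layout in which `(u16, u8)` occupies 4 bytes, so one byte is padding; the helper names `to_raw` and `from_raw` are hypothetical. The copy goes through `MaybeUninit<u32>`, which is allowed to hold uninitialized bytes, whereas the same copy at type `u32` would not be. The sketch deliberately does not perform an atomic load, since an atomic load at `MaybeUninit<u32>` is exactly the operation that is missing.

```rust
use std::mem::{size_of, transmute_copy, MaybeUninit};

/// Hypothetical helper: copies the bytes of a `(u16, u8)` into a
/// `MaybeUninit<u32>`. The padding byte may be uninitialized, which
/// `MaybeUninit` permits; a plain `u32` would not.
fn to_raw(pair: (u16, u8)) -> MaybeUninit<u32> {
    // Assumption: the tuple is exactly 4 bytes (2 + 1 + 1 byte of padding).
    assert_eq!(size_of::<(u16, u8)>(), size_of::<u32>());
    // SAFETY: the sizes match (checked above) and the destination type
    // accepts any byte contents, including uninitialized bytes.
    unsafe { transmute_copy(&pair) }
}

/// Hypothetical inverse of `to_raw`.
///
/// # Safety
/// `raw` must have been produced by `to_raw`, so that its non-padding
/// bytes hold a valid `(u16, u8)`.
unsafe fn from_raw(raw: MaybeUninit<u32>) -> (u16, u8) {
    // SAFETY: guaranteed by the caller, see above.
    unsafe { transmute_copy(&raw) }
}

fn main() {
    let pair = (0xABCD_u16, 0x12_u8);
    let raw = to_raw(pair);
    // SAFETY: `raw` comes straight from `to_raw`.
    let back = unsafe { from_raw(raw) };
    assert_eq!(back, pair);
}
```

The round trip never inspects the padding byte as an integer, which is what keeps it well-defined.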
| 25 | + |
| 26 | +### Cases related to aliasing |
| 27 | + |
| 28 | +* `yoke` and similar crates relying on "stable deref" properties cause various forms of aliasing trouble (such as [having `Box` that alias with things](https://github.com/unicode-org/icu4x/issues/2095), or [having references in function arguments that get deallocated while the function runs](https://github.com/unicode-org/icu4x/issues/3696)). |
| 29 | + This could be fixed by [`MaybeDangling`](https://github.com/rust-lang/rfcs/pull/3336). |
| 30 | +* The entire `async fn` ecosystem and every hand-implemented self-referential generator or future is unsound since the self-reference aliases the `&mut` reference to the full generator/future. |
| 31 | + This is currently hackfixed by making `Unpin` meaningful for UB; a proper solution would be to add something like [`UnsafeAliased`](https://github.com/rust-lang/rfcs/pull/3467). |
| 32 | + |
| 33 | +## Former cases of deliberate UB that have at least a work-in-progress solution to them |
| 34 | + |
| 35 | +* Various `offset_of` implementations caused UB by using `mem::uninitialized()`, or they used `&(*base).field` or `addr_of!((*base).field)` to project a dummy pointer to the field, which is UB due to out-of-bounds pointer arithmetic. |
| 36 | + The `memoffset` crate has a sound implementation, but it relies on stack allocations that the compiler must optimize away. |
| 37 | + This will be fixed properly by the native [`offset_of!` macro](https://github.com/rust-lang/rfcs/pull/3308), which is [currently in nightly](https://github.com/rust-lang/rust/issues/106655). |
| 38 | +* It used to be common to unwind out of `extern "C"` functions, which is UB; see [this discussion](https://internals.rust-lang.org/t/unwinding-through-ffi-after-rust-1-33/9521). |
| 39 | + This is fixed by `extern "C-unwind"`, which has been stable since Rust 1.71. |
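The two replacements in this list can be sketched together. The example assumes a toolchain where `std::mem::offset_of!` is available (Rust 1.77 or later on stable, to the best of my knowledge) and the default `panic = "unwind"` strategy. The `Header` type and all function names are hypothetical. `len_offset_via_alloc` follows the allocation-based approach described for `memoffset`, without claiming to match that crate's actual code.

```rust
use std::mem::{offset_of, MaybeUninit};
use std::panic::catch_unwind;
use std::ptr::addr_of;

// Hypothetical type used only for illustration.
#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

/// Allocation-based offset computation: the projection stays in bounds
/// because `base` points to a real (uninitialized) stack slot.
fn len_offset_via_alloc() -> usize {
    let slot = MaybeUninit::<Header>::uninit();
    let base = slot.as_ptr();
    // SAFETY: `base` points into a live allocation large enough for
    // `Header`; `addr_of!` creates no reference and reads no memory.
    let field = unsafe { addr_of!((*base).len) };
    field as usize - base as usize
}

/// Native macro: no allocation and no pointer arithmetic.
fn len_offset_native() -> usize {
    offset_of!(Header, len)
}

/// With the "C-unwind" ABI, a panic may unwind out of this function.
extern "C-unwind" fn may_unwind(fail: bool) -> i32 {
    if fail {
        panic!("unwinding across the ABI boundary");
    }
    7
}

fn main() {
    assert_eq!(len_offset_via_alloc(), len_offset_native());
    assert_eq!(offset_of!(Header, tag), 0);

    // The panic propagates through the "C-unwind" frame and is caught here.
    let caught = catch_unwind(|| may_unwind(true));
    assert!(caught.is_err());
    assert_eq!(may_unwind(false), 7);
}
```

Both offset computations are expected to agree; the native macro simply removes the need for the dummy allocation.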