You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: resources/deliberate-ub.md
+3-2Lines changed: 3 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -16,16 +16,17 @@ We should evaluate whether there truly is some use-case here that is not current
16
16
The LLVM IR we generate here contains `noundef` so this UB is not just a specification artifact.<br>
17
17
Furthermore, `compare_exchange` will compare padding bytes, which leads to UB.
18
18
It is not clear how to best specify a useful `compare_exchange` that can work on padding bytes.
19
-
*`bytes`does[a non-atomic plain load that races](https://github.com/tokio-rs/bytes/blob/dea868a4b0eec28877e9013702c0ae12dbc40c4b/src/bytes.rs#L2508),
19
+
*`bytes`used to do[a non-atomic plain load that races](https://github.com/tokio-rs/bytes/blob/dea868a4b0eec28877e9013702c0ae12dbc40c4b/src/bytes.rs#L2508),
20
20
because relaxed loads are claimed to cost too much performance (also see [this LLVM issue](https://github.com/llvm/llvm-project/issues/37064)).
21
21
(Note that LLVM's handling of data races is not enough here, data races still return garbage data -- so this is UB in the generated LLVM IR. Also see [this thread](https://internals.rust-lang.org/t/unordered-as-a-solution-to-bit-wise-reasoning-for-atomic-accesses/11079) on using "unordered".)
22
22
The best fix is probably to improve LLVM optimizations on relaxed loads, so that they are not considered too slow for this use-case.
23
23
However, it is not clear if all the desired optimizations are even legal for relaxed loads -- there is a significant semantic difference between relaxed loads and non-atomic loads, so seeing *some* performance difference is expected.
24
24
Code that works with concurrency can't just expect not to have to pay a cost for that.
25
+
And anyway current `bytes` does not seem to have any atomics in `ByteMut` any more, so possibly this is resolved.
25
26
*`smol-rs/event-listener` needs an SC fence, but on x86 they prefer a `lock cmpxchg` over an `mfence`.
26
27
To achieve this, they do an [SC `compare_exchange` on a local variable](https://github.com/smol-rs/event-listener/blob/0ea464102e74219aab2932f9eff14418a13268d4/src/notify.rs#L574-L577).
27
28
That is UB (both in Rust and LLVM) since such an operation has no synchronization effect; the compiler could easily see that this variable is never accessed from outside this function and hence optimize it away entirely.
28
-
The best fix is probably an inline assembly block.
29
+
The best fix is probably an [inline assembly block](https://github.com/smol-rs/event-listener/pull/71).
0 commit comments