You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[libc++][hardening] Check bounds on arithmetic in __bounded_iter (#78876)
Previously, `__bounded_iter` only checked `operator*`. It allowed the
pointer to go out of bounds with `operator++`, etc., and relied on
`operator*` (which checked `begin <= current < end`) to handle
everything. This has several unfortunate consequences:
First, pointer arithmetic is UB if it goes out of bounds. So by the time
`operator*` checks, it may be too late and the optimizer may have done
something bad. Checking both operations is safer.
Second, `std::copy` and friends currently bypass bounded iterator
checks. I think the only hope we have to fix this is to key on `iter +
n` doing a check. See #78771 for further discussion. Note this PR is not
sufficient to fix this. It adds the output bounds check, but ends up
doing it after the `memmove`, which is too late.
Finally, doing these checks is actually *more* optimizable. See #78829,
which is fixed by this PR. Keeping the iterator always in bounds means
`operator*` can rely on some invariants and only needs to check `current
!= end`. This aligns better with common iterator patterns, which use
`!=` instead of `<`, so it's easier to delete checks with local
reasoning.
See https://godbolt.org/z/vEWrWEf8h for how this new `__bounded_iter`
impacts compiler output. The old `__bounded_iter` injected checks inside
the loops for all the `sum()` functions, which not only added a check
inside a loop, but also impeded Clang's vectorization. The new
`__bounded_iter` allows all the checks to be optimized out and we emit
the same code as if it wasn't here.
Not everything is ideal however. `add_and_deref` ends up emitting two
comparisons now instead of one. This is because a missed optimization in
Clang. I've filed #78875 for that. I suspect (with no data) that this PR
is still a net performance win because impeding ranged-for loops is
particularly egregious. But ideally we'd fix the optimizer and make
`add_and_deref` fine too.
There's also something funny going on with `std::ranges::find` which I
have not yet figured out yet, but I suspect there are some further
missed optimization opportunities.
Fixes#78829.
(CC @danakj)
0 commit comments