Skip to content

Commit a83f8e0

Browse files
authored
[libc++][hardening] Check bounds on arithmetic in __bounded_iter (#78876)
Previously, `__bounded_iter` only checked `operator*`. It allowed the pointer to go out of bounds with `operator++`, etc., and relied on `operator*` (which checked `begin <= current < end`) to handle everything. This has several unfortunate consequences: First, pointer arithmetic is UB if it goes out of bounds. So by the time `operator*` checks, it may be too late and the optimizer may have done something bad. Checking both operations is safer. Second, `std::copy` and friends currently bypass bounded iterator checks. I think the only hope we have to fix this is to key on `iter + n` doing a check. See #78771 for further discussion. Note this PR is not sufficient to fix this. It adds the output bounds check, but ends up doing it after the `memmove`, which is too late. Finally, doing these checks is actually *more* optimizable. See #78829, which is fixed by this PR. Keeping the iterator always in bounds means `operator*` can rely on some invariants and only needs to check `current != end`. This aligns better with common iterator patterns, which use `!=` instead of `<`, so it's easier to delete checks with local reasoning. See https://godbolt.org/z/vEWrWEf8h for how this new `__bounded_iter` impacts compiler output. The old `__bounded_iter` injected checks inside the loops for all the `sum()` functions, which not only added a check inside a loop, but also impeded Clang's vectorization. The new `__bounded_iter` allows all the checks to be optimized out and we emit the same code as if it wasn't here. Not everything is ideal however. `add_and_deref` ends up emitting two comparisons now instead of one. This is because a missed optimization in Clang. I've filed #78875 for that. I suspect (with no data) that this PR is still a net performance win because impeding ranged-for loops is particularly egregious. But ideally we'd fix the optimizer and make `add_and_deref` fine too. There's also something funny going on with `std::ranges::find` which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities. Fixes #78829. (CC @danakj)
1 parent e4a5467 commit a83f8e0

File tree

6 files changed

+385
-221
lines changed

6 files changed

+385
-221
lines changed

libcxx/include/__iterator/bounded_iter.h

Lines changed: 46 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -31,13 +31,20 @@ _LIBCPP_BEGIN_NAMESPACE_STD
3131
// Iterator wrapper that carries the valid range it is allowed to access.
3232
//
3333
// This is a simple iterator wrapper for contiguous iterators that points
34-
// within a [begin, end) range and carries these bounds with it. The iterator
35-
// ensures that it is pointing within that [begin, end) range when it is
36-
// dereferenced.
34+
// within a [begin, end] range and carries these bounds with it. The iterator
35+
// ensures that it is pointing within [begin, end) range when it is
36+
// dereferenced. It also ensures that it is never iterated outside of
37+
// [begin, end]. This is important for two reasons:
3738
//
38-
// Arithmetic operations are allowed and the bounds of the resulting iterator
39-
// are not checked. Hence, it is possible to create an iterator pointing outside
40-
// its range, but it is not possible to dereference it.
39+
// 1. It allows `operator*` and `operator++` bounds checks to be `iter != end`.
40+
// This is both less for the optimizer to prove, and aligns with how callers
41+
// typically use iterators.
42+
//
43+
// 2. Advancing an iterator out of bounds is undefined behavior (see the table
44+
// in [input.iterators]). In particular, when the underlying iterator is a
45+
// pointer, it is undefined at the language level (see [expr.add]). If
46+
// bounded iterators exhibited this undefined behavior, we risk compiler
47+
// optimizations deleting non-redundant bounds checks.
4148
template <class _Iterator, class = __enable_if_t< __libcpp_is_contiguous_iterator<_Iterator>::value > >
4249
struct __bounded_iter {
4350
using value_type = typename iterator_traits<_Iterator>::value_type;
@@ -51,8 +58,8 @@ struct __bounded_iter {
5158

5259
// Create a singular iterator.
5360
//
54-
// Such an iterator does not point to any object and is conceptually out of bounds, so it is
55-
// not dereferenceable. Observing operations like comparison and assignment are valid.
61+
// Such an iterator points past the end of an empty span, so it is not dereferenceable.
62+
// Observing operations like comparison and assignment are valid.
5663
_LIBCPP_HIDE_FROM_ABI __bounded_iter() = default;
5764

5865
_LIBCPP_HIDE_FROM_ABI __bounded_iter(__bounded_iter const&) = default;
@@ -70,18 +77,20 @@ struct __bounded_iter {
7077

7178
private:
7279
// Create an iterator wrapping the given iterator, and whose bounds are described
73-
// by the provided [begin, end) range.
80+
// by the provided [begin, end] range.
7481
//
75-
// This constructor does not check whether the resulting iterator is within its bounds.
76-
// However, it does check that the provided [begin, end) range is a valid range (that
77-
// is, begin <= end).
82+
// The constructor does not check whether the resulting iterator is within its bounds. It is a
83+
// responsibility of the container to ensure that the given bounds are valid.
7884
//
7985
// Since it is non-standard for iterators to have this constructor, __bounded_iter must
8086
// be created via `std::__make_bounded_iter`.
8187
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 explicit __bounded_iter(
8288
_Iterator __current, _Iterator __begin, _Iterator __end)
8389
: __current_(__current), __begin_(__begin), __end_(__end) {
84-
_LIBCPP_ASSERT_INTERNAL(__begin <= __end, "__bounded_iter(current, begin, end): [begin, end) is not a valid range");
90+
_LIBCPP_ASSERT_INTERNAL(
91+
__begin <= __current, "__bounded_iter(current, begin, end): current and begin are inconsistent");
92+
_LIBCPP_ASSERT_INTERNAL(
93+
__current <= __end, "__bounded_iter(current, begin, end): current and end are inconsistent");
8594
}
8695

8796
template <class _It>
@@ -90,30 +99,37 @@ struct __bounded_iter {
9099
public:
91100
// Dereference and indexing operations.
92101
//
93-
// These operations check that the iterator is dereferenceable, that is within [begin, end).
102+
// These operations check that the iterator is dereferenceable. Since the class invariant is
103+
// that the iterator is always within `[begin, end]`, we only need to check it's not pointing to
104+
// `end`. This is easier for the optimizer because it aligns with the `iter != container.end()`
105+
// checks that typical callers already use (see
106+
// https://github.com/llvm/llvm-project/issues/78829).
94107
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator*() const _NOEXCEPT {
95108
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
96-
__in_bounds(__current_), "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
109+
__current_ != __end_, "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
97110
return *__current_;
98111
}
99112

100113
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 pointer operator->() const _NOEXCEPT {
101114
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
102-
__in_bounds(__current_), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
115+
__current_ != __end_, "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
103116
return std::__to_address(__current_);
104117
}
105118

106119
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator[](difference_type __n) const _NOEXCEPT {
107120
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
108-
__in_bounds(__current_ + __n), "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
121+
__n >= __begin_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator past the start");
122+
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
123+
__n < __end_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator at or past the end");
109124
return __current_[__n];
110125
}
111126

112127
// Arithmetic operations.
113128
//
114-
// These operations do not check that the resulting iterator is within the bounds, since that
115-
// would make it impossible to create a past-the-end iterator.
129+
// These operations check that the iterator remains within `[begin, end]`.
116130
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator++() _NOEXCEPT {
131+
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
132+
__current_ != __end_, "__bounded_iter::operator++: Attempt to advance an iterator past the end");
117133
++__current_;
118134
return *this;
119135
}
@@ -124,6 +140,8 @@ struct __bounded_iter {
124140
}
125141

126142
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator--() _NOEXCEPT {
143+
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
144+
__current_ != __begin_, "__bounded_iter::operator--: Attempt to rewind an iterator past the start");
127145
--__current_;
128146
return *this;
129147
}
@@ -134,6 +152,10 @@ struct __bounded_iter {
134152
}
135153

136154
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator+=(difference_type __n) _NOEXCEPT {
155+
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
156+
__n >= __begin_ - __current_, "__bounded_iter::operator+=: Attempt to rewind an iterator past the start");
157+
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
158+
__n <= __end_ - __current_, "__bounded_iter::operator+=: Attempt to advance an iterator past the end");
137159
__current_ += __n;
138160
return *this;
139161
}
@@ -151,6 +173,10 @@ struct __bounded_iter {
151173
}
152174

153175
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator-=(difference_type __n) _NOEXCEPT {
176+
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
177+
__n <= __current_ - __begin_, "__bounded_iter::operator-=: Attempt to rewind an iterator past the start");
178+
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
179+
__n >= __current_ - __end_, "__bounded_iter::operator-=: Attempt to advance an iterator past the end");
154180
__current_ -= __n;
155181
return *this;
156182
}
@@ -197,15 +223,10 @@ struct __bounded_iter {
197223
}
198224

199225
private:
200-
// Return whether the given iterator is in the bounds of this __bounded_iter.
201-
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __in_bounds(_Iterator const& __iter) const {
202-
return __iter >= __begin_ && __iter < __end_;
203-
}
204-
205226
template <class>
206227
friend struct pointer_traits;
207228
_Iterator __current_; // current iterator
208-
_Iterator __begin_, __end_; // valid range represented as [begin, end)
229+
_Iterator __begin_, __end_; // valid range represented as [begin, end]
209230
};
210231

211232
template <class _It>
Lines changed: 174 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,174 @@
1+
//===----------------------------------------------------------------------===//
2+
//
3+
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4+
// See https://llvm.org/LICENSE.txt for license information.
5+
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6+
//
7+
//===----------------------------------------------------------------------===//
8+
// UNSUPPORTED: c++03, c++11, c++14, c++17
9+
10+
// Make sure that std::span's iterators check for OOB accesses when the debug mode is enabled.
11+
12+
// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
13+
// UNSUPPORTED: libcpp-hardening-mode=none
14+
15+
#include <span>
16+
17+
#include "check_assertion.h"
18+
19+
struct Foo {
20+
int x;
21+
};
22+
23+
template <typename Iter>
24+
void test_iterator(Iter begin, Iter end, bool reverse) {
25+
std::ptrdiff_t distance = std::distance(begin, end);
26+
27+
// Dereferencing an iterator at the end.
28+
{
29+
TEST_LIBCPP_ASSERT_FAILURE(
30+
*end,
31+
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
32+
: "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
33+
#if _LIBCPP_STD_VER >= 20
34+
// In C++20 mode, std::reverse_iterator implements operator->, but not operator*, with
35+
// std::prev instead of operator--. std::prev ultimately calls operator+
36+
TEST_LIBCPP_ASSERT_FAILURE(
37+
end->x,
38+
reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
39+
: "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
40+
#else
41+
TEST_LIBCPP_ASSERT_FAILURE(
42+
end->x,
43+
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
44+
: "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
45+
#endif
46+
}
47+
48+
// Incrementing an iterator past the end.
49+
{
50+
[[maybe_unused]] const char* msg =
51+
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
52+
: "__bounded_iter::operator++: Attempt to advance an iterator past the end";
53+
auto it = end;
54+
TEST_LIBCPP_ASSERT_FAILURE(it++, msg);
55+
TEST_LIBCPP_ASSERT_FAILURE(++it, msg);
56+
}
57+
58+
// Decrementing an iterator past the start.
59+
{
60+
[[maybe_unused]] const char* msg =
61+
reverse ? "__bounded_iter::operator++: Attempt to advance an iterator past the end"
62+
: "__bounded_iter::operator--: Attempt to rewind an iterator past the start";
63+
auto it = begin;
64+
TEST_LIBCPP_ASSERT_FAILURE(it--, msg);
65+
TEST_LIBCPP_ASSERT_FAILURE(--it, msg);
66+
}
67+
68+
// Advancing past the end with operator+= and operator+.
69+
{
70+
[[maybe_unused]] const char* msg =
71+
reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
72+
: "__bounded_iter::operator+=: Attempt to advance an iterator past the end";
73+
auto it = end;
74+
TEST_LIBCPP_ASSERT_FAILURE(it += 1, msg);
75+
TEST_LIBCPP_ASSERT_FAILURE(end + 1, msg);
76+
it = begin;
77+
TEST_LIBCPP_ASSERT_FAILURE(it += (distance + 1), msg);
78+
TEST_LIBCPP_ASSERT_FAILURE(begin + (distance + 1), msg);
79+
}
80+
81+
// Advancing past the end with operator-= and operator-.
82+
{
83+
[[maybe_unused]] const char* msg =
84+
reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
85+
: "__bounded_iter::operator-=: Attempt to advance an iterator past the end";
86+
auto it = end;
87+
TEST_LIBCPP_ASSERT_FAILURE(it -= (-1), msg);
88+
TEST_LIBCPP_ASSERT_FAILURE(end - (-1), msg);
89+
it = begin;
90+
TEST_LIBCPP_ASSERT_FAILURE(it -= (-distance - 1), msg);
91+
TEST_LIBCPP_ASSERT_FAILURE(begin - (-distance - 1), msg);
92+
}
93+
94+
// Rewinding past the start with operator+= and operator+.
95+
{
96+
[[maybe_unused]] const char* msg =
97+
reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end"
98+
: "__bounded_iter::operator+=: Attempt to rewind an iterator past the start";
99+
auto it = begin;
100+
TEST_LIBCPP_ASSERT_FAILURE(it += (-1), msg);
101+
TEST_LIBCPP_ASSERT_FAILURE(begin + (-1), msg);
102+
it = end;
103+
TEST_LIBCPP_ASSERT_FAILURE(it += (-distance - 1), msg);
104+
TEST_LIBCPP_ASSERT_FAILURE(end + (-distance - 1), msg);
105+
}
106+
107+
// Rewinding past the start with operator-= and operator-.
108+
{
109+
[[maybe_unused]] const char* msg =
110+
reverse ? "__bounded_iter::operator+=: Attempt to advance an iterator past the end"
111+
: "__bounded_iter::operator-=: Attempt to rewind an iterator past the start";
112+
auto it = begin;
113+
TEST_LIBCPP_ASSERT_FAILURE(it -= 1, msg);
114+
TEST_LIBCPP_ASSERT_FAILURE(begin - 1, msg);
115+
it = end;
116+
TEST_LIBCPP_ASSERT_FAILURE(it -= (distance + 1), msg);
117+
TEST_LIBCPP_ASSERT_FAILURE(end - (distance + 1), msg);
118+
}
119+
120+
// Out-of-bounds operator[].
121+
{
122+
[[maybe_unused]] const char* end_msg =
123+
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
124+
: "__bounded_iter::operator[]: Attempt to index an iterator at or past the end";
125+
[[maybe_unused]] const char* past_end_msg =
126+
reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
127+
: "__bounded_iter::operator[]: Attempt to index an iterator at or past the end";
128+
[[maybe_unused]] const char* past_start_msg =
129+
reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end"
130+
: "__bounded_iter::operator[]: Attempt to index an iterator past the start";
131+
TEST_LIBCPP_ASSERT_FAILURE(begin[distance], end_msg);
132+
TEST_LIBCPP_ASSERT_FAILURE(begin[distance + 1], past_end_msg);
133+
TEST_LIBCPP_ASSERT_FAILURE(begin[-1], past_start_msg);
134+
TEST_LIBCPP_ASSERT_FAILURE(begin[-99], past_start_msg);
135+
136+
auto it = begin + 1;
137+
TEST_LIBCPP_ASSERT_FAILURE(it[distance - 1], end_msg);
138+
TEST_LIBCPP_ASSERT_FAILURE(it[distance], past_end_msg);
139+
TEST_LIBCPP_ASSERT_FAILURE(it[-2], past_start_msg);
140+
TEST_LIBCPP_ASSERT_FAILURE(it[-99], past_start_msg);
141+
}
142+
}
143+
144+
int main(int, char**) {
145+
// span<T>::iterator
146+
{
147+
Foo array[] = {{0}, {1}, {2}};
148+
std::span<Foo> const span(array, 3);
149+
test_iterator(span.begin(), span.end(), /*reverse=*/false);
150+
}
151+
152+
// span<T, N>::iterator
153+
{
154+
Foo array[] = {{0}, {1}, {2}};
155+
std::span<Foo, 3> const span(array, 3);
156+
test_iterator(span.begin(), span.end(), /*reverse=*/false);
157+
}
158+
159+
// span<T>::reverse_iterator
160+
{
161+
Foo array[] = {{0}, {1}, {2}};
162+
std::span<Foo> const span(array, 3);
163+
test_iterator(span.rbegin(), span.rend(), /*reverse=*/true);
164+
}
165+
166+
// span<T, N>::reverse_iterator
167+
{
168+
Foo array[] = {{0}, {1}, {2}};
169+
std::span<Foo, 3> const span(array, 3);
170+
test_iterator(span.rbegin(), span.rend(), /*reverse=*/true);
171+
}
172+
173+
return 0;
174+
}

0 commit comments

Comments
 (0)