-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[libc++][hardening] Check bounds on arithmetic in __bounded_iter #78876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-libcxx Author: David Benjamin (davidben) ChangesPreviously, __bounded_iter only checked operator*. It allowed the pointer to go out of bounds with operator++, etc., and relied on operator* (which checked begin <= current < end) to handle everything. This has several unfortunate consequences: First, pointer arithmetic is UB if it goes out of bounds. So by the time operator* checks, it may be too late and the optimizer may have done something bad. Checking both operations is safer. Second, std::copy and friends currently bypass bounded iterator checks. I think the only hope we have to fix this is to key on iter + n doing a check. See #78771 for further discussion. Note this PR is not sufficient to fix this. It adds the output bounds check, but ends up doing it after the memmove, which is too late. Finally, doing these checks is actually more optimizable. See #78829, which is fixed by this PR. Keeping the iterator always in bounds means operator* can rely on some invariants and only needs to check current != end. This aligns better with the ways that callers tend to condition iterator operations. At least for the common case of iterating over the entire See https://godbolt.org/z/vEWrWEf8h for how this new __bounded_iter impacts compiler output. The old __bounded_iter injected checks inside the loops for all the sum() functions, which not only added a check inside a loop, but also impeded Clang's vectorization. The new __bounded_iter allows all the checks to be optimized out and we emit the same code as if it wasn't here. Not everything is ideal however. add_and_deref ends up emitting two comparisons now instead of one. This is because a missed optimization in Clang. I've filed #78875 for that. I suspect (with no data) that this PR is still a net performance win because impeding ranged-for loops is particularly egregious. But ideally we'd fix the optimizer and make add_and_deref fine too. There's also something funny going on with std::ranges::find which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities. Fixes #78829. (CC @danakj) Patch is 24.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78876.diff 3 Files Affected:
diff --git a/libcxx/include/__iterator/bounded_iter.h b/libcxx/include/__iterator/bounded_iter.h
index 2a667648871c4c..3ea20734ac66cb 100644
--- a/libcxx/include/__iterator/bounded_iter.h
+++ b/libcxx/include/__iterator/bounded_iter.h
@@ -31,13 +31,10 @@ _LIBCPP_BEGIN_NAMESPACE_STD
// Iterator wrapper that carries the valid range it is allowed to access.
//
// This is a simple iterator wrapper for contiguous iterators that points
-// within a [begin, end) range and carries these bounds with it. The iterator
-// ensures that it is pointing within that [begin, end) range when it is
-// dereferenced.
-//
-// Arithmetic operations are allowed and the bounds of the resulting iterator
-// are not checked. Hence, it is possible to create an iterator pointing outside
-// its range, but it is not possible to dereference it.
+// within a [begin, end] range and carries these bounds with it. The iterator
+// ensures that it is pointing within [begin, end) range when it is
+// dereferenced. It also ensures that it is never iterated outside of
+// [begin, end].
template <class _Iterator, class = __enable_if_t< __libcpp_is_contiguous_iterator<_Iterator>::value > >
struct __bounded_iter {
using value_type = typename iterator_traits<_Iterator>::value_type;
@@ -70,18 +67,18 @@ struct __bounded_iter {
private:
// Create an iterator wrapping the given iterator, and whose bounds are described
- // by the provided [begin, end) range.
+ // by the provided [begin, end] range.
//
- // This constructor does not check whether the resulting iterator is within its bounds.
- // However, it does check that the provided [begin, end) range is a valid range (that
- // is, begin <= end).
+ // Except in debug builds, the constructor does not check whether the resulting iterator
+ // is within its bounds. The container is assumed to be self-consistent.
//
// Since it is non-standard for iterators to have this constructor, __bounded_iter must
// be created via `std::__make_bounded_iter`.
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 explicit __bounded_iter(
_Iterator __current, _Iterator __begin, _Iterator __end)
: __current_(__current), __begin_(__begin), __end_(__end) {
- _LIBCPP_ASSERT_INTERNAL(__begin <= __end, "__bounded_iter(current, begin, end): [begin, end) is not a valid range");
+ _LIBCPP_ASSERT_INTERNAL(__begin <= __current, "__bounded_iter(current, begin, end): current and begin are inconsistent");
+ _LIBCPP_ASSERT_INTERNAL(__current <= __end, "__bounded_iter(current, begin, end): current and end are inconsistent");
}
template <class _It>
@@ -91,21 +88,25 @@ struct __bounded_iter {
// Dereference and indexing operations.
//
// These operations check that the iterator is dereferenceable, that is within [begin, end).
+ // The iterator is always within [begin, end], so we only need to check for end. This is
+ // easier to optimize out. See https://github.com/llvm/llvm-project/issues/78829
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator*() const _NOEXCEPT {
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
- __in_bounds(__current_), "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
+ __current_ != __end_, "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
return *__current_;
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 pointer operator->() const _NOEXCEPT {
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
- __in_bounds(__current_), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
+ __current_ != __end_, "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
return std::__to_address(__current_);
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator[](difference_type __n) const _NOEXCEPT {
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
- __in_bounds(__current_ + __n), "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
+ __n >= __begin_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator past the start");
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n < __end_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator past the end");
return __current_[__n];
}
@@ -114,6 +115,8 @@ struct __bounded_iter {
// These operations do not check that the resulting iterator is within the bounds, since that
// would make it impossible to create a past-the-end iterator.
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator++() _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __current_ != __end_, "__bounded_iter::operator++: Attempt to advance an iterator past the end");
++__current_;
return *this;
}
@@ -124,6 +127,8 @@ struct __bounded_iter {
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator--() _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __current_ != __begin_, "__bounded_iter::operator--: Attempt to rewind an iterator past the start");
--__current_;
return *this;
}
@@ -134,6 +139,10 @@ struct __bounded_iter {
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator+=(difference_type __n) _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n >= __begin_ - __current_, "__bounded_iter::operator+=: Attempt to rewind an iterator past the start");
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n <= __end_ - __current_, "__bounded_iter::operator+=: Attempt to advance an iterator past the end");
__current_ += __n;
return *this;
}
@@ -151,6 +160,10 @@ struct __bounded_iter {
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator-=(difference_type __n) _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n <= __current_ - __begin_, "__bounded_iter::operator-=: Attempt to rewind an iterator past the start");
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n >= __current_ - __end_, "__bounded_iter::operator-=: Attempt to advance an iterator past the end");
__current_ -= __n;
return *this;
}
@@ -197,15 +210,10 @@ struct __bounded_iter {
}
private:
- // Return whether the given iterator is in the bounds of this __bounded_iter.
- _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __in_bounds(_Iterator const& __iter) const {
- return __iter >= __begin_ && __iter < __end_;
- }
-
template <class>
friend struct pointer_traits;
_Iterator __current_; // current iterator
- _Iterator __begin_, __end_; // valid range represented as [begin, end)
+ _Iterator __begin_, __end_; // valid range represented as [begin, end]
};
template <class _It>
diff --git a/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp b/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
index 360e7a981a0dfe..4a77da56e95e79 100644
--- a/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
+++ b/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
@@ -20,77 +20,149 @@ struct Foo {
int x;
};
+template <typename Iter>
+void test_iterator(Iter begin, Iter end, bool reverse) {
+ ptrdiff_t distance = std::distance(begin, end);
+
+ // Dereferencing an iterator at the end.
+ {
+ TEST_LIBCPP_ASSERT_FAILURE(
+ *end,
+ reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
+#if _LIBCPP_STD_VER >= 20
+ // In C++20 mode, std::reverse_iterator implements operator->, but not operator*, with
+ // std::prev instead of operator--. std::prev ultimately calls operator+
+ TEST_LIBCPP_ASSERT_FAILURE(
+ end->x,
+ reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#else
+ TEST_LIBCPP_ASSERT_FAILURE(
+ end->x,
+ reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#endif
+ }
+
+ // Incrementing an iterator past the end.
+ {
+ const char* msg = reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator++: Attempt to advance an iterator past the end";
+ auto it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(it++, msg);
+ it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(++it, msg);
+ }
+
+ // Decrementing an iterator past the start.
+ {
+ const char* msg = reverse ? "__bounded_iter::operator++: Attempt to advance an iterator past the end"
+ : "__bounded_iter::operator--: Attempt to rewind an iterator past the start";
+ auto it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(it--, msg);
+ it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(--it, msg);
+ }
+
+ // Advancing past the end with operator+= and operator+.
+ {
+ const char* msg = reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator+=: Attempt to advance an iterator past the end";
+ auto it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(it += 1, msg);
+ TEST_LIBCPP_ASSERT_FAILURE(end + 1, msg);
+ it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(it += (distance + 1), msg);
+ TEST_LIBCPP_ASSERT_FAILURE(begin + (distance + 1), msg);
+ }
+
+ // Advancing past the end with operator-= and operator-.
+ {
+ const char* msg = reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator-=: Attempt to advance an iterator past the end";
+ auto it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(it -= (-1), msg);
+ TEST_LIBCPP_ASSERT_FAILURE(end - (-1), msg);
+ it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(it -= (-distance - 1), msg);
+ TEST_LIBCPP_ASSERT_FAILURE(begin - (-distance - 1), msg);
+ }
+
+ // Rewinding past the start with operator+= and operator+.
+ {
+ const char* msg = reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end"
+ : "__bounded_iter::operator+=: Attempt to rewind an iterator past the start";
+ auto it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(it += (-1), msg);
+ TEST_LIBCPP_ASSERT_FAILURE(begin + (-1), msg);
+ it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(it += (-distance - 1), msg);
+ TEST_LIBCPP_ASSERT_FAILURE(end + (-distance - 1), msg);
+ }
+
+ // Rewinding past the start with operator-= and operator-.
+ {
+ const char* msg = reverse ? "__bounded_iter::operator+=: Attempt to advance an iterator past the end"
+ : "__bounded_iter::operator-=: Attempt to rewind an iterator past the start";
+ auto it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(it -= 1, msg);
+ TEST_LIBCPP_ASSERT_FAILURE(begin - 1, msg);
+ it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(it -= (distance + 1), msg);
+ TEST_LIBCPP_ASSERT_FAILURE(end - (distance + 1), msg);
+ }
+
+ // Out-of-bounds operator[].
+ {
+ const char* end_msg = reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator[]: Attempt to index an iterator past the end";
+ const char* past_end_msg =
+ reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator[]: Attempt to index an iterator past the end";
+ const char* past_start_msg =
+ reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end"
+ : "__bounded_iter::operator[]: Attempt to index an iterator past the start";
+ TEST_LIBCPP_ASSERT_FAILURE(begin[distance], end_msg);
+ TEST_LIBCPP_ASSERT_FAILURE(begin[distance + 1], past_end_msg);
+ TEST_LIBCPP_ASSERT_FAILURE(begin[-1], past_start_msg);
+ TEST_LIBCPP_ASSERT_FAILURE(begin[-99], past_start_msg);
+
+ auto it = begin + 1;
+ TEST_LIBCPP_ASSERT_FAILURE(it[distance - 1], end_msg);
+ TEST_LIBCPP_ASSERT_FAILURE(it[distance], past_end_msg);
+ TEST_LIBCPP_ASSERT_FAILURE(it[-2], past_start_msg);
+ TEST_LIBCPP_ASSERT_FAILURE(it[-99], past_start_msg);
+ }
+}
+
int main(int, char**) {
// span<T>::iterator
{
Foo array[] = {{0}, {1}, {2}};
std::span<Foo> const span(array, 3);
- {
- auto it = span.end();
- TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.end();
- TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.begin();
- TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
- }
+ test_iterator(span.begin(), span.end(), /*reverse=*/false);
}
// span<T, N>::iterator
{
Foo array[] = {{0}, {1}, {2}};
std::span<Foo, 3> const span(array, 3);
- {
- auto it = span.end();
- TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.end();
- TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.begin();
- TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
- }
+ test_iterator(span.begin(), span.end(), /*reverse=*/false);
}
// span<T>::reverse_iterator
{
Foo array[] = {{0}, {1}, {2}};
std::span<Foo> const span(array, 3);
- {
- auto it = span.rend();
- TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.rend();
- TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.rbegin();
- TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
+ test_iterator(span.rbegin(), span.rend(), /*reverse=*/true);
}
// span<T, N>::reverse_iterator
{
Foo array[] = {{0}, {1}, {2}};
std::span<Foo, 3> const span(array, 3);
- {
- auto it = span.rend();
- TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.rend();
- TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = span.rbegin();
- TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
+ test_iterator(span.rbegin(), span.rend(), /*reverse=*/true);
}
return 0;
diff --git a/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp b/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp
index 5064319a0aee1b..aa466d0b247cd3 100644
--- a/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp
+++ b/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp
@@ -11,82 +11,141 @@
// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
// UNSUPPORTED: libcpp-hardening-mode=none
+#include <iterator>
#include <string_view>
#include "check_assertion.h"
-int main(int, char**) {
- // string_view::iterator
+template <typename Iter>
+void test_iterator(Iter begin, Iter end, bool reverse) {
+ ptrdiff_t distance = std::distance(begin, end);
+
+ // Dereferencing an iterator at the end.
{
- std::string_view const str("hello world");
- {
- auto it = str.end();
- TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = str.end();
- TEST_LIBCPP_ASSERT_FAILURE(
- it.operator->(), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = str.begin();
- TEST_LIBCPP_ASSERT_FAILURE(it[99], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
- }
+ TEST_LIBCPP_ASSERT_FAILURE(
+ *end,
+ reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
+#if _LIBCPP_STD_VER >= 20
+ // In C++20 mode, std::reverse_iterator implements operator->, but not operator*, with
+ // std::prev instead of operator--. std::prev ultimately calls operator+
+ TEST_LIBCPP_ASSERT_FAILURE(
+ end.operator->(),
+ reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#else
+ TEST_LIBCPP_ASSERT_FAILURE(
+ end.operator->(),
+ reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#endif
}
- // string_view::const_iterator
+ // Incrementing an iterator past the end.
{
- std::string_view const str("hello world");
- {
- auto it = str.cend();
- TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = str.cend();
- TEST_LIBCPP_ASSERT_FAILURE(
- it.operator->(), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = str.cbegin();
- TEST_LIBCPP_ASSERT_FAILURE(it[99], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
- }
+ const char* msg = reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator++: Attempt to advance an iterator past the end";
+ auto it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(it++, msg);
+ it = end;
+ TEST_LIBCPP_ASSERT_FAILURE(++it, msg);
}
- // string_view::reverse_iterator
+ // Decrementing an iterator past the start.
{
- std::string_view const str("hello world");
- {
- auto it = str.rend();
- TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = str.rend();
- TEST_LIBCPP_ASSERT_FAILURE(
- it.operator->(), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
- }
- {
- auto it = str.rbegin();
- TEST_LIBCPP_ASSERT_FAILURE(it[99], "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
- }
+ const char* msg = reverse ? "__bounded_iter::operator++: Attempt to advance an iterator past the end"
+ : "__bounded_iter::operator--: Attempt to rewind an iterator past the start";
+ auto it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(it--, msg);
+ it = begin;
+ TEST_LIBCPP_ASSERT_FAILURE(--it, msg);
}
- // string_view::const_reverse_iterator
+ // Advancing past the end with operator+= and operator+.
+ {
+ const char* msg = reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
+ : "__bounded_iter::operator+=: Attempt to advance an iterator past the end";
+ auto it = end;
+ ...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
See #78882 |
Previously, __bounded_iter only checked operator*. It allowed the pointer to go out of bounds with operator++, etc., and relied on operator* (which checked begin <= current < end) to handle everything. This has several unfortunate consequences: First, pointer arithmetic is UB if it goes out of bounds. So by the time operator* checks, it may be too late and the optimizer may have done something bad. Checking both operations is safer. Second, std::copy and friends currently bypass bounded iterator checks. I think the only hope we have to fix this is to key on iter + n doing a check. See llvm#78771 for further discussion. Note this PR is not sufficient to fix this. It adds the output bounds check, but ends up doing it after the memmove, which is too late. Finally, doing these checks is actually *more* optimizable. See llvm#78829, which is fixed by this PR. Keeping the iterator always in bounds means operator* can rely on some invariants and only needs to check current != end. This aligns better with the ways that callers tend to condition iterator operations. At least for the common case of iterating over the entire See https://godbolt.org/z/vEWrWEf8h for how this new __bounded_iter impacts compiler output. The old __bounded_iter injected checks inside the loops for all the sum() functions, which not only added a check inside a loop, but also impeded Clang's vectorization. The new __bounded_iter allows all the checks to be optimized out and we emit the same code as if it wasn't here. Not everything is ideal however. add_and_deref ends up emitting two comparisons now instead of one. This is because a missed optimization in Clang. I've filed llvm#78875 for that. I suspect (with no data) that this PR is still a net performance win because impeding ranged-for loops is particularly egregious. But ideally we'd fix the optimizer and make add_and_deref fine too. There's also something funny going on with std::ranges::find which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities. Fixes llvm#78829.
132e1ea
to
49d9168
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just spoke with @var-const and this approach is growing on me more and more. I was uneasy about relying on "implicit" optimization hints at first, but the more I look at it the more it seems that these can actually become explicit optimization hints.
So let's say we have
template <random_access_iterator Iterator>
void algorithm(Iterator it, size_t n) {
(void)it + n;
// do something real
}
Without any hardening, we are basically telling the compiler (quite explicitly) that we expect it + n
to be valid, and that it can assume that because otherwise we just invoked UB. With hardening, this is still true except that in addition we now validate that the assumption holds. I find this approach to be nice, generic and explicit, and I like it a lot.
I'd like @var-const to give a thumbs up too before this ships, but this LGTM with a few comments.
libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
Outdated
Show resolved
Hide resolved
// within a [begin, end] range and carries these bounds with it. The iterator | ||
// ensures that it is pointing within [begin, end) range when it is | ||
// dereferenced. It also ensures that it is never iterated outside of | ||
// [begin, end]. | ||
template <class _Iterator, class = __enable_if_t< __libcpp_is_contiguous_iterator<_Iterator>::value > > |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this patch: can we do it for any random access iterator? Is there any reason why we need a contiguous iterator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would work with every random access iterator, in the sense of producing coherent behavior, but there are a couple of subtleties:
-
A general random access iterator (I'm thinking
std::map
) may have slightly different performance properties. In particular, I don't think we'd want to writebegin + n
in<algorithm>
withstd::map
because that'll end up walking the iterator a second time, and I'm not positive the first walk can be easily deleted. And since we don't unwrap the map iterators instd::copy
, it's not necessary to check ahead of time. We can just lean on the wrappedoperator++
check. -
Replacing
iter
withiter2 = bounded(begin, iter, end)
means thatiter2
becomes invalidated wheneverend
gets invalidated, not justiter
. That's fine for contiguous ranges because the end iterator is just another pointer into the allocation. But the more complex containers, irritatingly, have different invalidation rules forend
and general iterators! See Hardening checks should detect dereferencing map.end() and set.end() #80212 (comment) where I considered and then sadly had to reject basically this strategy forstd::map
. -
We may not need to store both
begin
andend
for a general random access iterator. For example, if we ignored the iterator invalidation problem above,std::map
only needs(iter, end)
.begin
is only used to stop an errantoperator--
, and we can already detect that from the tree structure itself. (The only reason we can't detectend
is that libc++ is extra clever and doesn't even haveparent
andright
pointers on the end node at all. So attempting to read them will immediately go OOB.)
So my inclination would be to wait to see how we decide to solve the problem for non-contiguous random access and then make a decision about whether to generalize __bounded_iter
then. My guess would be that we wouldn't end up wanting to reuse it anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random access iterators are required to have O(1) complexity for distance
and other operations like it + n
, so std::map::iterator
is bidirectional, but not random-access. I get your point about maybe not wanting to implement __bounded_iter
the same all the time, but I do think that the properties we use that hold for contiguous iterators also hold for random access iterators. Needs to be confirmed though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I hadn't realized random access iterators were not the std::map
ones! (I don't actually know the STL very well at the level of iterator categories.). Yeah, maybe it would be useful to generalize it? I guess we can see when we get to hardening some random-access, non-contiguous iterator. Since __bounded_iter
isn't public API or anything, we only need to care about what libc++'s own containers do.
Either way, as you say, unrelated to this patch. 😁
libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
Outdated
Show resolved
Hide resolved
…indexing.pass.cpp Co-authored-by: Louis Dionne <[email protected]>
Nit: in the description, there seems to be an unfinished sentence:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I like this direction. I'm sorry it's taken me a long time to get to reviewing this patch, but I wanted to give it the attention it deserves. I haven't examined the Godbolt yet, but I wanted to share some initial thoughts in-between.
- Consider the following (bad) code:
void foo(std::span<int> s) {
{
auto i = s.begin() + get_offset();
if (i >= s.end()) return;
// ...
}
{
auto j = s.begin() + get_offset();
if (j > s.end()) j = s.end();
// ...
}
}
I think this is technically UB, but it's likely to be "benign" UB (I call these hobbit iterators because they go there and back again). The current implementation wouldn't reject this case, but with this patch, it will start trapping if get_offset
returns a value beyond bounds. I think part of the reason __bounded_iter
is currently implemented this way is a concern that projects out in the wild might inadvertently rely on "hobbit" iterators, and rejecting them would cause unnecessary friction on adoption. I don't have any actual data on this -- this might be a largely hypothetical concern. But I want to make sure that if we want to break code like this, we do so consciously.
- In general, while it's very important for this type to be performant, I'm a bit concerned if we start to tie the implementation "too much" to the optimizer. I want to be pragmatic here, so obviously some level of attention to the optimizer is necessary. What I'd like to avoid is relying on code that has subtle implicit interactions with the optimizer. For one, I'm concerned that we might end up being tied to the current implementation of the optimizer (and its quirks) and do things that further down the line become unnecessary or even harmful as the optimizer evolves (we also need to take other compilers into account, not only Clang, even though Clang is the more important one). It's also potentially brittle, and we don't currently have any sort of automated benchmarks to catch performance regressions. Specifically, I'm concerned about two things mentioned in this patch:
- an internal assertion that happened to improve performance by providing additional information to the optimizer. IMO, from a higher-level (logical?) perspective, it never makes sense that having an additional check could improve performance (at best, it could have a negligible negative impact). If we ever want to pass some information to the optimizer, we need to write it in a way that makes it explicit (something similar in spirit to
_LIBCPP_ASSUME
). It might expand to anything, including an assertion, under the hood, but the API must make it clear that the intention is to help the optimizer, not e.g. abort the program if the condition doesn't hold. - the
!=
comparison optimizing better than<=
comparison. I don't know if this is a fundamental property that the former optimizes better or it just happens to be so with the current implementation of the optimizer. In either case, the two seem to be largely interchangeable from the logical perspective. We need to be pragmatic here, so we likely need to rely on this, but I just want to see if we can find some way to express this more explicitly (comments definitely help, but perhaps we could find a way to go beyond that).
I want to spend a little more time on the patch just to make sure I don't miss anything, but so far I don't see any significant issues with this approach.
// within a [begin, end] range and carries these bounds with it. The iterator | ||
// ensures that it is pointing within [begin, end) range when it is | ||
// dereferenced. It also ensures that it is never iterated outside of | ||
// [begin, end]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to expand on this a little here?
- Do we want to mention that iterating outside these bounds is potentially undefined behavior? For pointers, it's stated quite explicitly in
[expr.add#4]
:
Otherwise, if
P
points to an array elementi
of an array objectx
withn
elements [...], the expressionsP + J
andJ + P
[...] point to the (possibly-hypothetical) array elementi+j
ofx
if0 ≤ i+j ≤n
[...] Otherwise, the behavior is undefined.
For iterators, it's a little bit less explicit, but I think the best reference is the table of Cpp17InputIterator
requirements which states that the result of incrementing an iterator must be either dereferenceable or point to the past-the-end element (then addition is defined in terms of repeated increment). I can't immediately find the equivalent requirement on input_iterator
, but I presume it must be somewhere (or if it's not, I'd expect it to be an omission).
(Note: I say "potentially" because if we have int a[32]; __bounded_iter i(a, a + 5);
, IIUC calling i + 20
is not UB since it's still within the bounds of the actual underlying array, but the bounded iterator has no way of knowing that so it would still assert)
- Do we want to add any discussion on why it's desirable to disallow creating invalid iterators, even if they are never dereferenced? I don't feel very strongly because
git blame
can always bring a curious reader to this patch with all the discussion and links, but perhaps a brief rationale would still be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is something like this library-level UB?
int array[32];
std::span<int> span(array);
std::span<int> sub = span.subspan(0, 16);
// is this library UB?
auto it = sub.begin() + 20;
*it;
This is definitely not language-level UB because we're still within the int[32]
, but I think this is library-level UB because as far as sub
is concerned, it
is beyond sub.end()
. If this is not library UB, then there's a real potential concern with the overall notion of using bounded iterators with span
cause it will definitely find cases of benign UB in existing code. I am pretty certain we confirmed this was library UB a long time ago but we should confirm again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to expand on this a little here?
Done.
Do we want to add any discussion on why it's desirable to disallow creating invalid iterators,
Also done.
Is something like this library-level UB?
That is a good question! I think it is indeed library-level UB, per the table that @var-const pointed at, but I didn't scour the spec carefully.
Alas, this means that bounded iterators on std::span
and std::string_view
do indeed introduce crashes that weren't strictly necessary. In fact, @danakj and I just ran into an analogous issue with std::string_view
iterators in Chromium and have to go back and rewrite it! 😁 However, I cannot think of any reasonable way to bounds-check iterators without paying that one, so I think we'll just have to run with it and hope it's not too bad.
Either way, I think this is unrelated to this PR. The existing bounded iterators design already breaks that code. (Which is how we managed to run into it.) This PR does newly break cases where you compute sub.begin() + 20
without dereferencing it, but I think almost all code that relies on interchanging subspan or substr iterators is also going to dereference it, so that delta isn't likely to be meaningful.
@@ -151,6 +162,10 @@ struct __bounded_iter { | |||
} | |||
|
|||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator-=(difference_type __n) _NOEXCEPT { | |||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would there be any value in implementing this in terms of operator+=
? Or would that make the call stack in case of a failed assertion too confusing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that requires negating __n
, which may hit UB in itself. :-(
libcxx/test/libcxx/containers/views/views.span/assert.iterator-indexing.pass.cpp
Outdated
Show resolved
Hide resolved
}; | ||
|
||
template <typename Iter> | ||
void test_iterator(Iter begin, Iter end, bool reverse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The errors produced by a reverse iterator can be a bit confusing for the user. @ldionne Do you think this is worth the trouble to try to fix? I suppose we could implement a separate checked reverse iterator type, but perhaps that's overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think the test is fine as-is, those are implementation specific anyway, so this is good enough (IMO).
Thanks! Rephrased.
I actually disagree that this is "benign" UB. When this function is inlined at a point where the compiler can see the bounds of the allocation, the out-of-bounds pointer arithmetic is very visible to the compiler. The consequence of this analysis is that the compiler learns bounds on the pointers, which would then allow it to delete the bounds-checks we wanted to add. Moreover, by assuming this doesn't happen, we put ourselves in a very precarious position. We want the optimizer to delete redundant bounds checks, and we want to avoid, as you say, tying the implementation too much to one particular optimizer. Those pressures mean we will want optimizers to get increasingly good at bounds analysis. But if we exhibit this UB, we will simultaneously want them to not get too good at it. I don't think that's a good place to be in. Indeed, see #78784. That is the missing observation that would have allowed Clang to optimize the current I would also argue that the additional checks are actually more optimizer-agnostic than the current ones, but I'll discuss that below.
Absolutely. I think should consciously break code like this:
I think those three are strong evidence that this change is worth the risk of impacting hobbit iterators.
Could you elaborate a bit more on this concern? To me, assertions that are more straightforwardly deletable are less tied to the optimizer, because it's more likely to be universal. Whereas assertions that rely on more complex reasoning that our optimizer happen to handle would be more tied to it. This PR improves the situation by this criteria!
Hmm, I'm not quite following this one. Could you elaborate? I suspect we may not be on the same page about why this new version is easier to optimize, so let me continue on and perhaps that'll help?
I would push back a bit on this. I don't think this higher-level intuition is correct. The additional checks move the assertions closer to their preconditions. That is inherently easier for any optimizer to handle.
So, as discussed above, we specifically want to abort the program because otherwise we'll trigger non-benign UB and get ourselves into problems. But also
(The comparison is actually This is a fundamental property, both of how C++ is used and just what it means to optimize a check out. In order to delete a check, be it Moreover, the way C++ iterators work,
In the current
This is not an easy theorem to prove. In order to delete the check, the compiler must know that Now consider the new
This is trivial to optimize. The compiler does not need any complex analysis at all, because we've the assertions exactly match the preconditions that the caller is already expected to use. Now, in more complicated cases, we need the compiler to infer |
Aha, found it! I thought I'd written all the above already before. See also #78771 (comment), which discussed this already, but probably got lost in a random other issue. |
A lot of existing files were not formatted correctly...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to comment only with nits. Trying to be helpful despite a lack of technical expertise in the area. Thank you for doing this hard work!
Co-authored-by: Will Hawkins <[email protected]>
Thank you for writing up the detailed rationale! I'm convinced, and I think it's very valuable to have this decision written for potential future reference.
I think this is a very good way to put it.
Putting this and the part I quoted above, it seems to come down to "Ensuring that bounded iterators are always valid (point to a valid array element or to the one-past-the-end element) is a stronger invariant, and thus can be expected to optimize better and should never optimize worse". This seems like a good intuitive rationale to me (but please correct me if I'm simplifying things too much here).
There was a comment (sorry, I can't immediately find it, might have been a different PR or issue) that mentioned that the following assertion in the constructor of the bounded iterator: _LIBCPP_ASSERT_INTERNAL(__begin <= __end, "__bounded_iter(current, begin, end): [begin, end) is not a valid range"); used to be enabled in the fast mode and lead to better codegen (because it established an invariant that the optimizer could rely on). My concern was prompted by this specific part, but it's more general (and, to be clear, it's hypothetical -- this is a potential situation that I hope we won't find ourselves in, not a comment on the changes in this patch).
What I'm trying to express is that, from the language specification perspective, I think it's fair to say the optimizer doesn't exist. Or put another way, the Standard deals with the abstract machine, and IIRC, the abstract machine doesn't really have a notion of an optimizer, or at least not in any well-defined, structured way. Of course, the abstract machine is a big simplification that doesn't also have a notion of many other vital things -- this is in no way to say that the optimizer isn't important. But I think it provides a useful level of abstraction, and at that level of abstraction it makes sense to talk about the overhead or performance of individual operations, but I think it's fair to say there is no way one operation can indirectly affect another. In that regard, it makes complete sense to me to say "between these two possible ways of writing a certain assertion, this one would have better performance". This analysis might break the abstraction a little bit, but at the end of the day we know that the optimizer is there, so I think that's quite acceptable. Fundamentally, we're still within the model where we analyze the effect (including the performance effect) of standalone independent statements. However, I feel that a statement like "Adding a condition statement here has an indirect effect on the following statement and makes it more efficient" is simply incompatible with this model. This is the source of my concern. At the same time, of course we have to be pragmatic and performance is crucial here. I just feel that if we need to start relying on indirect effects of some statements (not in this patch, but it seems likely that this will happen in the future) -- "indirect" from the perspective of the abstract machine -- we need to find some way to formalize it and make it visible to the reader. When I read an assertion, I presume it literally means "Check the condition and halt the program if it doesn't hold". In reality, it might also provide crucial information to the optimizer, but that's an indirect effect that's invisible at the source code level. Comments can help a lot, but if we can think of a way to make the actual intent visible in the code, that would be great. Once again, this is just something that I think we should be mindful of as we go, not an immediate concern with this patch. |
LGTM; thanks a lot for working on this! (FWIW, this is something I wanted to do for a long time, and I'm very happy to see the detailed analysis in this patch that shows that this approach indeed results in better performance) There is one thing that would be awesome to check, but I don't want to block the patch on this. It would be great to see how well this patch plays with GCC. To be clear, for our purposes Clang is the primary compiler. However, the comparison might be valuable to double-check our reasoning about the "optimizability". If we notice something that optimizes better on Clang but worse on GCC, it might be a GCC issue (which we don't need to worry about), but it might also indicate that we rely too much on the specifics of the Clang optimizer. So I don't think it makes sense to do a very thorough analysis, but a quick "smoke check" might be interesting. There is a way to use our implementation with GCC on Godbolt (I learned this from @philnik777) -- essentially you can use the Once again, I absolutely don't think we should spend a lot of time on this, but it might be valuable if you could take a quick look in case anything stands out. |
On phone so haven't looked carefully, but copy_span is a bit misleading of a baseline. I actually included it to demonstrate a different problem: we're not supposed to optimize everything out! For bounds safety, there should be a check that the destination is at least as large as the source. But the unwrapping mechanism in libc++ ends up bypassing the safety check before the compiler even sees it. This is #78771. With this PR, the rewrap step ends up restoring the check, sort of. It happens after the copy instead of before, which is too late. I was thinking we'd fix that by first doing this, then adding some iterator arithmetic in front to get the check in the right order. But I haven't looked very carefully yet at the codegen just because it's still not doing the right thing anyway. |
@davidben Are we waiting for something specific before we merge this? I think we'd be fine with merging this as-is and making any required improvements as follow-ups, since I don't think the performance characteristics of GCC would realistically change the outcome of this patch. |
@ldionne Oh I was going to ask you all the same thing. :-) Yeah, nothing on my end. As far as I'm concerned, this is all ready to merge! I left a comment above for why |
I just updated it on to the latest |
The CI failure looks unrelated. |
The one CI failure is just an infrastructure issue, the test suite ran fine. I'll merge now -- thanks a lot for the patch! |
…ing (#78929) ~~NB: This PR depends on #78876. Ignore the first commit when reviewing, and don't merge it until #78876 is resolved. When/if #78876 lands, I'll clean this up.~~ This partially restores parity with the old, since removed debug build. We now can re-enable a bunch of the disabled tests. Some things of note: - `bounded_iter`'s converting constructor has never worked. It needs a friend declaration to access the other `bound_iter` instantiation's private fields. - The old debug iterators also checked that callers did not try to compare iterators from different objects. `bounded_iter` does not currently do this, so I've left those disabled. However, I think we probably should add those. See #78771 (comment) - The `std::vector` iterators are bounded up to capacity, not size. This makes for a weaker safety check. This is because the STL promises not to invalidate iterators when appending up to the capacity. Since we cannot retroactively update all the iterators on `push_back()`, I've instead sized it to the capacity. This is not as good, but at least will stop the iterator from going off the end of the buffer. There was also no test for this, so I've added one in the `std` directory. - `std::string` has two ambiguities to deal with. First, I opted not to size it against the capacity. https://eel.is/c++draft/string.require#4 says iterators are invalidated on an non-const operation. Second, whether the iterator can reach the NUL terminator. The previous debug tests and the special-case in https://eel.is/c++draft/string.access#2 suggest no. If either of these causes widespread problems, I figure we can revisit. - `resize_and_overwrite.pass.cpp` assumed `std::string`'s iterator supported `s.begin().base()`, but I see no promise of this in the standard. GCC also doesn't support this. I fixed the test to use `std::to_address`. - `alignof.compile.pass.cpp`'s pointer isn't enough of a real pointer. (It needs to satisfy `NullablePointer`, `LegacyRandomAccessIterator`, and `LegacyContiguousIterator`.) `__bounded_iter` seems to instantiate enough to notice. I've added a few more bits to satisfy it. Fixes #78805
…ing (#78929) ~~NB: This PR depends on #78876. Ignore the first commit when reviewing, and don't merge it until #78876 is resolved. When/if #78876 lands, I'll clean this up.~~ This partially restores parity with the old, since removed debug build. We now can re-enable a bunch of the disabled tests. Some things of note: - `bounded_iter`'s converting constructor has never worked. It needs a friend declaration to access the other `bound_iter` instantiation's private fields. - The old debug iterators also checked that callers did not try to compare iterators from different objects. `bounded_iter` does not currently do this, so I've left those disabled. However, I think we probably should add those. See #78771 (comment) - The `std::vector` iterators are bounded up to capacity, not size. This makes for a weaker safety check. This is because the STL promises not to invalidate iterators when appending up to the capacity. Since we cannot retroactively update all the iterators on `push_back()`, I've instead sized it to the capacity. This is not as good, but at least will stop the iterator from going off the end of the buffer. There was also no test for this, so I've added one in the `std` directory. - `std::string` has two ambiguities to deal with. First, I opted not to size it against the capacity. https://eel.is/c++draft/string.require#4 says iterators are invalidated on an non-const operation. Second, whether the iterator can reach the NUL terminator. The previous debug tests and the special-case in https://eel.is/c++draft/string.access#2 suggest no. If either of these causes widespread problems, I figure we can revisit. - `resize_and_overwrite.pass.cpp` assumed `std::string`'s iterator supported `s.begin().base()`, but I see no promise of this in the standard. GCC also doesn't support this. I fixed the test to use `std::to_address`. - `alignof.compile.pass.cpp`'s pointer isn't enough of a real pointer. (It needs to satisfy `NullablePointer`, `LegacyRandomAccessIterator`, and `LegacyContiguousIterator`.) `__bounded_iter` seems to instantiate enough to notice. I've added a few more bits to satisfy it. Fixes #78805
Previously,
__bounded_iter
only checkedoperator*
. It allowed the pointer to go out of bounds withoperator++
, etc., and relied onoperator*
(which checkedbegin <= current < end
) to handle everything. This has several unfortunate consequences:First, pointer arithmetic is UB if it goes out of bounds. So by the time
operator*
checks, it may be too late and the optimizer may have done something bad. Checking both operations is safer.Second,
std::copy
and friends currently bypass bounded iterator checks. I think the only hope we have to fix this is to key oniter + n
doing a check. See #78771 for further discussion. Note this PR is not sufficient to fix this. It adds the output bounds check, but ends up doing it after thememmove
, which is too late.Finally, doing these checks is actually more optimizable. See #78829, which is fixed by this PR. Keeping the iterator always in bounds means
operator*
can rely on some invariants and only needs to checkcurrent != end
. This aligns better with common iterator patterns, which use!=
instead of<
, so it's easier to delete checks with local reasoning.See https://godbolt.org/z/vEWrWEf8h for how this new
__bounded_iter
impacts compiler output. The old__bounded_iter
injected checks inside the loops for all thesum()
functions, which not only added a check inside a loop, but also impeded Clang's vectorization. The new__bounded_iter
allows all the checks to be optimized out and we emit the same code as if it wasn't here.Not everything is ideal however.
add_and_deref
ends up emitting two comparisons now instead of one. This is because a missed optimization in Clang. I've filed #78875 for that. I suspect (with no data) that this PR is still a net performance win because impeding ranged-for loops is particularly egregious. But ideally we'd fix the optimizer and makeadd_and_deref
fine too.There's also something funny going on with
std::ranges::find
which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities.Fixes #78829.
(CC @danakj)