Skip to content

[libc++][hardening] Check bounds on arithmetic in __bounded_iter #78876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 12, 2024

Conversation

davidben
Copy link
Contributor

@davidben davidben commented Jan 21, 2024

Previously, __bounded_iter only checked operator*. It allowed the pointer to go out of bounds with operator++, etc., and relied on operator* (which checked begin <= current < end) to handle everything. This has several unfortunate consequences:

First, pointer arithmetic is UB if it goes out of bounds. So by the time operator* checks, it may be too late and the optimizer may have done something bad. Checking both operations is safer.

Second, std::copy and friends currently bypass bounded iterator checks. I think the only hope we have to fix this is to key on iter + n doing a check. See #78771 for further discussion. Note this PR is not sufficient to fix this. It adds the output bounds check, but ends up doing it after the memmove, which is too late.

Finally, doing these checks is actually more optimizable. See #78829, which is fixed by this PR. Keeping the iterator always in bounds means operator* can rely on some invariants and only needs to check current != end. This aligns better with common iterator patterns, which use != instead of <, so it's easier to delete checks with local reasoning.

See https://godbolt.org/z/vEWrWEf8h for how this new __bounded_iter impacts compiler output. The old __bounded_iter injected checks inside the loops for all the sum() functions, which not only added a check inside a loop, but also impeded Clang's vectorization. The new __bounded_iter allows all the checks to be optimized out and we emit the same code as if it wasn't here.

Not everything is ideal however. add_and_deref ends up emitting two comparisons now instead of one. This is because a missed optimization in Clang. I've filed #78875 for that. I suspect (with no data) that this PR is still a net performance win because impeding ranged-for loops is particularly egregious. But ideally we'd fix the optimizer and make add_and_deref fine too.

There's also something funny going on with std::ranges::find which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities.

Fixes #78829.

(CC @danakj)

@davidben davidben requested a review from a team as a code owner January 21, 2024 04:35
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jan 21, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 21, 2024

@llvm/pr-subscribers-libcxx

Author: David Benjamin (davidben)

Changes

Previously, __bounded_iter only checked operator*. It allowed the pointer to go out of bounds with operator++, etc., and relied on operator* (which checked begin <= current < end) to handle everything. This has several unfortunate consequences:

First, pointer arithmetic is UB if it goes out of bounds. So by the time operator* checks, it may be too late and the optimizer may have done something bad. Checking both operations is safer.

Second, std::copy and friends currently bypass bounded iterator checks. I think the only hope we have to fix this is to key on iter + n doing a check. See #78771 for further discussion. Note this PR is not sufficient to fix this. It adds the output bounds check, but ends up doing it after the memmove, which is too late.

Finally, doing these checks is actually more optimizable. See #78829, which is fixed by this PR. Keeping the iterator always in bounds means operator* can rely on some invariants and only needs to check current != end. This aligns better with the ways that callers tend to condition iterator operations. At least for the common case of iterating over the entire

See https://godbolt.org/z/vEWrWEf8h for how this new __bounded_iter impacts compiler output. The old __bounded_iter injected checks inside the loops for all the sum() functions, which not only added a check inside a loop, but also impeded Clang's vectorization. The new __bounded_iter allows all the checks to be optimized out and we emit the same code as if it wasn't here.

Not everything is ideal however. add_and_deref ends up emitting two comparisons now instead of one. This is because a missed optimization in Clang. I've filed #78875 for that. I suspect (with no data) that this PR is still a net performance win because impeding ranged-for loops is particularly egregious. But ideally we'd fix the optimizer and make add_and_deref fine too.

There's also something funny going on with std::ranges::find which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities.

Fixes #78829.

(CC @danakj)


Patch is 24.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78876.diff

3 Files Affected:

  • (modified) libcxx/include/__iterator/bounded_iter.h (+29-21)
  • (modified) libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp (+120-48)
  • (modified) libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp (+120-61)
diff --git a/libcxx/include/__iterator/bounded_iter.h b/libcxx/include/__iterator/bounded_iter.h
index 2a667648871c4c..3ea20734ac66cb 100644
--- a/libcxx/include/__iterator/bounded_iter.h
+++ b/libcxx/include/__iterator/bounded_iter.h
@@ -31,13 +31,10 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 // Iterator wrapper that carries the valid range it is allowed to access.
 //
 // This is a simple iterator wrapper for contiguous iterators that points
-// within a [begin, end) range and carries these bounds with it. The iterator
-// ensures that it is pointing within that [begin, end) range when it is
-// dereferenced.
-//
-// Arithmetic operations are allowed and the bounds of the resulting iterator
-// are not checked. Hence, it is possible to create an iterator pointing outside
-// its range, but it is not possible to dereference it.
+// within a [begin, end] range and carries these bounds with it. The iterator
+// ensures that it is pointing within [begin, end) range when it is
+// dereferenced. It also ensures that it is never iterated outside of
+// [begin, end].
 template <class _Iterator, class = __enable_if_t< __libcpp_is_contiguous_iterator<_Iterator>::value > >
 struct __bounded_iter {
   using value_type        = typename iterator_traits<_Iterator>::value_type;
@@ -70,18 +67,18 @@ struct __bounded_iter {
 
 private:
   // Create an iterator wrapping the given iterator, and whose bounds are described
-  // by the provided [begin, end) range.
+  // by the provided [begin, end] range.
   //
-  // This constructor does not check whether the resulting iterator is within its bounds.
-  // However, it does check that the provided [begin, end) range is a valid range (that
-  // is, begin <= end).
+  // Except in debug builds, the constructor does not check whether the resulting iterator
+  // is within its bounds. The container is assumed to be self-consistent.
   //
   // Since it is non-standard for iterators to have this constructor, __bounded_iter must
   // be created via `std::__make_bounded_iter`.
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 explicit __bounded_iter(
       _Iterator __current, _Iterator __begin, _Iterator __end)
       : __current_(__current), __begin_(__begin), __end_(__end) {
-    _LIBCPP_ASSERT_INTERNAL(__begin <= __end, "__bounded_iter(current, begin, end): [begin, end) is not a valid range");
+    _LIBCPP_ASSERT_INTERNAL(__begin <= __current, "__bounded_iter(current, begin, end): current and begin are inconsistent");
+    _LIBCPP_ASSERT_INTERNAL(__current <= __end, "__bounded_iter(current, begin, end): current and end are inconsistent");
   }
 
   template <class _It>
@@ -91,21 +88,25 @@ struct __bounded_iter {
   // Dereference and indexing operations.
   //
   // These operations check that the iterator is dereferenceable, that is within [begin, end).
+  // The iterator is always within [begin, end], so we only need to check for end. This is
+  // easier to optimize out. See https://github.com/llvm/llvm-project/issues/78829
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator*() const _NOEXCEPT {
     _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __in_bounds(__current_), "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
+        __current_ != __end_, "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
     return *__current_;
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 pointer operator->() const _NOEXCEPT {
     _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __in_bounds(__current_), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
+        __current_ != __end_, "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
     return std::__to_address(__current_);
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator[](difference_type __n) const _NOEXCEPT {
     _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __in_bounds(__current_ + __n), "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
+        __n >= __begin_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator past the start");
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __n < __end_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator past the end");
     return __current_[__n];
   }
 
@@ -114,6 +115,8 @@ struct __bounded_iter {
   // These operations do not check that the resulting iterator is within the bounds, since that
   // would make it impossible to create a past-the-end iterator.
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator++() _NOEXCEPT {
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __current_ != __end_, "__bounded_iter::operator++: Attempt to advance an iterator past the end");
     ++__current_;
     return *this;
   }
@@ -124,6 +127,8 @@ struct __bounded_iter {
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator--() _NOEXCEPT {
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __current_ != __begin_, "__bounded_iter::operator--: Attempt to rewind an iterator past the start");
     --__current_;
     return *this;
   }
@@ -134,6 +139,10 @@ struct __bounded_iter {
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator+=(difference_type __n) _NOEXCEPT {
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __n >= __begin_ - __current_, "__bounded_iter::operator+=: Attempt to rewind an iterator past the start");
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __n <= __end_ - __current_, "__bounded_iter::operator+=: Attempt to advance an iterator past the end");
     __current_ += __n;
     return *this;
   }
@@ -151,6 +160,10 @@ struct __bounded_iter {
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator-=(difference_type __n) _NOEXCEPT {
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __n <= __current_ - __begin_, "__bounded_iter::operator-=: Attempt to rewind an iterator past the start");
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __n >= __current_ - __end_, "__bounded_iter::operator-=: Attempt to advance an iterator past the end");
     __current_ -= __n;
     return *this;
   }
@@ -197,15 +210,10 @@ struct __bounded_iter {
   }
 
 private:
-  // Return whether the given iterator is in the bounds of this __bounded_iter.
-  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __in_bounds(_Iterator const& __iter) const {
-    return __iter >= __begin_ && __iter < __end_;
-  }
-
   template <class>
   friend struct pointer_traits;
   _Iterator __current_;       // current iterator
-  _Iterator __begin_, __end_; // valid range represented as [begin, end)
+  _Iterator __begin_, __end_; // valid range represented as [begin, end]
 };
 
 template <class _It>
diff --git a/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp b/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
index 360e7a981a0dfe..4a77da56e95e79 100644
--- a/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
+++ b/libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
@@ -20,77 +20,149 @@ struct Foo {
     int x;
 };
 
+template <typename Iter>
+void test_iterator(Iter begin, Iter end, bool reverse) {
+  ptrdiff_t distance = std::distance(begin, end);
+
+  // Dereferencing an iterator at the end.
+  {
+    TEST_LIBCPP_ASSERT_FAILURE(
+        *end,
+        reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+                : "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
+#if _LIBCPP_STD_VER >= 20
+    // In C++20 mode, std::reverse_iterator implements operator->, but not operator*, with
+    // std::prev instead of operator--. std::prev ultimately calls operator+
+    TEST_LIBCPP_ASSERT_FAILURE(
+        end->x,
+        reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
+                : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#else
+    TEST_LIBCPP_ASSERT_FAILURE(
+        end->x,
+        reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+                : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#endif
+  }
+
+  // Incrementing an iterator past the end.
+  {
+    const char* msg = reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+                              : "__bounded_iter::operator++: Attempt to advance an iterator past the end";
+    auto it         = end;
+    TEST_LIBCPP_ASSERT_FAILURE(it++, msg);
+    it = end;
+    TEST_LIBCPP_ASSERT_FAILURE(++it, msg);
+  }
+
+  // Decrementing an iterator past the start.
+  {
+    const char* msg = reverse ? "__bounded_iter::operator++: Attempt to advance an iterator past the end"
+                              : "__bounded_iter::operator--: Attempt to rewind an iterator past the start";
+    auto it         = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(it--, msg);
+    it = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(--it, msg);
+  }
+
+  // Advancing past the end with operator+= and operator+.
+  {
+    const char* msg = reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
+                              : "__bounded_iter::operator+=: Attempt to advance an iterator past the end";
+    auto it         = end;
+    TEST_LIBCPP_ASSERT_FAILURE(it += 1, msg);
+    TEST_LIBCPP_ASSERT_FAILURE(end + 1, msg);
+    it = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(it += (distance + 1), msg);
+    TEST_LIBCPP_ASSERT_FAILURE(begin + (distance + 1), msg);
+  }
+
+  // Advancing past the end with operator-= and operator-.
+  {
+    const char* msg = reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
+                              : "__bounded_iter::operator-=: Attempt to advance an iterator past the end";
+    auto it         = end;
+    TEST_LIBCPP_ASSERT_FAILURE(it -= (-1), msg);
+    TEST_LIBCPP_ASSERT_FAILURE(end - (-1), msg);
+    it = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(it -= (-distance - 1), msg);
+    TEST_LIBCPP_ASSERT_FAILURE(begin - (-distance - 1), msg);
+  }
+
+  // Rewinding past the start with operator+= and operator+.
+  {
+    const char* msg = reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end"
+                              : "__bounded_iter::operator+=: Attempt to rewind an iterator past the start";
+    auto it         = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(it += (-1), msg);
+    TEST_LIBCPP_ASSERT_FAILURE(begin + (-1), msg);
+    it = end;
+    TEST_LIBCPP_ASSERT_FAILURE(it += (-distance - 1), msg);
+    TEST_LIBCPP_ASSERT_FAILURE(end + (-distance - 1), msg);
+  }
+
+  // Rewinding past the start with operator-= and operator-.
+  {
+    const char* msg = reverse ? "__bounded_iter::operator+=: Attempt to advance an iterator past the end"
+                              : "__bounded_iter::operator-=: Attempt to rewind an iterator past the start";
+    auto it         = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(it -= 1, msg);
+    TEST_LIBCPP_ASSERT_FAILURE(begin - 1, msg);
+    it = end;
+    TEST_LIBCPP_ASSERT_FAILURE(it -= (distance + 1), msg);
+    TEST_LIBCPP_ASSERT_FAILURE(end - (distance + 1), msg);
+  }
+
+  // Out-of-bounds operator[].
+  {
+    const char* end_msg = reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+                                  : "__bounded_iter::operator[]: Attempt to index an iterator past the end";
+    const char* past_end_msg =
+        reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
+                : "__bounded_iter::operator[]: Attempt to index an iterator past the end";
+    const char* past_start_msg =
+        reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end"
+                : "__bounded_iter::operator[]: Attempt to index an iterator past the start";
+    TEST_LIBCPP_ASSERT_FAILURE(begin[distance], end_msg);
+    TEST_LIBCPP_ASSERT_FAILURE(begin[distance + 1], past_end_msg);
+    TEST_LIBCPP_ASSERT_FAILURE(begin[-1], past_start_msg);
+    TEST_LIBCPP_ASSERT_FAILURE(begin[-99], past_start_msg);
+
+    auto it = begin + 1;
+    TEST_LIBCPP_ASSERT_FAILURE(it[distance - 1], end_msg);
+    TEST_LIBCPP_ASSERT_FAILURE(it[distance], past_end_msg);
+    TEST_LIBCPP_ASSERT_FAILURE(it[-2], past_start_msg);
+    TEST_LIBCPP_ASSERT_FAILURE(it[-99], past_start_msg);
+  }
+}
+
 int main(int, char**) {
     // span<T>::iterator
     {
         Foo array[] = {{0}, {1}, {2}};
         std::span<Foo> const span(array, 3);
-        {
-            auto it = span.end();
-            TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.end();
-            TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.begin();
-            TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
-        }
+        test_iterator(span.begin(), span.end(), /*reverse=*/false);
     }
 
     // span<T, N>::iterator
     {
         Foo array[] = {{0}, {1}, {2}};
         std::span<Foo, 3> const span(array, 3);
-        {
-            auto it = span.end();
-            TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.end();
-            TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.begin();
-            TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
-        }
+        test_iterator(span.begin(), span.end(), /*reverse=*/false);
     }
 
     // span<T>::reverse_iterator
     {
         Foo array[] = {{0}, {1}, {2}};
         std::span<Foo> const span(array, 3);
-        {
-            auto it = span.rend();
-            TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.rend();
-            TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.rbegin();
-            TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-        }
+        test_iterator(span.rbegin(), span.rend(), /*reverse=*/true);
     }
 
     // span<T, N>::reverse_iterator
     {
         Foo array[] = {{0}, {1}, {2}};
         std::span<Foo, 3> const span(array, 3);
-        {
-            auto it = span.rend();
-            TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.rend();
-            TEST_LIBCPP_ASSERT_FAILURE(it->x, "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
-        }
-        {
-            auto it = span.rbegin();
-            TEST_LIBCPP_ASSERT_FAILURE(it[3], "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-        }
+        test_iterator(span.rbegin(), span.rend(), /*reverse=*/true);
     }
 
     return 0;
diff --git a/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp b/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp
index 5064319a0aee1b..aa466d0b247cd3 100644
--- a/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp
+++ b/libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp
@@ -11,82 +11,141 @@
 // REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
 // UNSUPPORTED: libcpp-hardening-mode=none
 
+#include <iterator>
 #include <string_view>
 
 #include "check_assertion.h"
 
-int main(int, char**) {
-  // string_view::iterator
+template <typename Iter>
+void test_iterator(Iter begin, Iter end, bool reverse) {
+  ptrdiff_t distance = std::distance(begin, end);
+
+  // Dereferencing an iterator at the end.
   {
-    std::string_view const str("hello world");
-    {
-      auto it = str.end();
-      TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-    }
-    {
-      auto it = str.end();
-      TEST_LIBCPP_ASSERT_FAILURE(
-          it.operator->(), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
-    }
-    {
-      auto it = str.begin();
-      TEST_LIBCPP_ASSERT_FAILURE(it[99], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
-    }
+    TEST_LIBCPP_ASSERT_FAILURE(
+        *end,
+        reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+                : "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
+#if _LIBCPP_STD_VER >= 20
+    // In C++20 mode, std::reverse_iterator implements operator->, but not operator*, with
+    // std::prev instead of operator--. std::prev ultimately calls operator+
+    TEST_LIBCPP_ASSERT_FAILURE(
+        end.operator->(),
+        reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"
+                : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#else
+    TEST_LIBCPP_ASSERT_FAILURE(
+        end.operator->(),
+        reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+                : "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
+#endif
   }
 
-  // string_view::const_iterator
+  // Incrementing an iterator past the end.
   {
-    std::string_view const str("hello world");
-    {
-      auto it = str.cend();
-      TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-    }
-    {
-      auto it = str.cend();
-      TEST_LIBCPP_ASSERT_FAILURE(
-          it.operator->(), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
-    }
-    {
-      auto it = str.cbegin();
-      TEST_LIBCPP_ASSERT_FAILURE(it[99], "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
-    }
+    const char* msg = reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start"
+                              : "__bounded_iter::operator++: Attempt to advance an iterator past the end";
+    auto it         = end;
+    TEST_LIBCPP_ASSERT_FAILURE(it++, msg);
+    it = end;
+    TEST_LIBCPP_ASSERT_FAILURE(++it, msg);
   }
 
-  // string_view::reverse_iterator
+  // Decrementing an iterator past the start.
   {
-    std::string_view const str("hello world");
-    {
-      auto it = str.rend();
-      TEST_LIBCPP_ASSERT_FAILURE(*it, "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-    }
-    {
-      auto it = str.rend();
-      TEST_LIBCPP_ASSERT_FAILURE(
-          it.operator->(), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
-    }
-    {
-      auto it = str.rbegin();
-      TEST_LIBCPP_ASSERT_FAILURE(it[99], "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
-    }
+    const char* msg = reverse ? "__bounded_iter::operator++: Attempt to advance an iterator past the end"
+                              : "__bounded_iter::operator--: Attempt to rewind an iterator past the start";
+    auto it         = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(it--, msg);
+    it = begin;
+    TEST_LIBCPP_ASSERT_FAILURE(--it, msg);
   }
 
-  // string_view::const_reverse_iterator
+  // Advancing past the end with operator+= and operator+.
+  {
+    const char* msg = reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"
+                              : "__bounded_iter::operator+=: Attempt to advance an iterator past the end";
+    auto it         = end;
+ ...
[truncated]

Copy link

github-actions bot commented Jan 21, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@davidben
Copy link
Contributor Author

There's also something funny going on with std::ranges::find which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities.

See #78882

Previously, __bounded_iter only checked operator*. It allowed the
pointer to go out of bounds with operator++, etc., and relied on
operator* (which checked begin <= current < end) to handle everything.
This has several unfortunate consequences:

First, pointer arithmetic is UB if it goes out of bounds. So by the time
operator* checks, it may be too late and the optimizer may have done
something bad. Checking both operations is safer.

Second, std::copy and friends currently bypass bounded iterator checks.
I think the only hope we have to fix this is to key on iter + n doing a
check. See llvm#78771 for further discussion. Note this PR is not sufficient
to fix this. It adds the output bounds check, but ends up doing it after
the memmove, which is too late.

Finally, doing these checks is actually *more* optimizable. See llvm#78829,
which is fixed by this PR. Keeping the iterator always in bounds means
operator* can rely on some invariants and only needs to check
current != end. This aligns better with the ways that callers tend to
condition iterator operations. At least for the common case of iterating
over the entire

See https://godbolt.org/z/vEWrWEf8h for how this new __bounded_iter
impacts compiler output. The old __bounded_iter injected checks inside
the loops for all the sum() functions, which not only added a check
inside a loop, but also impeded Clang's vectorization. The new
__bounded_iter allows all the checks to be optimized out and we emit the
same code as if it wasn't here.

Not everything is ideal however. add_and_deref ends up emitting two
comparisons now instead of one. This is because a missed optimization in
Clang. I've filed llvm#78875 for that. I suspect (with no data)  that this
PR is still a net performance win because impeding ranged-for loops is
particularly egregious. But ideally we'd fix the optimizer and make
add_and_deref fine too.

There's also something funny going on with std::ranges::find which I
have not yet figured out yet, but I suspect there are some further
missed optimization opportunities.

Fixes llvm#78829.
Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just spoke with @var-const and this approach is growing on me more and more. I was uneasy about relying on "implicit" optimization hints at first, but the more I look at it the more it seems that these can actually become explicit optimization hints.

So let's say we have

template <random_access_iterator Iterator>
void algorithm(Iterator it, size_t n) {
  (void)it + n;

  // do something real
}

Without any hardening, we are basically telling the compiler (quite explicitly) that we expect it + n to be valid, and that it can assume that because otherwise we just invoked UB. With hardening, this is still true except that in addition we now validate that the assumption holds. I find this approach to be nice, generic and explicit, and I like it a lot.

I'd like @var-const to give a thumbs up too before this ships, but this LGTM with a few comments.

// within a [begin, end] range and carries these bounds with it. The iterator
// ensures that it is pointing within [begin, end) range when it is
// dereferenced. It also ensures that it is never iterated outside of
// [begin, end].
template <class _Iterator, class = __enable_if_t< __libcpp_is_contiguous_iterator<_Iterator>::value > >
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this patch: can we do it for any random access iterator? Is there any reason why we need a contiguous iterator?

Copy link
Contributor Author

@davidben davidben Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would work with every random access iterator, in the sense of producing coherent behavior, but there are a couple of subtleties:

  1. A general random access iterator (I'm thinking std::map) may have slightly different performance properties. In particular, I don't think we'd want to write begin + n in <algorithm> with std::map because that'll end up walking the iterator a second time, and I'm not positive the first walk can be easily deleted. And since we don't unwrap the map iterators in std::copy, it's not necessary to check ahead of time. We can just lean on the wrapped operator++ check.

  2. Replacing iter with iter2 = bounded(begin, iter, end) means that iter2 becomes invalidated whenever end gets invalidated, not just iter. That's fine for contiguous ranges because the end iterator is just another pointer into the allocation. But the more complex containers, irritatingly, have different invalidation rules for end and general iterators! See Hardening checks should detect dereferencing map.end() and set.end() #80212 (comment) where I considered and then sadly had to reject basically this strategy for std::map.

  3. We may not need to store both begin and end for a general random access iterator. For example, if we ignored the iterator invalidation problem above, std::map only needs (iter, end). begin is only used to stop an errant operator--, and we can already detect that from the tree structure itself. (The only reason we can't detect end is that libc++ is extra clever and doesn't even have parent and right pointers on the end node at all. So attempting to read them will immediately go OOB.)

So my inclination would be to wait to see how we decide to solve the problem for non-contiguous random access and then make a decision about whether to generalize __bounded_iter then. My guess would be that we wouldn't end up wanting to reuse it anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random access iterators are required to have O(1) complexity for distance and other operations like it + n, so std::map::iterator is bidirectional, but not random-access. I get your point about maybe not wanting to implement __bounded_iter the same all the time, but I do think that the properties we use that hold for contiguous iterators also hold for random access iterators. Needs to be confirmed though.

Copy link
Contributor Author

@davidben davidben Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I hadn't realized random access iterators were not the std::map ones! (I don't actually know the STL very well at the level of iterator categories.). Yeah, maybe it would be useful to generalize it? I guess we can see when we get to hardening some random-access, non-contiguous iterator. Since __bounded_iter isn't public API or anything, we only need to care about what libc++'s own containers do.

Either way, as you say, unrelated to this patch. 😁

@var-const
Copy link
Member

Nit: in the description, there seems to be an unfinished sentence:

At least for the common case of iterating over the entire [...]

Copy link
Member

@var-const var-const left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I like this direction. I'm sorry it's taken me a long time to get to reviewing this patch, but I wanted to give it the attention it deserves. I haven't examined the Godbolt yet, but I wanted to share some initial thoughts in-between.

  1. Consider the following (bad) code:
void foo(std::span<int> s) {
  {
    auto i = s.begin() + get_offset();
    if (i >= s.end()) return;
    // ...
  }
  {
    auto j = s.begin() + get_offset();
    if (j > s.end()) j = s.end();
    // ...
  }
}

I think this is technically UB, but it's likely to be "benign" UB (I call these hobbit iterators because they go there and back again). The current implementation wouldn't reject this case, but with this patch, it will start trapping if get_offset returns a value beyond bounds. I think part of the reason __bounded_iter is currently implemented this way is a concern that projects out in the wild might inadvertently rely on "hobbit" iterators, and rejecting them would cause unnecessary friction on adoption. I don't have any actual data on this -- this might be a largely hypothetical concern. But I want to make sure that if we want to break code like this, we do so consciously.

  1. In general, while it's very important for this type to be performant, I'm a bit concerned if we start to tie the implementation "too much" to the optimizer. I want to be pragmatic here, so obviously some level of attention to the optimizer is necessary. What I'd like to avoid is relying on code that has subtle implicit interactions with the optimizer. For one, I'm concerned that we might end up being tied to the current implementation of the optimizer (and its quirks) and do things that further down the line become unnecessary or even harmful as the optimizer evolves (we also need to take other compilers into account, not only Clang, even though Clang is the more important one). It's also potentially brittle, and we don't currently have any sort of automated benchmarks to catch performance regressions. Specifically, I'm concerned about two things mentioned in this patch:
  • an internal assertion that happened to improve performance by providing additional information to the optimizer. IMO, from a higher-level (logical?) perspective, it never makes sense that having an additional check could improve performance (at best, it could have a negligible negative impact). If we ever want to pass some information to the optimizer, we need to write it in a way that makes it explicit (something similar in spirit to _LIBCPP_ASSUME). It might expand to anything, including an assertion, under the hood, but the API must make it clear that the intention is to help the optimizer, not e.g. abort the program if the condition doesn't hold.
  • the != comparison optimizing better than <= comparison. I don't know if this is a fundamental property that the former optimizes better or it just happens to be so with the current implementation of the optimizer. In either case, the two seem to be largely interchangeable from the logical perspective. We need to be pragmatic here, so we likely need to rely on this, but I just want to see if we can find some way to express this more explicitly (comments definitely help, but perhaps we could find a way to go beyond that).

I want to spend a little more time on the patch just to make sure I don't miss anything, but so far I don't see any significant issues with this approach.

// within a [begin, end] range and carries these bounds with it. The iterator
// ensures that it is pointing within [begin, end) range when it is
// dereferenced. It also ensures that it is never iterated outside of
// [begin, end].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to expand on this a little here?

  1. Do we want to mention that iterating outside these bounds is potentially undefined behavior? For pointers, it's stated quite explicitly in [expr.add#4]:

Otherwise, if P points to an array element i of an array object x with n elements [...], the expressions P + J and J + P [...] point to the (possibly-hypothetical) array element i+j of x if 0 ≤ i+j ≤n [...] Otherwise, the behavior is undefined.

For iterators, it's a little bit less explicit, but I think the best reference is the table of Cpp17InputIterator requirements which states that the result of incrementing an iterator must be either dereferenceable or point to the past-the-end element (then addition is defined in terms of repeated increment). I can't immediately find the equivalent requirement on input_iterator, but I presume it must be somewhere (or if it's not, I'd expect it to be an omission).

(Note: I say "potentially" because if we have int a[32]; __bounded_iter i(a, a + 5);, IIUC calling i + 20 is not UB since it's still within the bounds of the actual underlying array, but the bounded iterator has no way of knowing that so it would still assert)

  1. Do we want to add any discussion on why it's desirable to disallow creating invalid iterators, even if they are never dereferenced? I don't feel very strongly because git blame can always bring a curious reader to this patch with all the discussion and links, but perhaps a brief rationale would still be helpful.

Copy link
Member

@ldionne ldionne Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is something like this library-level UB?

int array[32];
std::span<int> span(array);
std::span<int> sub = span.subspan(0, 16);
// is this library UB?
auto it = sub.begin() + 20;
*it;

This is definitely not language-level UB because we're still within the int[32], but I think this is library-level UB because as far as sub is concerned, it is beyond sub.end(). If this is not library UB, then there's a real potential concern with the overall notion of using bounded iterators with span cause it will definitely find cases of benign UB in existing code. I am pretty certain we confirmed this was library UB a long time ago but we should confirm again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to expand on this a little here?

Done.

Do we want to add any discussion on why it's desirable to disallow creating invalid iterators,

Also done.

Is something like this library-level UB?

That is a good question! I think it is indeed library-level UB, per the table that @var-const pointed at, but I didn't scour the spec carefully.

Alas, this means that bounded iterators on std::span and std::string_view do indeed introduce crashes that weren't strictly necessary. In fact, @danakj and I just ran into an analogous issue with std::string_view iterators in Chromium and have to go back and rewrite it! 😁 However, I cannot think of any reasonable way to bounds-check iterators without paying that one, so I think we'll just have to run with it and hope it's not too bad.

Either way, I think this is unrelated to this PR. The existing bounded iterators design already breaks that code. (Which is how we managed to run into it.) This PR does newly break cases where you compute sub.begin() + 20 without dereferencing it, but I think almost all code that relies on interchanging subspan or substr iterators is also going to dereference it, so that delta isn't likely to be meaningful.

@@ -151,6 +162,10 @@ struct __bounded_iter {
}

_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator-=(difference_type __n) _NOEXCEPT {
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be any value in implementing this in terms of operator+=? Or would that make the call stack in case of a failed assertion too confusing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that requires negating __n, which may hit UB in itself. :-(

};

template <typename Iter>
void test_iterator(Iter begin, Iter end, bool reverse) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The errors produced by a reverse iterator can be a bit confusing for the user. @ldionne Do you think this is worth the trouble to try to fix? I suppose we could implement a separate checked reverse iterator type, but perhaps that's overkill.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think the test is fine as-is, those are implementation specific anyway, so this is good enough (IMO).

@davidben
Copy link
Contributor Author

davidben commented Feb 16, 2024

Nit: in the description, there seems to be an unfinished sentence: [...]

Thanks! Rephrased.

Consider the following (bad) code: [...]
I think this is technically UB, but it's likely to be "benign" UB (I call these hobbit iterators because they go there and back again). The current implementation wouldn't reject this case, but with this patch, it will start trapping if get_offset returns a value beyond bounds.

I actually disagree that this is "benign" UB. When this function is inlined at a point where the compiler can see the bounds of the allocation, the out-of-bounds pointer arithmetic is very visible to the compiler. The consequence of this analysis is that the compiler learns bounds on the pointers, which would then allow it to delete the bounds-checks we wanted to add.

Moreover, by assuming this doesn't happen, we put ourselves in a very precarious position. We want the optimizer to delete redundant bounds checks, and we want to avoid, as you say, tying the implementation too much to one particular optimizer. Those pressures mean we will want optimizers to get increasingly good at bounds analysis. But if we exhibit this UB, we will simultaneously want them to not get too good at it. I don't think that's a good place to be in.

Indeed, see #78784. That is the missing observation that would have allowed Clang to optimize the current __bounded_iter. That observation relies on pointer arithmetic UB.

I would also argue that the additional checks are actually more optimizer-agnostic than the current ones, but I'll discuss that below.

I think part of the reason __bounded_iter is currently implemented this way is a concern that projects out in the wild might inadvertently rely on "hobbit" iterators, and rejecting them would cause unnecessary friction on adoption. I don't have any actual data on this -- this might be a largely hypothetical concern. But I want to make sure that if we want to break code like this, we do so consciously.

Absolutely. I think should consciously break code like this:

  1. I expect hobbit iterators are pretty rare, much rarer than hobbit pointers.
  2. I really don't think this UB is benign (see above), so hobbit iterators and pointers are dubiously compatible with bounds-checking.
  3. Not only is the new __bounded_iter better optimized in practice, it is better optimized because it aligns better with C++ iterator patterns. That is, it is inherently easier to optimize, not specific to one optimizer (see below).

I think those three are strong evidence that this change is worth the risk of impacting hobbit iterators.

In general, while it's very important for this type to be performant, I'm a bit concerned if we start to tie the implementation "too much" to the optimizer. I want to be pragmatic here, so obviously some level of attention to the optimizer is necessary. What I'd like to avoid is relying on code that has subtle implicit interactions with the optimizer.

Could you elaborate a bit more on this concern? To me, assertions that are more straightforwardly deletable are less tied to the optimizer, because it's more likely to be universal. Whereas assertions that rely on more complex reasoning that our optimizer happen to handle would be more tied to it. This PR improves the situation by this criteria!

an internal assertion that happened to improve performance by providing additional information to the optimizer.

Hmm, I'm not quite following this one. Could you elaborate? I suspect we may not be on the same page about why this new version is easier to optimize, so let me continue on and perhaps that'll help?

IMO, from a higher-level (logical?) perspective, it never makes sense that having an additional check could improve performance (at best, it could have a negligible negative impact).

I would push back a bit on this. I don't think this higher-level intuition is correct. The additional checks move the assertions closer to their preconditions. That is inherently easier for any optimizer to handle.

If we ever want to pass some information to the optimizer, we need to write it in a way that makes it explicit (something similar in spirit to _LIBCPP_ASSUME). It might expand to anything, including an assertion, under the hood, but the API must make it clear that the intention is to help the optimizer, not e.g. abort the program if the condition doesn't hold.

So, as discussed above, we specifically want to abort the program because otherwise we'll trigger non-benign UB and get ourselves into problems. But also _LIBCPP_ASSUME is currently a no-op because of an LLVM bug. :-) I do think we need a _LIBCPP_ACTUALLY_ASSUME to make other parts of bounded iterators

the != comparison optimizing better than <= comparison. I don't know if this is a fundamental property that the former optimizes better or it just happens to be so with the current implementation of the optimizer.

(The comparison is actually <, not <=. We want to reject iter == end.)

This is a fundamental property, both of how C++ is used and just what it means to optimize a check out. In order to delete a check, be it != or <, the compiler must prove it always holds. Providing a != b is inherently easier than proving a < b because it's a weaker statement to prove.

Moreover, the way C++ iterators work, != is the natural theorem to prove. When you use iterators, in a range-for loop or elsewhere, the calling code writes iter != end, not iter < end. Think about what a ranged-for loop expands to:

auto it = c.begin();
while (it != c.end()) {
  f(*it);
  ++it;
}

In the current __bounded_iter, this becomes:

auto it = c.begin();
while (it != c.end()) {
  assert(c.begin() <= it && it < c.end());
  f(*it);
  ++it;
}

This is not an easy theorem to prove. In order to delete the check, the compiler must know that c.begin() <= c.end() and it doesn't know that right now. In order to know that... it must exploit pointer arithmetic UB (c.end() is data + size. Again, see #78784. I keep pointing at that one for a reason), the very pointer arithmetic UB that the current __bounded_iter incorrectly assumes is benign!

Now consider the new __bounded_iter:

auto it = c.begin();
while (it != c.end()) {
  assert(it != c.end());
  f(*it);
  assert(it != c.end());
  ++it;
}

This is trivial to optimize. The compiler does not need any complex analysis at all, because we've the assertions exactly match the preconditions that the caller is already expected to use.

Now, in more complicated cases, we need the compiler to infer < too. That is, however, harder for the compiler and requires it know something about the relationship between begin and end. I think we should give it that relationship, with perhaps with _LIBCPP_ACTUALLY_ASSUME. But that does not change the fact that our iterators should work with !=, for all the reasons discussed here. That is, this PR makes sense independent of any future smarter optimization work.

@davidben
Copy link
Contributor Author

Aha, found it! I thought I'd written all the above already before. See also #78771 (comment), which discussed this already, but probably got lost in a random other issue.

A lot of existing files were not formatted correctly...
Copy link
Contributor

@hawkinsw hawkinsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to comment only with nits. Trying to be helpful despite a lack of technical expertise in the area. Thank you for doing this hard work!

@var-const
Copy link
Member

Thank you for writing up the detailed rationale! I'm convinced, and I think it's very valuable to have this decision written for potential future reference.

...by assuming this doesn't happen, we put ourselves in a very precarious position. We want the optimizer to delete redundant bounds checks, and we want to avoid, as you say, tying the implementation too much to one particular optimizer. Those pressures mean we will want optimizers to get increasingly good at bounds analysis. But if we exhibit this UB, we will simultaneously want them to not get too good at it. I don't think that's a good place to be in.

I think this is a very good way to put it.

Providing a != b is inherently easier than proving a < b because it's a weaker statement to prove.

Putting this and the part I quoted above, it seems to come down to "Ensuring that bounded iterators are always valid (point to a valid array element or to the one-past-the-end element) is a stronger invariant, and thus can be expected to optimize better and should never optimize worse". This seems like a good intuitive rationale to me (but please correct me if I'm simplifying things too much here).

Could you elaborate a bit more on this concern? To me, assertions that are more straightforwardly deletable are less tied to the optimizer, because it's more likely to be universal. Whereas assertions that rely on more complex reasoning that our optimizer happen to handle would be more tied to it. This PR improves the situation by this criteria!

There was a comment (sorry, I can't immediately find it, might have been a different PR or issue) that mentioned that the following assertion in the constructor of the bounded iterator:

_LIBCPP_ASSERT_INTERNAL(__begin <= __end, "__bounded_iter(current, begin, end): [begin, end) is not a valid range");

used to be enabled in the fast mode and lead to better codegen (because it established an invariant that the optimizer could rely on). My concern was prompted by this specific part, but it's more general (and, to be clear, it's hypothetical -- this is a potential situation that I hope we won't find ourselves in, not a comment on the changes in this patch).

I would push back a bit on this. I don't think this higher-level intuition is correct. The additional checks move the assertions closer to their preconditions. That is inherently easier for any optimizer to handle.

What I'm trying to express is that, from the language specification perspective, I think it's fair to say the optimizer doesn't exist. Or put another way, the Standard deals with the abstract machine, and IIRC, the abstract machine doesn't really have a notion of an optimizer, or at least not in any well-defined, structured way. Of course, the abstract machine is a big simplification that doesn't also have a notion of many other vital things -- this is in no way to say that the optimizer isn't important. But I think it provides a useful level of abstraction, and at that level of abstraction it makes sense to talk about the overhead or performance of individual operations, but I think it's fair to say there is no way one operation can indirectly affect another.

In that regard, it makes complete sense to me to say "between these two possible ways of writing a certain assertion, this one would have better performance". This analysis might break the abstraction a little bit, but at the end of the day we know that the optimizer is there, so I think that's quite acceptable. Fundamentally, we're still within the model where we analyze the effect (including the performance effect) of standalone independent statements. However, I feel that a statement like "Adding a condition statement here has an indirect effect on the following statement and makes it more efficient" is simply incompatible with this model. This is the source of my concern.

At the same time, of course we have to be pragmatic and performance is crucial here. I just feel that if we need to start relying on indirect effects of some statements (not in this patch, but it seems likely that this will happen in the future) -- "indirect" from the perspective of the abstract machine -- we need to find some way to formalize it and make it visible to the reader. When I read an assertion, I presume it literally means "Check the condition and halt the program if it doesn't hold". In reality, it might also provide crucial information to the optimizer, but that's an indirect effect that's invisible at the source code level. Comments can help a lot, but if we can think of a way to make the actual intent visible in the code, that would be great.

Once again, this is just something that I think we should be mindful of as we go, not an immediate concern with this patch.

@var-const
Copy link
Member

LGTM; thanks a lot for working on this! (FWIW, this is something I wanted to do for a long time, and I'm very happy to see the detailed analysis in this patch that shows that this approach indeed results in better performance)

There is one thing that would be awesome to check, but I don't want to block the patch on this. It would be great to see how well this patch plays with GCC. To be clear, for our purposes Clang is the primary compiler. However, the comparison might be valuable to double-check our reasoning about the "optimizability". If we notice something that optimizes better on Clang but worse on GCC, it might be a GCC issue (which we don't need to worry about), but it might also indicate that we rely too much on the specifics of the Clang optimizer. So I don't think it makes sense to do a very thorough analysis, but a quick "smoke check" might be interesting.

There is a way to use our implementation with GCC on Godbolt (I learned this from @philnik777) -- essentially you can use the -### flag to get the exact Clang invocation that would contain the absolute paths to Godbolt's libc++ installation, then provide those paths as -isystem in tandem with -nostdinc++. I think this link should work. From a quick glance, it seems that sum1 and sum2 completely optimize away just like on Clang, sum3 optimizes better but still not entirely, deref_end is equally good, add_and_deref becomes worse (probably same as on Clang); find and find2 would require some analysis, and copy_span seems to become much worse (whereas the previous implementation was somehow completely optimized away).

Once again, I absolutely don't think we should spend a lot of time on this, but it might be valuable if you could take a quick look in case anything stands out.

@davidben
Copy link
Contributor Author

On phone so haven't looked carefully, but copy_span is a bit misleading of a baseline. I actually included it to demonstrate a different problem: we're not supposed to optimize everything out! For bounds safety, there should be a check that the destination is at least as large as the source. But the unwrapping mechanism in libc++ ends up bypassing the safety check before the compiler even sees it. This is #78771.

With this PR, the rewrap step ends up restoring the check, sort of. It happens after the copy instead of before, which is too late. I was thinking we'd fix that by first doing this, then adding some iterator arithmetic in front to get the check in the right order. But I haven't looked very carefully yet at the codegen just because it's still not doing the right thing anyway.

@ldionne
Copy link
Member

ldionne commented Mar 7, 2024

@davidben Are we waiting for something specific before we merge this? I think we'd be fine with merging this as-is and making any required improvements as follow-ups, since I don't think the performance characteristics of GCC would realistically change the outcome of this patch.

@davidben
Copy link
Contributor Author

davidben commented Mar 7, 2024

@ldionne Oh I was going to ask you all the same thing. :-) Yeah, nothing on my end. As far as I'm concerned, this is all ready to merge! I left a comment above for why copy_span is kind of a red herring.

@ldionne
Copy link
Member

ldionne commented Mar 7, 2024

I just updated it on to the latest main to trigger the CI. Let's merge this when this is green.

@var-const
Copy link
Member

The CI failure looks unrelated.

@var-const
Copy link
Member

The one CI failure is just an infrastructure issue, the test suite ran fine. I'll merge now -- thanks a lot for the patch!

@var-const var-const merged commit a83f8e0 into llvm:main Mar 12, 2024
var-const pushed a commit that referenced this pull request Jul 23, 2024
…ing (#78929)

~~NB: This PR depends on #78876. Ignore the first commit when reviewing,
and don't merge it until #78876 is resolved. When/if #78876 lands, I'll
clean this up.~~

This partially restores parity with the old, since removed debug build.
We now can re-enable a bunch of the disabled tests. Some things of note:

- `bounded_iter`'s converting constructor has never worked. It needs a
friend declaration to access the other `bound_iter` instantiation's
private fields.

- The old debug iterators also checked that callers did not try to
compare iterators from different objects. `bounded_iter` does not
currently do this, so I've left those disabled. However, I think we
probably should add those. See
#78771 (comment)

- The `std::vector` iterators are bounded up to capacity, not size. This
makes for a weaker safety check. This is because the STL promises not to
invalidate iterators when appending up to the capacity. Since we cannot
retroactively update all the iterators on `push_back()`, I've instead
sized it to the capacity. This is not as good, but at least will stop
the iterator from going off the end of the buffer.

There was also no test for this, so I've added one in the `std`
directory.

- `std::string` has two ambiguities to deal with. First, I opted not to
size it against the capacity. https://eel.is/c++draft/string.require#4
says iterators are invalidated on an non-const operation. Second,
whether the iterator can reach the NUL terminator. The previous debug
tests and the special-case in https://eel.is/c++draft/string.access#2
suggest no. If either of these causes widespread problems, I figure we
can revisit.

- `resize_and_overwrite.pass.cpp` assumed `std::string`'s iterator
supported `s.begin().base()`, but I see no promise of this in the
standard. GCC also doesn't support this. I fixed the test to use
`std::to_address`.

- `alignof.compile.pass.cpp`'s pointer isn't enough of a real pointer.
(It needs to satisfy `NullablePointer`, `LegacyRandomAccessIterator`,
and `LegacyContiguousIterator`.) `__bounded_iter` seems to instantiate
enough to notice. I've added a few more bits to satisfy it.

Fixes #78805
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…ing (#78929)

~~NB: This PR depends on #78876. Ignore the first commit when reviewing,
and don't merge it until #78876 is resolved. When/if #78876 lands, I'll
clean this up.~~

This partially restores parity with the old, since removed debug build.
We now can re-enable a bunch of the disabled tests. Some things of note:

- `bounded_iter`'s converting constructor has never worked. It needs a
friend declaration to access the other `bound_iter` instantiation's
private fields.

- The old debug iterators also checked that callers did not try to
compare iterators from different objects. `bounded_iter` does not
currently do this, so I've left those disabled. However, I think we
probably should add those. See
#78771 (comment)

- The `std::vector` iterators are bounded up to capacity, not size. This
makes for a weaker safety check. This is because the STL promises not to
invalidate iterators when appending up to the capacity. Since we cannot
retroactively update all the iterators on `push_back()`, I've instead
sized it to the capacity. This is not as good, but at least will stop
the iterator from going off the end of the buffer.

There was also no test for this, so I've added one in the `std`
directory.

- `std::string` has two ambiguities to deal with. First, I opted not to
size it against the capacity. https://eel.is/c++draft/string.require#4
says iterators are invalidated on an non-const operation. Second,
whether the iterator can reach the NUL terminator. The previous debug
tests and the special-case in https://eel.is/c++draft/string.access#2
suggest no. If either of these causes widespread problems, I figure we
can revisit.

- `resize_and_overwrite.pass.cpp` assumed `std::string`'s iterator
supported `s.begin().base()`, but I see no promise of this in the
standard. GCC also doesn't support this. I fixed the test to use
`std::to_address`.

- `alignof.compile.pass.cpp`'s pointer isn't enough of a real pointer.
(It needs to satisfy `NullablePointer`, `LegacyRandomAccessIterator`,
and `LegacyContiguousIterator`.) `__bounded_iter` seems to instantiate
enough to notice. I've added a few more bits to satisfy it.

Fixes #78805
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hardening Issues related to the hardening effort libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ranged-for loop with _LIBCPP_ABI_BOUNDED_ITERATORS does not optimize runtime checks
5 participants