-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[libc++][hardening] Use bounded iterators in std::vector and std::string #78929
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-libcxx Author: David Benjamin (davidben) ChangesNB: This PR depends on #78876. Ignore the first commit when reviewing, and don't merge it until #78876 is resolved. When/if #78876 lands, I'll clean this up. This partially restores parity with the old, since removed debug build. We now can re-enable a bunch of the disabled tests. Some things of note:
Fixes #78805 Patch is 51.15 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78929.diff 20 Files Affected:
diff --git a/libcxx/include/__iterator/bounded_iter.h b/libcxx/include/__iterator/bounded_iter.h
index 2a667648871c4c1..91ebe39d1ba57b6 100644
--- a/libcxx/include/__iterator/bounded_iter.h
+++ b/libcxx/include/__iterator/bounded_iter.h
@@ -31,13 +31,10 @@ _LIBCPP_BEGIN_NAMESPACE_STD
// Iterator wrapper that carries the valid range it is allowed to access.
//
// This is a simple iterator wrapper for contiguous iterators that points
-// within a [begin, end) range and carries these bounds with it. The iterator
-// ensures that it is pointing within that [begin, end) range when it is
-// dereferenced.
-//
-// Arithmetic operations are allowed and the bounds of the resulting iterator
-// are not checked. Hence, it is possible to create an iterator pointing outside
-// its range, but it is not possible to dereference it.
+// within a [begin, end] range and carries these bounds with it. The iterator
+// ensures that it is pointing within [begin, end) range when it is
+// dereferenced. It also ensures that it is never iterated outside of
+// [begin, end].
template <class _Iterator, class = __enable_if_t< __libcpp_is_contiguous_iterator<_Iterator>::value > >
struct __bounded_iter {
using value_type = typename iterator_traits<_Iterator>::value_type;
@@ -70,18 +67,20 @@ struct __bounded_iter {
private:
// Create an iterator wrapping the given iterator, and whose bounds are described
- // by the provided [begin, end) range.
+ // by the provided [begin, end] range.
//
- // This constructor does not check whether the resulting iterator is within its bounds.
- // However, it does check that the provided [begin, end) range is a valid range (that
- // is, begin <= end).
+ // Except in debug builds, the constructor does not check whether the resulting iterator
+ // is within its bounds. The container is assumed to be self-consistent.
//
// Since it is non-standard for iterators to have this constructor, __bounded_iter must
// be created via `std::__make_bounded_iter`.
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 explicit __bounded_iter(
_Iterator __current, _Iterator __begin, _Iterator __end)
: __current_(__current), __begin_(__begin), __end_(__end) {
- _LIBCPP_ASSERT_INTERNAL(__begin <= __end, "__bounded_iter(current, begin, end): [begin, end) is not a valid range");
+ _LIBCPP_ASSERT_INTERNAL(
+ __begin <= __current, "__bounded_iter(current, begin, end): current and begin are inconsistent");
+ _LIBCPP_ASSERT_INTERNAL(
+ __current <= __end, "__bounded_iter(current, begin, end): current and end are inconsistent");
}
template <class _It>
@@ -91,21 +90,25 @@ struct __bounded_iter {
// Dereference and indexing operations.
//
// These operations check that the iterator is dereferenceable, that is within [begin, end).
+ // The iterator is always within [begin, end], so we only need to check for end. This is
+ // easier to optimize out. See https://github.com/llvm/llvm-project/issues/78829
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator*() const _NOEXCEPT {
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
- __in_bounds(__current_), "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator");
+ __current_ != __end_, "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
return *__current_;
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 pointer operator->() const _NOEXCEPT {
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
- __in_bounds(__current_), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator");
+ __current_ != __end_, "__bounded_iter::operator->: Attempt to dereference an iterator at the end");
return std::__to_address(__current_);
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator[](difference_type __n) const _NOEXCEPT {
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
- __in_bounds(__current_ + __n), "__bounded_iter::operator[]: Attempt to index an iterator out-of-range");
+ __n >= __begin_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator past the start");
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n < __end_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator at or past the end");
return __current_[__n];
}
@@ -114,6 +117,8 @@ struct __bounded_iter {
// These operations do not check that the resulting iterator is within the bounds, since that
// would make it impossible to create a past-the-end iterator.
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator++() _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __current_ != __end_, "__bounded_iter::operator++: Attempt to advance an iterator past the end");
++__current_;
return *this;
}
@@ -124,6 +129,8 @@ struct __bounded_iter {
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator--() _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __current_ != __begin_, "__bounded_iter::operator--: Attempt to rewind an iterator past the start");
--__current_;
return *this;
}
@@ -134,6 +141,10 @@ struct __bounded_iter {
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator+=(difference_type __n) _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n >= __begin_ - __current_, "__bounded_iter::operator+=: Attempt to rewind an iterator past the start");
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n <= __end_ - __current_, "__bounded_iter::operator+=: Attempt to advance an iterator past the end");
__current_ += __n;
return *this;
}
@@ -151,6 +162,10 @@ struct __bounded_iter {
}
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator-=(difference_type __n) _NOEXCEPT {
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n <= __current_ - __begin_, "__bounded_iter::operator-=: Attempt to rewind an iterator past the start");
+ _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+ __n >= __current_ - __end_, "__bounded_iter::operator-=: Attempt to advance an iterator past the end");
__current_ -= __n;
return *this;
}
@@ -197,15 +212,12 @@ struct __bounded_iter {
}
private:
- // Return whether the given iterator is in the bounds of this __bounded_iter.
- _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __in_bounds(_Iterator const& __iter) const {
- return __iter >= __begin_ && __iter < __end_;
- }
-
template <class>
friend struct pointer_traits;
+ template <class, class>
+ friend struct __bounded_iter;
_Iterator __current_; // current iterator
- _Iterator __begin_, __end_; // valid range represented as [begin, end)
+ _Iterator __begin_, __end_; // valid range represented as [begin, end]
};
template <class _It>
diff --git a/libcxx/include/string b/libcxx/include/string
index e97139206d4fa7c..a3a04d42e18adb2 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -579,6 +579,7 @@ basic_string<char32_t> operator""s( const char32_t *str, size_t len );
#include <__functional/unary_function.h>
#include <__fwd/string.h>
#include <__ios/fpos.h>
+#include <__iterator/bounded_iter.h>
#include <__iterator/distance.h>
#include <__iterator/iterator_traits.h>
#include <__iterator/reverse_iterator.h>
@@ -736,9 +737,16 @@ public:
"[allocator.requirements] states that rebinding an allocator to the same type should result in the "
"original allocator");
- // TODO: Implement iterator bounds checking without requiring the global database.
+#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS
+ // The pointer must be passed through __wrap_iter because
+ // __alloc_traits::pointer may not be detected as a continguous iterator on
+ // its own.
+ typedef __bounded_iter<__wrap_iter<pointer>> iterator;
+ typedef __bounded_iter<__wrap_iter<const_pointer>> const_iterator;
+#else
typedef __wrap_iter<pointer> iterator;
typedef __wrap_iter<const_pointer> const_iterator;
+#endif
typedef std::reverse_iterator<iterator> reverse_iterator;
typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
@@ -868,11 +876,27 @@ private:
__init_with_sentinel(std::move(__first), std::move(__last));
}
+#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS
+ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 iterator __make_iterator(pointer __p) {
+ return __make_bounded_iter(
+ __wrap_iter<pointer>(__p),
+ __wrap_iter<pointer>(__get_pointer()),
+ __wrap_iter<pointer>(__get_pointer() + size()));
+ }
+
+ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 const_iterator __make_const_iterator(const_pointer __p) const {
+ return __make_bounded_iter(
+ __wrap_iter<const_pointer>(__p),
+ __wrap_iter<const_pointer>(__get_pointer()),
+ __wrap_iter<const_pointer>(__get_pointer() + size()));
+ }
+#else
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 iterator __make_iterator(pointer __p) { return iterator(__p); }
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 const_iterator __make_const_iterator(const_pointer __p) const {
return const_iterator(__p);
}
+#endif
public:
_LIBCPP_TEMPLATE_DATA_VIS static const size_type npos = -1;
diff --git a/libcxx/include/vector b/libcxx/include/vector
index e9dd57055cb112d..7ce8f957d7d7cb5 100644
--- a/libcxx/include/vector
+++ b/libcxx/include/vector
@@ -326,6 +326,7 @@ template<class T, class charT> requires is-vector-bool-reference<T> // Since C++
#include <__functional/hash.h>
#include <__functional/unary_function.h>
#include <__iterator/advance.h>
+#include <__iterator/bounded_iter.h>
#include <__iterator/distance.h>
#include <__iterator/iterator_traits.h>
#include <__iterator/reverse_iterator.h>
@@ -399,9 +400,16 @@ public:
typedef typename __alloc_traits::difference_type difference_type;
typedef typename __alloc_traits::pointer pointer;
typedef typename __alloc_traits::const_pointer const_pointer;
- // TODO: Implement iterator bounds checking without requiring the global database.
+#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS
+ // The pointer must be passed through __wrap_iter because
+ // __alloc_traits::pointer may not be detected as a continguous iterator on
+ // its own.
+ typedef __bounded_iter<__wrap_iter<pointer>> iterator;
+ typedef __bounded_iter<__wrap_iter<const_pointer>> const_iterator;
+#else
typedef __wrap_iter<pointer> iterator;
typedef __wrap_iter<const_pointer> const_iterator;
+#endif
typedef std::reverse_iterator<iterator> reverse_iterator;
typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
@@ -796,10 +804,30 @@ private:
_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void __append(size_type __n);
_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void __append(size_type __n, const_reference __x);
_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator __make_iter(pointer __p) _NOEXCEPT {
+#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS
+ // Bound the iterator according to the capacity, rather than the size.
+ // Resizing a vector up to the capacity will not invalidate iterators, so,
+ // Without a way to update all live iterators on resize, we must
+ // conservatively bound the iterator by the capacity rather than the size.
+ return __make_bounded_iter(
+ __wrap_iter<pointer>(__p), __wrap_iter<pointer>(this->__begin_), __wrap_iter<pointer>(this->__end_cap()));
+#else
return iterator(__p);
+#endif
}
_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI const_iterator __make_iter(const_pointer __p) const _NOEXCEPT {
+#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS
+ // Bound the iterator according to the capacity, rather than the size.
+ // Resizing a vector up to the capacity will not invalidate iterators, so,
+ // Without a way to update all live iterators on resize, we must
+ // conservatively bound the iterator by the capacity rather than the size.
+ return __make_bounded_iter(
+ __wrap_iter<const_pointer>(__p),
+ __wrap_iter<const_pointer>(this->__begin_),
+ __wrap_iter<const_pointer>(this->__end_cap()));
+#else
return const_iterator(__p);
+#endif
}
_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void
__swap_out_circular_buffer(__split_buffer<value_type, allocator_type&>& __v);
diff --git a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.add.pass.cpp b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.add.pass.cpp
index 42021824ce6aec5..1ccc71c6c1512b3 100644
--- a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.add.pass.cpp
+++ b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.add.pass.cpp
@@ -10,8 +10,8 @@
// Add to iterator out of bounds.
-// REQUIRES: has-unix-headers
-// UNSUPPORTED: !libcpp-has-legacy-debug-mode, c++03
+// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
+// UNSUPPORTED: libcpp-hardening-mode=none, c++03
#include <vector>
#include <cassert>
@@ -19,27 +19,39 @@
#include "check_assertion.h"
#include "min_allocator.h"
+template <typename T, typename A>
+void fill_to_capacity(std::vector<T, A>& vec) {
+ // Fill vec up to its capacity. Our bounded iterators currently unable to
+ // catch accesses between size and capacity due to iterator stability
+ // guarantees. This function clears those effects.
+ while (vec.size() < vec.capacity()) {
+ vec.push_back(T());
+ }
+}
+
int main(int, char**) {
{
typedef int T;
typedef std::vector<T> C;
C c(1);
+ fill_to_capacity(c);
C::iterator i = c.begin();
- i += 1;
+ i += c.size();
assert(i == c.end());
i = c.begin();
- TEST_LIBCPP_ASSERT_FAILURE(i + 2, "Attempted to add/subtract an iterator outside its valid range");
+ TEST_LIBCPP_ASSERT_FAILURE(i + 2, "__bounded_iter::operator+=: Attempt to advance an iterator past the end");
}
{
typedef int T;
typedef std::vector<T, min_allocator<T> > C;
C c(1);
+ fill_to_capacity(c);
C::iterator i = c.begin();
- i += 1;
+ i += c.size();
assert(i == c.end());
i = c.begin();
- TEST_LIBCPP_ASSERT_FAILURE(i + 2, "Attempted to add/subtract an iterator outside its valid range");
+ TEST_LIBCPP_ASSERT_FAILURE(i + 2, "__bounded_iter::operator+=: Attempt to advance an iterator past the end");
}
return 0;
diff --git a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.decrement.pass.cpp b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.decrement.pass.cpp
index d134527a967e58c..3f6092d9208f8f6 100644
--- a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.decrement.pass.cpp
+++ b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.decrement.pass.cpp
@@ -10,8 +10,8 @@
// Decrement iterator prior to begin.
-// REQUIRES: has-unix-headers
-// UNSUPPORTED: !libcpp-has-legacy-debug-mode, c++03
+// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
+// UNSUPPORTED: libcpp-hardening-mode=none, c++03
#include <vector>
#include <cassert>
@@ -27,7 +27,7 @@ int main(int, char**) {
C::iterator i = c.end();
--i;
assert(i == c.begin());
- TEST_LIBCPP_ASSERT_FAILURE(--i, "Attempted to decrement a non-decrementable iterator");
+ TEST_LIBCPP_ASSERT_FAILURE(--i, "__bounded_iter::operator--: Attempt to rewind an iterator past the start");
}
{
@@ -37,7 +37,7 @@ int main(int, char**) {
C::iterator i = c.end();
--i;
assert(i == c.begin());
- TEST_LIBCPP_ASSERT_FAILURE(--i, "Attempted to decrement a non-decrementable iterator");
+ TEST_LIBCPP_ASSERT_FAILURE(--i, "__bounded_iter::operator--: Attempt to rewind an iterator past the start");
}
return 0;
diff --git a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.dereference.pass.cpp b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.dereference.pass.cpp
index 918cdd74b7916c7..a6e652de154893a 100644
--- a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.dereference.pass.cpp
+++ b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.dereference.pass.cpp
@@ -10,29 +10,41 @@
// Dereference non-dereferenceable iterator.
-// REQUIRES: has-unix-headers
-// UNSUPPORTED: !libcpp-has-legacy-debug-mode, c++03
+// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
+// UNSUPPORTED: libcpp-hardening-mode=none, c++03
#include <vector>
#include "check_assertion.h"
#include "min_allocator.h"
+template <typename T, typename A>
+void fill_to_capacity(std::vector<T, A>& vec) {
+ // Fill vec up to its capacity. Our bounded iterators currently unable to
+ // catch accesses between size and capacity due to iterator stability
+ // guarantees. This function clears those effects.
+ while (vec.size() < vec.capacity()) {
+ vec.push_back(T());
+ }
+}
+
int main(int, char**) {
{
typedef int T;
typedef std::vector<T> C;
C c(1);
+ fill_to_capacity(c);
C::iterator i = c.end();
- TEST_LIBCPP_ASSERT_FAILURE(*i, "Attempted to dereference a non-dereferenceable iterator");
+ TEST_LIBCPP_ASSERT_FAILURE(*i, "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
}
{
typedef int T;
typedef std::vector<T, min_allocator<T> > C;
C c(1);
+ fill_to_capacity(c);
C::iterator i = c.end();
- TEST_LIBCPP_ASSERT_FAILURE(*i, "Attempted to dereference a non-dereferenceable iterator");
+ TEST_LIBCPP_ASSERT_FAILURE(*i, "__bounded_iter::operator*: Attempt to dereference an iterator at the end");
}
return 0;
diff --git a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.increment.pass.cpp b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.increment.pass.cpp
index d3e4b4ec3143f1b..b059e9607366767 100644
--- a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.increment.pass.cpp
+++ b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.increment.pass.cpp
@@ -10,8 +10,8 @@
// Increment iterator past end.
-// REQUIRES: has-unix-headers
-// UNSUPPORTED: !libcpp-has-legacy-debug-mode, c++03
+// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
+// UNSUPPORTED: libcpp-hardening-mode=none, c++03
#include <vector>
#include <cassert>
@@ -19,25 +19,37 @@
#include "check_assertion.h"
#include "min_allocator.h"
+template <typename T, typename A>
+void fill_to_capacity(std::vector<T, A>& vec) {
+ // Fill vec up to its capacity. Our bounded iterators currently unable to
+ // catch accesses between size and capacity due to iterator stability
+ // guarantees. This function clears those effects.
+ while (vec.size() < vec.capacity()) {
+ vec.push_back(T());
+ }
+}
+
int main(int, char**) {
{
typedef int T;
typedef std::vector<T> C;
C c(1);
+ fill_to_capacity(c);
C::iterator i = c.begin();
- ++i;
+ i += c.size();
assert(i == c.end());
- TEST_LIBCPP_ASSERT_FAILURE(++i, "Attempted to increment a non-incrementable iterator");
+ TEST_LIBCPP_ASSERT_FAILURE(++i, "__bounded_iter::operator++: Attempt to advance an iterator past the end");
}
{
typedef int T;
typedef std::vector<T, min_allocator<T> > C;
C c(1);
+ fill_to_capacity(c);
C::iterator i = c.begin();
- ++i;
+ i += c.size();
assert(i == c.end());
- TEST_LIBCPP_ASSERT_FAILURE(++i, "Attempted to increment a non-incrementable iterator");
+ TEST_LIBCPP_ASSERT_FAILURE(++i, "__bounded_iter::operator++: Attempt to advance an iterator past the end");
}
return 0;
diff --git a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.index.pass.cpp b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.index.pass.cpp
index 8e8f6a5dae69d9d..4f3b49b69b3e8d6 100644
--- a/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.index.pass.cpp
+++ b/libcxx/test/libcxx/containers/sequences/vector/debug.iterator.index.pass.cpp
@@ -10,8 +10,8 @@
// Index iterator out of bounds.
-// REQUIRES: has-unix-headers
-// UNSUPPORTED: !libcpp-has-legacy-debug-mode, c++03
+// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators
+// UNSUPPORTED: libcpp-hardening-mode=none, c++03
#include <vector>
#include <cassert>
@@ -19,23 +19,41 @@
#include "check_assertion.h"
#include "m...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
acb2ee0
to
9290806
Compare
Ran into an interesting case. While this PR doesn't trip it, I looked at adding iterator compatibility checks for #78771 (comment) and that tripped one of libc++'s existing This test is only valid if we take the position that iterators remain valid when appending up to capacity: llvm-project/libcxx/test/std/strings/basic.string/string.modifiers/string_insert/iter_char.pass.cpp Lines 31 to 32 in 380ac53
However, if I'm reading the spec correctly, I don't think that is guaranteed. Yet this test is under "std" and not "libcxx". Thoughts? Should we bound the string iterator to the capacity too? |
Another thing I realized: the trick of using the capacity will slightly impede optimizability of range-for loops. Because that will expand the code into:
In order for the compiler to delete those assertions, it needs to know that
|
libc++ turned off _LIBCPP_ASSUME because turning every debug assert into __builtin_assume tripped https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609 However, this means we can't use _LIBCPP_ASSUME when there is a clear optimization intent. See llvm#78929 (comment) for discussion of a place where _LIBCPP_ASSUME would be valuable. Go ahead and fix this now, and adjust the comments. I don't think we want _LIBCPP_ASSUME in disabled asserts anyway, at least not in any of the hardened modes. This PR should have no impact on libc++ right now, since _LIBCPP_ASSUME is currently never called standalone, but I figure I can do this as a standalone PR since it's pretty straightforward.
This partially restores parity with the old, since removed debug build. We now can re-enable a bunch of the disabled tests. Some things of note: - bounded_iter's converting constructor has never worked. It needs a friend declaration to access the other bound_iter instantiation's private fields. - The old debug iterators also checked that callers did not try to compare iterators from different objects. bounded_iter does not currently do this, so I've left those disabled. However, I think we probably should add those. See llvm#78771 (comment) - The std::vector iterators are bounded up to capacity, not size. This makes for a weaker safety check. This is because the STL promises not to invalidate iterators when appending up to the capacity. Since we cannot retroactively update all the iterators on push_back(), I've instead sized it to the capacaity. This is not as good, but at least will stop the iterator from going off the end of the buffer. There was also no test for this, so I've added one in the std directory. - std::string has two ambiguities to deal with. First, I opted not to size it against the capacity. https://eel.is/c++draft/string.require#4 says iterators are invalided on an non-const operation. Second, whether the iterator can reach the NUL terminator. The previous debug tests and the special-case in https://eel.is/c++draft/string.access#2 suggest no. If either of these causes widespread problems, I figure we can revisit. - resize_and_overwrite.pass.cpp assumed std::string's iterator supported s.begin().base(), but I see no promise of this in the standard. GCC also doesn't support this. I fixed the test to use std::to_address. - alignof.compile.pass.cpp's pointer isn't enough of a real pointer. (It needs to satisfy NullablePointer, LegacyRandomAccessIterator, and LegacyContiguousIterator.) __bounded_iter seems to instantiate enough to notice. I've added a few more bits to satisfy it. Fixes llvm#78805
9290806
to
2990385
Compare
This is now done. Two remaining open questions, other than generally needing review:
For (1), I'm leaning towards leaving it as-is and seeing it it causes problems. If I'm reading the spec right, we're not obligated to keep that working, and the tighter bounds will prevent more bugs. For (2), I propose we land #91801 and then, in @var-const, WDYT? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few comments but I think the approach is fine. If you are unable to update this before next Tuesday, please let me know and we will apply the review comments. It would be nice to get this in LLVM 19.
libcxx/include/string
Outdated
@@ -786,9 +787,16 @@ public: | |||
"[allocator.requirements] states that rebinding an allocator to the same type should result in the " | |||
"original allocator"); | |||
|
|||
// TODO: Implement iterator bounds checking without requiring the global database. | |||
#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS | |
#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS_IN_STRING |
I think we should introduce this under its own ABI macro, and the same for vector. And we should rename the existing macro to explicitly mention the containers they enable. I think that's the only way to pretend that these ABI macros yield a somewhat stable ABI.
In this patch, please introduce _LIBCPP_ABI_BOUNDED_ITERATORS_IN_STRING
and _LIBCPP_ABI_BOUNDED_ITERATORS_IN_VECTOR
and I can handle the related cleanup as a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (leaving the comment open as a reminder that we need to do some further cleanup).
libcxx/test/libcxx/containers/sequences/vector/debug.iterator.add.pass.cpp
Show resolved
Hide resolved
Alas the IETF meeting is next week so I probably won't have time for a bit. If you've got the cycles, go for it! Thanks! |
- create dedicated ABI macros `_LIBCPP_ABI_BOUNDED_ITERATORS_IN_STRING` and `_LIBCPP_ABI_BOUNDED_ITERATORS_IN_VECTOR` (and the associated Lit features, etc.); - add a release note and update the hardening documentation; - add a comment in `<string>` to mention that its bounded iterators use the size rather than the capacity; - reword and slightly expand the comment in `<vector>` explaining the use of the capacity rather than the size; - tweak the comment that justifies the use of `__wrap_iter`; - in `string`, move the conditional logic from outside to inside the functions' scope; - move the repeated `fill_to_capacity` function to a separate helper header and slightly reword the comment; - make `push_back.invalidation.pass.cpp` a death test and add a comment on some limitations of what a bounded iterator can currently detect; - fix another `small_pointer` helper class to fix compilation.
@davidben Awesome, thanks for the heads-up! I have applied most of the review feedback (plus some of the feedback I had). If you have the time, please take a look and let us know if there's anything you'd like to change/push back on; otherwise, I plan to merge by the Tuesday branch cutoff. |
Feedback all looks reasonable to me. Thanks! |
(I think everything I planned to do is uploaded now, I plan to merge before tomorrow's cutoff time if no objections) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidben Thank you so much for working on this! (And my sincere apologies for the very slow review -- I wanted to get to this for a while but unfortunately had my plate very full recently) LGTM.
My biggest concern starting this review was whether accepting this lightweight approach would prevent us from adding heavier iterators in the future that would retain a connection with the container and thus could also catch temporal safety issues, such as invalidation. That would bring the current incarnation of hardening on par with the historical debug mode. However, since we decided to create separate ABI macros for these new iterators, this is no longer an issue. While it's a tradeoff (having several independent ABI configuration options creates many possible combinations, increasing complexity), I believe it's probably an inevitable outcome -- for one, I don't think it's reasonable to force anyone using the current bounded-iterators
macro to start using bounded iterators in their vectors and strings, especially with no targeted opt-out possible, but also I think there might be legitimate contexts where someone might want bounded iterators in some, but not all, containers. The ability to choose (in the future, potentially) between more and less lightweight iterator hardening strategies follows quite nicely from this, kind of mirroring the distinction between the fast
and extensive
modes.
The idea to bound the vector iterator by capacity has grown on me. It is slightly unfortunate that there is a discrepancy between how checking is performed for strings versus vectors, but it seems like a necessary evil -- the only way to reconsolidate would be to make string checks laxer. Checking just the "physical" bounds prevents the worst vulnerabilities, so I believe we're still getting the most valuable part of the checks; I'd also note that IIUC, this is how Asan checks work for contiguous containers that are not specially annotated for the sanitizer.
I think that even without covering iterator invalidation (which, being a temporal safety issue, is much harder to solve in general anyway), this patch provides a noticeable improvement in the amount of checks we provide with little added code complexity, and the fact that the bound iterators are comparatively lightweight can be an advantage for some contexts.
…ing (#78929) ~~NB: This PR depends on #78876. Ignore the first commit when reviewing, and don't merge it until #78876 is resolved. When/if #78876 lands, I'll clean this up.~~ This partially restores parity with the old, since removed debug build. We now can re-enable a bunch of the disabled tests. Some things of note: - `bounded_iter`'s converting constructor has never worked. It needs a friend declaration to access the other `bound_iter` instantiation's private fields. - The old debug iterators also checked that callers did not try to compare iterators from different objects. `bounded_iter` does not currently do this, so I've left those disabled. However, I think we probably should add those. See #78771 (comment) - The `std::vector` iterators are bounded up to capacity, not size. This makes for a weaker safety check. This is because the STL promises not to invalidate iterators when appending up to the capacity. Since we cannot retroactively update all the iterators on `push_back()`, I've instead sized it to the capacity. This is not as good, but at least will stop the iterator from going off the end of the buffer. There was also no test for this, so I've added one in the `std` directory. - `std::string` has two ambiguities to deal with. First, I opted not to size it against the capacity. https://eel.is/c++draft/string.require#4 says iterators are invalidated on an non-const operation. Second, whether the iterator can reach the NUL terminator. The previous debug tests and the special-case in https://eel.is/c++draft/string.access#2 suggest no. If either of these causes widespread problems, I figure we can revisit. - `resize_and_overwrite.pass.cpp` assumed `std::string`'s iterator supported `s.begin().base()`, but I see no promise of this in the standard. GCC also doesn't support this. I fixed the test to use `std::to_address`. - `alignof.compile.pass.cpp`'s pointer isn't enough of a real pointer. (It needs to satisfy `NullablePointer`, `LegacyRandomAccessIterator`, and `LegacyContiguousIterator`.) `__bounded_iter` seems to instantiate enough to notice. I've added a few more bits to satisfy it. Fixes #78805
libc++ turned off _LIBCPP_ASSUME because turning every debug assert into __builtin_assume tripped https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609 However, this means we can't use _LIBCPP_ASSUME when there is a clear optimization intent. See llvm#78929 (comment) for discussion of a place where _LIBCPP_ASSUME would be valuable. Go ahead and fix this now, and adjust the comments. I don't think we want _LIBCPP_ASSUME in disabled asserts anyway, at least not in any of the hardened modes. This PR should have no impact on libc++ right now, since _LIBCPP_ASSUME is currently never called standalone, but I figure I can do this as a standalone PR since it's pretty straightforward.
libc++ turned off _LIBCPP_ASSUME because turning every debug assert into __builtin_assume tripped [1]. However, this means we can't use _LIBCPP_ASSUME when there is a clear optimization intent. See [2] for discussion of a place where _LIBCPP_ASSUME would be valuable. This patch fixes this by not undefining the definition of _LIBCPP_ASSUME and making sure that we don't attempt to `_LIBCPP_ASSSUME` every assertion in the library. [1]: https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609 [2]: #78929 (comment)
libc++ turned off _LIBCPP_ASSUME because turning every debug assert into __builtin_assume tripped [1]. However, this means we can't use _LIBCPP_ASSUME when there is a clear optimization intent. See [2] for discussion of a place where _LIBCPP_ASSUME would be valuable. This patch fixes this by not undefining the definition of _LIBCPP_ASSUME and making sure that we don't attempt to `_LIBCPP_ASSSUME` every assertion in the library. [1]: https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609 [2]: llvm#78929 (comment)
NB: This PR depends on #78876. Ignore the first commit when reviewing, and don't merge it until #78876 is resolved. When/if #78876 lands, I'll clean this up.This partially restores parity with the old, since removed debug build. We now can re-enable a bunch of the disabled tests. Some things of note:
bounded_iter
's converting constructor has never worked. It needs a friend declaration to access the otherbound_iter
instantiation's private fields.The old debug iterators also checked that callers did not try to compare iterators from different objects.
bounded_iter
does not currently do this, so I've left those disabled. However, I think we probably should add those. See std::copy and others bypass _LIBCPP_ABI_BOUNDED_ITERATORS and other hardened iterators #78771 (comment)The
std::vector
iterators are bounded up to capacity, not size. This makes for a weaker safety check. This is because the STL promises not to invalidate iterators when appending up to the capacity. Since we cannot retroactively update all the iterators onpush_back()
, I've instead sized it to the capacity. This is not as good, but at least will stop the iterator from going off the end of the buffer.There was also no test for this, so I've added one in the
std
directory.std::string
has two ambiguities to deal with. First, I opted not to size it against the capacity. https://eel.is/c++draft/string.require#4 says iterators are invalidated on an non-const operation. Second, whether the iterator can reach the NUL terminator. The previous debug tests and the special-case in https://eel.is/c++draft/string.access#2 suggest no. If either of these causes widespread problems, I figure we can revisit.resize_and_overwrite.pass.cpp
assumedstd::string
's iterator supporteds.begin().base()
, but I see no promise of this in the standard. GCC also doesn't support this. I fixed the test to usestd::to_address
.alignof.compile.pass.cpp
's pointer isn't enough of a real pointer. (It needs to satisfyNullablePointer
,LegacyRandomAccessIterator
, andLegacyContiguousIterator
.)__bounded_iter
seems to instantiate enough to notice. I've added a few more bits to satisfy it.Fixes #78805