-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[libc++][hardening] Check bounds on arithmetic in __bounded_iter #78876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
49d9168
89e4ffc
007ff76
d38a00d
17cfcf1
c49a028
f9812d3
d65e50d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,13 +31,20 @@ _LIBCPP_BEGIN_NAMESPACE_STD | |
// Iterator wrapper that carries the valid range it is allowed to access. | ||
// | ||
// This is a simple iterator wrapper for contiguous iterators that points | ||
// within a [begin, end) range and carries these bounds with it. The iterator | ||
// ensures that it is pointing within that [begin, end) range when it is | ||
// dereferenced. | ||
// within a [begin, end] range and carries these bounds with it. The iterator | ||
// ensures that it is pointing within [begin, end) range when it is | ||
// dereferenced. It also ensures that it is never iterated outside of | ||
// [begin, end]. This is important for two reasons: | ||
// | ||
// Arithmetic operations are allowed and the bounds of the resulting iterator | ||
// are not checked. Hence, it is possible to create an iterator pointing outside | ||
// its range, but it is not possible to dereference it. | ||
// 1. It allows `operator*` and `operator++` bounds checks to be `iter != end`. | ||
// This is both less for the optimizer to prove, and aligns with how callers | ||
// typically use iterators. | ||
// | ||
// 2. Advancing an iterator out of bounds is undefined behavior (see the table | ||
// in [input.iterators]). In particular, when the underlying iterator is a | ||
// pointer, it is undefined at the language level (see [expr.add]). If | ||
// bounded iterators exhibited this undefined behavior, we risk compiler | ||
// optimizations deleting non-redundant bounds checks. | ||
template <class _Iterator, class = __enable_if_t< __libcpp_is_contiguous_iterator<_Iterator>::value > > | ||
struct __bounded_iter { | ||
using value_type = typename iterator_traits<_Iterator>::value_type; | ||
|
@@ -51,8 +58,8 @@ struct __bounded_iter { | |
|
||
// Create a singular iterator. | ||
// | ||
// Such an iterator does not point to any object and is conceptually out of bounds, so it is | ||
// not dereferenceable. Observing operations like comparison and assignment are valid. | ||
// Such an iterator points past the end of an empty span, so it is not dereferenceable. | ||
// Observing operations like comparison and assignment are valid. | ||
_LIBCPP_HIDE_FROM_ABI __bounded_iter() = default; | ||
|
||
_LIBCPP_HIDE_FROM_ABI __bounded_iter(__bounded_iter const&) = default; | ||
|
@@ -70,18 +77,20 @@ struct __bounded_iter { | |
|
||
private: | ||
// Create an iterator wrapping the given iterator, and whose bounds are described | ||
// by the provided [begin, end) range. | ||
// by the provided [begin, end] range. | ||
// | ||
davidben marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// This constructor does not check whether the resulting iterator is within its bounds. | ||
// However, it does check that the provided [begin, end) range is a valid range (that | ||
// is, begin <= end). | ||
// The constructor does not check whether the resulting iterator is within its bounds. It is a | ||
// responsibility of the container to ensure that the given bounds are valid. | ||
// | ||
// Since it is non-standard for iterators to have this constructor, __bounded_iter must | ||
// be created via `std::__make_bounded_iter`. | ||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 explicit __bounded_iter( | ||
_Iterator __current, _Iterator __begin, _Iterator __end) | ||
: __current_(__current), __begin_(__begin), __end_(__end) { | ||
_LIBCPP_ASSERT_INTERNAL(__begin <= __end, "__bounded_iter(current, begin, end): [begin, end) is not a valid range"); | ||
_LIBCPP_ASSERT_INTERNAL( | ||
__begin <= __current, "__bounded_iter(current, begin, end): current and begin are inconsistent"); | ||
_LIBCPP_ASSERT_INTERNAL( | ||
__current <= __end, "__bounded_iter(current, begin, end): current and end are inconsistent"); | ||
} | ||
|
||
template <class _It> | ||
|
@@ -90,30 +99,37 @@ struct __bounded_iter { | |
public: | ||
// Dereference and indexing operations. | ||
// | ||
// These operations check that the iterator is dereferenceable, that is within [begin, end). | ||
// These operations check that the iterator is dereferenceable. Since the class invariant is | ||
// that the iterator is always within `[begin, end]`, we only need to check it's not pointing to | ||
// `end`. This is easier for the optimizer because it aligns with the `iter != container.end()` | ||
// checks that typical callers already use (see | ||
// https://github.com/llvm/llvm-project/issues/78829). | ||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator*() const _NOEXCEPT { | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__in_bounds(__current_), "__bounded_iter::operator*: Attempt to dereference an out-of-range iterator"); | ||
__current_ != __end_, "__bounded_iter::operator*: Attempt to dereference an iterator at the end"); | ||
return *__current_; | ||
} | ||
|
||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 pointer operator->() const _NOEXCEPT { | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__in_bounds(__current_), "__bounded_iter::operator->: Attempt to dereference an out-of-range iterator"); | ||
__current_ != __end_, "__bounded_iter::operator->: Attempt to dereference an iterator at the end"); | ||
return std::__to_address(__current_); | ||
} | ||
|
||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 reference operator[](difference_type __n) const _NOEXCEPT { | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__in_bounds(__current_ + __n), "__bounded_iter::operator[]: Attempt to index an iterator out-of-range"); | ||
__n >= __begin_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator past the start"); | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__n < __end_ - __current_, "__bounded_iter::operator[]: Attempt to index an iterator at or past the end"); | ||
return __current_[__n]; | ||
} | ||
|
||
// Arithmetic operations. | ||
// | ||
// These operations do not check that the resulting iterator is within the bounds, since that | ||
// would make it impossible to create a past-the-end iterator. | ||
// These operations check that the iterator remains within `[begin, end]`. | ||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator++() _NOEXCEPT { | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__current_ != __end_, "__bounded_iter::operator++: Attempt to advance an iterator past the end"); | ||
++__current_; | ||
return *this; | ||
} | ||
|
@@ -124,6 +140,8 @@ struct __bounded_iter { | |
} | ||
|
||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator--() _NOEXCEPT { | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__current_ != __begin_, "__bounded_iter::operator--: Attempt to rewind an iterator past the start"); | ||
--__current_; | ||
return *this; | ||
} | ||
|
@@ -134,6 +152,10 @@ struct __bounded_iter { | |
} | ||
|
||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator+=(difference_type __n) _NOEXCEPT { | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__n >= __begin_ - __current_, "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"); | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__n <= __end_ - __current_, "__bounded_iter::operator+=: Attempt to advance an iterator past the end"); | ||
__current_ += __n; | ||
return *this; | ||
} | ||
|
@@ -151,6 +173,10 @@ struct __bounded_iter { | |
} | ||
|
||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __bounded_iter& operator-=(difference_type __n) _NOEXCEPT { | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would there be any value in implementing this in terms of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem is that requires negating |
||
__n <= __current_ - __begin_, "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"); | ||
_LIBCPP_ASSERT_VALID_ELEMENT_ACCESS( | ||
__n >= __current_ - __end_, "__bounded_iter::operator-=: Attempt to advance an iterator past the end"); | ||
__current_ -= __n; | ||
return *this; | ||
} | ||
|
@@ -197,15 +223,10 @@ struct __bounded_iter { | |
} | ||
|
||
private: | ||
// Return whether the given iterator is in the bounds of this __bounded_iter. | ||
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __in_bounds(_Iterator const& __iter) const { | ||
return __iter >= __begin_ && __iter < __end_; | ||
} | ||
|
||
template <class> | ||
friend struct pointer_traits; | ||
_Iterator __current_; // current iterator | ||
_Iterator __begin_, __end_; // valid range represented as [begin, end) | ||
_Iterator __begin_, __end_; // valid range represented as [begin, end] | ||
}; | ||
|
||
template <class _It> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,174 @@ | ||
//===----------------------------------------------------------------------===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
// UNSUPPORTED: c++03, c++11, c++14, c++17 | ||
|
||
// Make sure that std::span's iterators check for OOB accesses when the debug mode is enabled. | ||
|
||
// REQUIRES: has-unix-headers, libcpp-has-abi-bounded-iterators | ||
// UNSUPPORTED: libcpp-hardening-mode=none | ||
|
||
#include <span> | ||
|
||
#include "check_assertion.h" | ||
|
||
struct Foo { | ||
int x; | ||
}; | ||
|
||
template <typename Iter> | ||
void test_iterator(Iter begin, Iter end, bool reverse) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The errors produced by a reverse iterator can be a bit confusing for the user. @ldionne Do you think this is worth the trouble to try to fix? I suppose we could implement a separate checked reverse iterator type, but perhaps that's overkill. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Personally I think the test is fine as-is, those are implementation specific anyway, so this is good enough (IMO). |
||
std::ptrdiff_t distance = std::distance(begin, end); | ||
|
||
// Dereferencing an iterator at the end. | ||
{ | ||
TEST_LIBCPP_ASSERT_FAILURE( | ||
*end, | ||
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator*: Attempt to dereference an iterator at the end"); | ||
#if _LIBCPP_STD_VER >= 20 | ||
// In C++20 mode, std::reverse_iterator implements operator->, but not operator*, with | ||
// std::prev instead of operator--. std::prev ultimately calls operator+ | ||
TEST_LIBCPP_ASSERT_FAILURE( | ||
end->x, | ||
reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator->: Attempt to dereference an iterator at the end"); | ||
#else | ||
TEST_LIBCPP_ASSERT_FAILURE( | ||
end->x, | ||
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator->: Attempt to dereference an iterator at the end"); | ||
#endif | ||
} | ||
|
||
// Incrementing an iterator past the end. | ||
{ | ||
[[maybe_unused]] const char* msg = | ||
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator++: Attempt to advance an iterator past the end"; | ||
auto it = end; | ||
TEST_LIBCPP_ASSERT_FAILURE(it++, msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(++it, msg); | ||
} | ||
|
||
// Decrementing an iterator past the start. | ||
{ | ||
[[maybe_unused]] const char* msg = | ||
reverse ? "__bounded_iter::operator++: Attempt to advance an iterator past the end" | ||
: "__bounded_iter::operator--: Attempt to rewind an iterator past the start"; | ||
auto it = begin; | ||
TEST_LIBCPP_ASSERT_FAILURE(it--, msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(--it, msg); | ||
} | ||
|
||
// Advancing past the end with operator+= and operator+. | ||
{ | ||
[[maybe_unused]] const char* msg = | ||
reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator+=: Attempt to advance an iterator past the end"; | ||
auto it = end; | ||
TEST_LIBCPP_ASSERT_FAILURE(it += 1, msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(end + 1, msg); | ||
it = begin; | ||
TEST_LIBCPP_ASSERT_FAILURE(it += (distance + 1), msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(begin + (distance + 1), msg); | ||
} | ||
|
||
// Advancing past the end with operator-= and operator-. | ||
{ | ||
[[maybe_unused]] const char* msg = | ||
reverse ? "__bounded_iter::operator+=: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator-=: Attempt to advance an iterator past the end"; | ||
auto it = end; | ||
TEST_LIBCPP_ASSERT_FAILURE(it -= (-1), msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(end - (-1), msg); | ||
it = begin; | ||
TEST_LIBCPP_ASSERT_FAILURE(it -= (-distance - 1), msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(begin - (-distance - 1), msg); | ||
} | ||
|
||
// Rewinding past the start with operator+= and operator+. | ||
{ | ||
[[maybe_unused]] const char* msg = | ||
reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end" | ||
: "__bounded_iter::operator+=: Attempt to rewind an iterator past the start"; | ||
auto it = begin; | ||
TEST_LIBCPP_ASSERT_FAILURE(it += (-1), msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(begin + (-1), msg); | ||
it = end; | ||
TEST_LIBCPP_ASSERT_FAILURE(it += (-distance - 1), msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(end + (-distance - 1), msg); | ||
} | ||
|
||
// Rewinding past the start with operator-= and operator-. | ||
{ | ||
[[maybe_unused]] const char* msg = | ||
reverse ? "__bounded_iter::operator+=: Attempt to advance an iterator past the end" | ||
: "__bounded_iter::operator-=: Attempt to rewind an iterator past the start"; | ||
auto it = begin; | ||
TEST_LIBCPP_ASSERT_FAILURE(it -= 1, msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(begin - 1, msg); | ||
it = end; | ||
TEST_LIBCPP_ASSERT_FAILURE(it -= (distance + 1), msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(end - (distance + 1), msg); | ||
} | ||
|
||
// Out-of-bounds operator[]. | ||
{ | ||
[[maybe_unused]] const char* end_msg = | ||
reverse ? "__bounded_iter::operator--: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator[]: Attempt to index an iterator at or past the end"; | ||
[[maybe_unused]] const char* past_end_msg = | ||
reverse ? "__bounded_iter::operator-=: Attempt to rewind an iterator past the start" | ||
: "__bounded_iter::operator[]: Attempt to index an iterator at or past the end"; | ||
[[maybe_unused]] const char* past_start_msg = | ||
reverse ? "__bounded_iter::operator-=: Attempt to advance an iterator past the end" | ||
: "__bounded_iter::operator[]: Attempt to index an iterator past the start"; | ||
TEST_LIBCPP_ASSERT_FAILURE(begin[distance], end_msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(begin[distance + 1], past_end_msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(begin[-1], past_start_msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(begin[-99], past_start_msg); | ||
|
||
auto it = begin + 1; | ||
TEST_LIBCPP_ASSERT_FAILURE(it[distance - 1], end_msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(it[distance], past_end_msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(it[-2], past_start_msg); | ||
TEST_LIBCPP_ASSERT_FAILURE(it[-99], past_start_msg); | ||
} | ||
} | ||
|
||
int main(int, char**) { | ||
// span<T>::iterator | ||
{ | ||
Foo array[] = {{0}, {1}, {2}}; | ||
std::span<Foo> const span(array, 3); | ||
test_iterator(span.begin(), span.end(), /*reverse=*/false); | ||
} | ||
|
||
// span<T, N>::iterator | ||
{ | ||
Foo array[] = {{0}, {1}, {2}}; | ||
std::span<Foo, 3> const span(array, 3); | ||
test_iterator(span.begin(), span.end(), /*reverse=*/false); | ||
} | ||
|
||
// span<T>::reverse_iterator | ||
{ | ||
Foo array[] = {{0}, {1}, {2}}; | ||
std::span<Foo> const span(array, 3); | ||
test_iterator(span.rbegin(), span.rend(), /*reverse=*/true); | ||
} | ||
|
||
// span<T, N>::reverse_iterator | ||
{ | ||
Foo array[] = {{0}, {1}, {2}}; | ||
std::span<Foo, 3> const span(array, 3); | ||
test_iterator(span.rbegin(), span.rend(), /*reverse=*/true); | ||
} | ||
|
||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this patch: can we do it for any random access iterator? Is there any reason why we need a contiguous iterator?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would work with every random access iterator, in the sense of producing coherent behavior, but there are a couple of subtleties:
A general random access iterator (I'm thinking
std::map
) may have slightly different performance properties. In particular, I don't think we'd want to writebegin + n
in<algorithm>
withstd::map
because that'll end up walking the iterator a second time, and I'm not positive the first walk can be easily deleted. And since we don't unwrap the map iterators instd::copy
, it's not necessary to check ahead of time. We can just lean on the wrappedoperator++
check.Replacing
iter
withiter2 = bounded(begin, iter, end)
means thatiter2
becomes invalidated wheneverend
gets invalidated, not justiter
. That's fine for contiguous ranges because the end iterator is just another pointer into the allocation. But the more complex containers, irritatingly, have different invalidation rules forend
and general iterators! See Hardening checks should detect dereferencing map.end() and set.end() #80212 (comment) where I considered and then sadly had to reject basically this strategy forstd::map
.We may not need to store both
begin
andend
for a general random access iterator. For example, if we ignored the iterator invalidation problem above,std::map
only needs(iter, end)
.begin
is only used to stop an errantoperator--
, and we can already detect that from the tree structure itself. (The only reason we can't detectend
is that libc++ is extra clever and doesn't even haveparent
andright
pointers on the end node at all. So attempting to read them will immediately go OOB.)So my inclination would be to wait to see how we decide to solve the problem for non-contiguous random access and then make a decision about whether to generalize
__bounded_iter
then. My guess would be that we wouldn't end up wanting to reuse it anyway.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random access iterators are required to have O(1) complexity for
distance
and other operations likeit + n
, sostd::map::iterator
is bidirectional, but not random-access. I get your point about maybe not wanting to implement__bounded_iter
the same all the time, but I do think that the properties we use that hold for contiguous iterators also hold for random access iterators. Needs to be confirmed though.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I hadn't realized random access iterators were not the
std::map
ones! (I don't actually know the STL very well at the level of iterator categories.). Yeah, maybe it would be useful to generalize it? I guess we can see when we get to hardening some random-access, non-contiguous iterator. Since__bounded_iter
isn't public API or anything, we only need to care about what libc++'s own containers do.Either way, as you say, unrelated to this patch. 😁