Skip to content

Ranged-for loop with _LIBCPP_ABI_BOUNDED_ITERATORS does not optimize runtime checks #78829

Closed
@davidben

Description

@davidben

When enabling bounded iterators, it seems Clang cannot optimize them out of a ranged-for loop. Here's a simple function that just sums over a span with a ranged-for loop, sum1, as well as a corresponding hand-written iterator loop, sum2.
https://godbolt.org/z/WqsPKdEoa

The left output, without _LIBCPP_ABI_BOUNDED_ITERATORS has no runtime checks, and Clang is even able to vectorize the loop. The right output, with _LIBCPP_ABI_BOUNDED_ITERATORS contains runtime checks with, in turn, impede the optimization.

Cause

I teased apart what __bounded_iter was doing. This is the assertion in question:

  // Return whether the given iterator is in the bounds of this __bounded_iter.
  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __in_bounds(_Iterator const& __iter) const {
    return __iter >= __begin_ && __iter < __end_;
  }

I've extracted that into a minimal sum3 function (same godbolt link), and that seems to reveal the problem. While Clang can prove that iter >= begin, it can't prove that iter < end. I suspect this is because iterator loops look like iter != end, not iter < end. In order to prove that iter < end, it must also know that begin <= end. Otherwise it's possible that iter was ahead of end to start with.

Indeed, if I add that assertion to the top, in sum4 (same godbolt again), Clang pays for a different undesirable runtime check outside the loop, but then figures it out! So fixing #78784 may have further benefits, if the analysis is extended further.

Alternative fix

Another way Clang could have figured it out is if we used iter != end instead of iter < end for the safety assertion. That's actually what Chromium's checked iterator does. However, libc++ cannot do this because it doesn't check operator+, etc., only operator*. It is possible for __current_ to overflow well past __end_ before we do the check.

I think libc++ should bounds-check operator+ too. First, once __current_ has advanced past __end_, we may have already hit undefined behavior because pointer arithmetic is locked to the allocation. So the check in operator* may be too late anyway. Second, #78771 describes a place where libc++'s memmove specialization breaks its own hardening checks! Since memmove specialization is pretty valuable, I think it's best to fix #78771 by bounds-checking operator+ and making sure the specialization takes triggers it early enough.

Now, once operator+ is bounds-checked, __bounded_iter can rely on __begin_ <= __current_ <= __end_. That means the safety check need only be __current_ != __end_, which should be more easily optimized out.

(CC @danakj)

Metadata

Metadata

Assignees

No one assigned

    Labels

    hardeningIssues related to the hardening effortlibc++libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.performance

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions