Skip to content

[libc++] Simplify vector<bool>::__construct_at_end (Reopend) #119632

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 29, 2025

Conversation

winner245
Copy link
Contributor

@winner245 winner245 commented Dec 11, 2024

This PR aims to simplify the implementation of __construct_at_end in vector<bool>, which currently contains duplicate initialization logic across its two overloads.

size_type __old_size = this->__size_;
this->__size_ += __n;
if (__old_size == 0 || ((__old_size - 1) / __bits_per_word) != ((this->__size_ - 1) / __bits_per_word)) {
if (this->__size_ <= __bits_per_word)
this->__begin_[0] = __storage_type(0);
else
this->__begin_[(this->__size_ - 1) / __bits_per_word] = __storage_type(0);
}

size_type __old_size = this->__size_;
this->__size_ += __n;
if (__old_size == 0 || ((__old_size - 1) / __bits_per_word) != ((this->__size_ - 1) / __bits_per_word)) {
if (this->__size_ <= __bits_per_word)
this->__begin_[0] = __storage_type(0);
else
this->__begin_[(this->__size_ - 1) / __bits_per_word] = __storage_type(0);
}

In my initial attempt, I sought to simplify the code by completely removing the duplicate initialization logic. However, thanks to @philnik777's review, it was pointed out that this approach introduced uninitialized loads and undefined behavior, resulting in the closure of the PR.

I am now reopening this PR with a refined approach. I have simplified the initialization logic into a more compact and readable form, while maintaining equivalent functionality. This not only simplifies the implementation but also eliminates code duplication. Additionally, this refactoring addresses a call to __construct_at_end in reserve, ensuring it meets the precondition of __n > 0, as required by __construct_at_end(, , __n).

@winner245 winner245 marked this pull request as ready for review December 12, 2024 14:23
@winner245 winner245 requested a review from a team as a code owner December 12, 2024 14:23
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Dec 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 12, 2024

@llvm/pr-subscribers-libcxx

Author: Peng Liu (winner245)

Changes

Following #119502, I realized that vector&lt;bool&gt;::__construct_at_end performs unnecessary work. This PR proposes removing the redundant initialization of storage bits in both overloads of __construct_at_end. This function serves as a useful helper for various constructors, assign, and reserve methods of the class. The specific code segment in question is:

if (__old_size == 0 || ((__old_size - 1) / __bits_per_word) != ((this-&gt;__size_ - 1) / __bits_per_word)) {
  if (this-&gt;__size_ &lt;= __bits_per_word)
    this-&gt;__begin_[0] = __storage_type(0);
  else
    this-&gt;__begin_[(this-&gt;__size_ - 1) / __bits_per_word] = __storage_type(0);
}


```cpp
if (__old_size == 0 || ((__old_size - 1) / __bits_per_word) != ((this-&gt;__size_ - 1) / __bits_per_word)) {
  if (this-&gt;__size_ &lt;= __bits_per_word)
    this-&gt;__begin_[0] = __storage_type(0);
  else
    this-&gt;__begin_[(this-&gt;__size_ - 1) / __bits_per_word] = __storage_type(0);
}

Rationale for Removal:
The purpose of the above code is to set every bit in the last storage word of __storage_type to 0 when the new size crosses the boundary of the current last storage word. However, setting the raw bits to 0 before an immediate assignment is completely unnecessary, as explained below:

  • Redundancy: The initialization of the last storage word to zero is immediately overwritten by the subsequent call to std::fill_n. Therefore, even without this step, the uninitialized bits in the range [0, __size_) will be correctly set by the std::fill_n call, ensuring that all bits of interest are properly set.

  • Safety against out-of-bounds access: The std::fill_n function itself operates within the valid range of the vector. Leaving the bits in the last storage word unset does not introduce any risk, as they remain out of bounds for vector&lt;bool&gt;.

The existing comprehensive tests for the constructors, assign, and reserve functions continue to pass successfully after the removal of the above code, thereby validating the changes in this PR.


Full diff: https://github.com/llvm/llvm-project/pull/119632.diff

1 Files Affected:

  • (modified) libcxx/include/__vector/vector_bool.h (+14-35)
diff --git a/libcxx/include/__vector/vector_bool.h b/libcxx/include/__vector/vector_bool.h
index 36eb7f350ac406..bc3411c4d97791 100644
--- a/libcxx/include/__vector/vector_bool.h
+++ b/libcxx/include/__vector/vector_bool.h
@@ -438,10 +438,22 @@ class _LIBCPP_TEMPLATE_VIS vector<bool, _Allocator> {
     return (__new_size + (__bits_per_word - 1)) & ~((size_type)__bits_per_word - 1);
   }
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 size_type __recommend(size_type __new_size) const;
-  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __construct_at_end(size_type __n, bool __x);
+
+  //  Default constructs __n objects starting at __end_
+  //  Precondition:  __n > 0
+  //  Precondition:  size() + __n <= capacity()
+  //  Postcondition:  size() == size() + __n
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __construct_at_end(size_type __n, bool __x) {
+    std::fill_n(end(), __n, __x);
+    this->__size_ += __n;
+  }
   template <class _InputIterator, class _Sentinel>
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void
-  __construct_at_end(_InputIterator __first, _Sentinel __last, size_type __n);
+  __construct_at_end(_InputIterator __first, _Sentinel __last, size_type __n) {
+    std::__copy(__first, __last, end());
+    this->__size_ += __n;
+  }
+
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __append(size_type __n, const_reference __x);
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 reference __make_ref(size_type __pos) _NOEXCEPT {
     return reference(__begin_ + __pos / __bits_per_word, __storage_type(1) << __pos % __bits_per_word);
@@ -530,39 +542,6 @@ vector<bool, _Allocator>::__recommend(size_type __new_size) const {
   return std::max(2 * __cap, __align_it(__new_size));
 }
 
-//  Default constructs __n objects starting at __end_
-//  Precondition:  __n > 0
-//  Precondition:  size() + __n <= capacity()
-//  Postcondition:  size() == size() + __n
-template <class _Allocator>
-inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void
-vector<bool, _Allocator>::__construct_at_end(size_type __n, bool __x) {
-  size_type __old_size = this->__size_;
-  this->__size_ += __n;
-  if (__old_size == 0 || ((__old_size - 1) / __bits_per_word) != ((this->__size_ - 1) / __bits_per_word)) {
-    if (this->__size_ <= __bits_per_word)
-      this->__begin_[0] = __storage_type(0);
-    else
-      this->__begin_[(this->__size_ - 1) / __bits_per_word] = __storage_type(0);
-  }
-  std::fill_n(__make_iter(__old_size), __n, __x);
-}
-
-template <class _Allocator>
-template <class _InputIterator, class _Sentinel>
-_LIBCPP_CONSTEXPR_SINCE_CXX20 void
-vector<bool, _Allocator>::__construct_at_end(_InputIterator __first, _Sentinel __last, size_type __n) {
-  size_type __old_size = this->__size_;
-  this->__size_ += __n;
-  if (__old_size == 0 || ((__old_size - 1) / __bits_per_word) != ((this->__size_ - 1) / __bits_per_word)) {
-    if (this->__size_ <= __bits_per_word)
-      this->__begin_[0] = __storage_type(0);
-    else
-      this->__begin_[(this->__size_ - 1) / __bits_per_word] = __storage_type(0);
-  }
-  std::__copy(__first, __last, __make_iter(__old_size));
-}
-
 template <class _Allocator>
 inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 vector<bool, _Allocator>::vector()
     _NOEXCEPT_(is_nothrow_default_constructible<allocator_type>::value)

@philnik777
Copy link
Contributor

This looks to me like you're introducing an uninitialized load, and the MSan CI seems to agree with me.

@winner245
Copy link
Contributor Author

You are right. I am just about to close.

@winner245 winner245 closed this Dec 12, 2024
@winner245 winner245 reopened this Jan 5, 2025
@winner245 winner245 marked this pull request as draft January 5, 2025 18:19
@winner245 winner245 force-pushed the construct_at_end branch 3 times, most recently from 723f953 to 83451c6 Compare January 5, 2025 21:37
@winner245 winner245 changed the title [libc++] Simplify vector<bool>::__construct_at_end implementation [libc++] Simplify vector<bool>::__construct_at_end (Reopend) Jan 5, 2025
@winner245 winner245 marked this pull request as ready for review January 5, 2025 23:54
Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with your new proposed implementation.

@winner245 winner245 force-pushed the construct_at_end branch 3 times, most recently from 78e1780 to e3a0995 Compare January 24, 2025 16:17
@ldionne ldionne merged commit 67752f6 into llvm:main Jan 29, 2025
77 checks passed
@winner245 winner245 deleted the construct_at_end branch January 29, 2025 17:45
Comment on lines +564 to +565
if (end().__ctz_ != 0) // Ensure uninitialized leading bits in the last word are set to zero
std::fill_n(end(), __bits_per_word - end().__ctz_, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ensures that the unused leading bits in the last word are initialized, which serves the same purpose as the pre-initialization logic in the previous implementation:

size_type __old_size = this->__size_;
this->__size_ += __n;
if (__old_size == 0 || ((__old_size - 1) / __bits_per_word) != ((this->__size_ - 1) / __bits_per_word)) {
if (this->__size_ <= __bits_per_word)
this->__begin_[0] = __storage_type(0);
else
this->__begin_[(this->__size_ - 1) / __bits_per_word] = __storage_type(0);
}

Without this, we would have a Msan CI failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how is the previous fill_n not reading uninitialized elements? It has to load the bits of the last word after all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In constant evaluation, we have no uninitialized bits at all because __vallocate guarantees that the allocated memory is default-constructed. In non-constant evaluation, we effectively only reads from bits within the bounds of vector<bool> since we utilize bit masks. So in this case, the purpose of the initialization is to make sanitizer tools happy. According to my test, sanitizers do not complain as long as the unused bits in the last word are initialized to 0 within __construct_at_end, regardless of whether this occurs before or after the initialization of other words. However, without the initialization of unused bits, subsequent read from vector<bool> would trigger a Msan CI failure.

}
std::fill_n(__make_iter(__old_size), __n, __x);
if (end().__ctz_ != 0) // Ensure uninitialized leading bits in the last word are set to zero
std::fill_n(end(), __bits_per_word - end().__ctz_, 0);
}

template <class _Allocator>
template <class _InputIterator, class _Sentinel>
_LIBCPP_CONSTEXPR_SINCE_CXX20 void
vector<bool, _Allocator>::__construct_at_end(_InputIterator __first, _Sentinel __last, size_type __n) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test for this function where _InputIterator isn't a vector<bool> iterator? e.g. bool*. I don't see how the elements are initialized which makes me think that we may read from uninitialized memory when the __bit_iterator optimization in std::copy doesn't kick in. I don't think this is a problem in reality, since we're setting the bits one-by-one, but could cause some sanitizer to complain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your concern here. That's why I explicitly check if we have uninitialized bits, which could only exist in the last partial word. Previously, we pre-initialize the entire last word, whether it is a partial word or not. This patch implements it in a more compact form which initializes the bits to zero only when the last word is a partial word.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have basically the same question here: How does the __copy avoid the uninitialized read?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants