
[libc++] Simplify the implementation of reserve() and shrink_to_fit() #113453


Merged
merged 1 commit into from
Nov 28, 2024

Conversation

philnik777
Contributor

Since we changed the implementation of reserve(size_type) to only ever extend,
it doesn't make a ton of sense anymore to have __shrink_or_extend, since the code
paths of reserve and shrink_to_fit are now almost completely separate.

This patch splits up __shrink_or_extend so that the individual parts are in reserve
and shrink_to_fit depending on where they are needed.
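
The separation described above is visible from the outside as two independent capacity guarantees. The following is a minimal sketch using only the public `std::string` interface; `CapacityChange`, `reserve_and_report`, and `shrink_and_report` are hypothetical helpers introduced for illustration and are not part of libc++ or of this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: capacity observed before and after a call.
struct CapacityChange {
  std::size_t before;
  std::size_t after;
};

// reserve(n) is the "extend" path: afterwards capacity() is at least n.
inline CapacityChange reserve_and_report(std::string& s, std::size_t n) {
  std::size_t before = s.capacity();
  s.reserve(n);
  return {before, s.capacity()};
}

// shrink_to_fit() is the "shrink" path: it is a non-binding request,
// but it must never increase capacity() and must keep the contents.
inline CapacityChange shrink_and_report(std::string& s) {
  std::size_t before = s.capacity();
  s.shrink_to_fit();
  return {before, s.capacity()};
}
```

Calling `reserve_and_report(s, 1000)` followed by `shrink_and_report(s)` on a short string exercises both paths; how far the capacity actually drops is implementation-defined, so the sketch only reports it.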


github-actions bot commented Oct 23, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff aaa0dd2f05ff957a171a87e78578dddc59fc49c2 34f59690a5135fe3b6075d83d6ba2cb219469660 --extensions  -- libcxx/include/string
View the diff from clang-format here.
diff --git a/libcxx/include/string b/libcxx/include/string
index cdc399361e..157ca6a640 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -3373,7 +3373,7 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::re
 
   __annotation_guard __g(*this);
   auto __allocation = std::__allocate_at_least(__alloc_, __recommend(__requested_capacity) + 1);
-  auto __size = size();
+  auto __size       = size();
   __begin_lifetime(__allocation.ptr, __allocation.count);
   traits_type::copy(std::__to_address(__allocation.ptr), data(), __size + 1);
   if (__is_long())

@philnik777 philnik777 marked this pull request as ready for review October 23, 2024 14:59
@philnik777 philnik777 requested a review from a team as a code owner October 23, 2024 14:59
@llvmbot llvmbot added the libc++ label (libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.) Oct 23, 2024
@llvmbot
Member

llvmbot commented Oct 23, 2024

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/113453.diff

1 Files Affected:

  • (modified) libcxx/include/string (+62-62)
diff --git a/libcxx/include/string b/libcxx/include/string
index 4b5017f5e7753f..55b174f4db987c 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -1874,8 +1874,6 @@ private:
   operator==(const basic_string<char, char_traits<char>, _Alloc>& __lhs,
              const basic_string<char, char_traits<char>, _Alloc>& __rhs) _NOEXCEPT;
 
-  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __shrink_or_extend(size_type __target_capacity);
-
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS bool
   __is_long() const _NOEXCEPT {
     if (__libcpp_is_constant_evaluated() && __builtin_constant_p(__rep_.__l.__is_long_)) {
@@ -2060,6 +2058,21 @@ private:
 #endif
   }
 
+  // Disable ASan annotations and enable them again when going out of scope. It is assumed that the string is in a valid
+  // state at that point, so `size()` can be called safely.
+  struct [[__nodiscard__]] __annotation_guard {
+    __annotation_guard(const __annotation_guard&) = delete;
+    __annotation_guard& operator=(const __annotation_guard&) = delete;
+
+    _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 __annotation_guard(basic_string& __str) : __str_(__str) {
+      __str_.__annotate_delete();
+    }
+
+    _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 ~__annotation_guard() { __str_.__annotate_new(__str_.size()); }
+
+    basic_string& __str_;
+  };
+
   template <size_type __a>
   static _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 size_type __align_it(size_type __s) _NOEXCEPT {
     return (__s + (__a - 1)) & ~(__a - 1);
@@ -3340,7 +3353,16 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::re
   if (__target_capacity == capacity())
     return;
 
-  __shrink_or_extend(__target_capacity);
+  __annotation_guard __g(*this);
+  auto __allocation = std::__allocate_at_least(__alloc_, __target_capacity + 1);
+  auto __size = size();
+  __begin_lifetime(__allocation.ptr, __allocation.count);
+  traits_type::copy(std::__to_address(__allocation.ptr), data(), __size + 1);
+  if (__is_long())
+    __alloc_traits::deallocate(__alloc_, __get_long_pointer(), __get_long_size() + 1);
+  __set_long_cap(__allocation.count);
+  __set_long_size(__size);
+  __set_long_pointer(__allocation.ptr);
 }
 
 template <class _CharT, class _Traits, class _Allocator>
@@ -3349,70 +3371,48 @@ inline _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocat
   if (__target_capacity == capacity())
     return;
 
-  __shrink_or_extend(__target_capacity);
-}
-
-template <class _CharT, class _Traits, class _Allocator>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX20 void
-basic_string<_CharT, _Traits, _Allocator>::__shrink_or_extend(size_type __target_capacity) {
-  __annotate_delete();
-  size_type __cap = capacity();
-  size_type __sz  = size();
-
-  pointer __new_data, __p;
-  bool __was_long, __now_long;
   if (__fits_in_sso(__target_capacity)) {
-    __was_long = true;
-    __now_long = false;
-    __new_data = __get_short_pointer();
-    __p        = __get_long_pointer();
-  } else {
-    if (__target_capacity > __cap) {
-      // Extend
-      // - called from reserve should propagate the exception thrown.
-      auto __allocation = std::__allocate_at_least(__alloc_, __target_capacity + 1);
-      __new_data        = __allocation.ptr;
-      __target_capacity = __allocation.count - 1;
-    } else {
-      // Shrink
-      // - called from shrink_to_fit should not throw.
-      // - called from reserve may throw but is not required to.
+    if (!__is_long())
+      return;
+    __annotation_guard __g(*this);
+    auto __ptr = __get_long_pointer();
+    auto __size = __get_long_size();
+    auto __cap = __get_long_cap();
+    traits_type::copy(std::__to_address(__get_short_pointer()), data(), __size + 1);
+    __alloc_traits::deallocate(__alloc_, __ptr, __cap);
+    __set_short_size(__size);
+    return;
+  }
+
+  // Shrink
+  // - called from shrink_to_fit should not throw.
+  // - called from reserve may throw but is not required to.
 #if _LIBCPP_HAS_EXCEPTIONS
-      try {
+  try {
 #endif // _LIBCPP_HAS_EXCEPTIONS
-        auto __allocation = std::__allocate_at_least(__alloc_, __target_capacity + 1);
-
-        // The Standard mandates shrink_to_fit() does not increase the capacity.
-        // With equal capacity keep the existing buffer. This avoids extra work
-        // due to swapping the elements.
-        if (__allocation.count - 1 > __target_capacity) {
-          __alloc_traits::deallocate(__alloc_, __allocation.ptr, __allocation.count);
-          __annotate_new(__sz); // Undoes the __annotate_delete()
-          return;
-        }
-        __new_data        = __allocation.ptr;
-        __target_capacity = __allocation.count - 1;
+    __annotation_guard __g(*this);
+    auto __size = size();
+    auto __allocation = std::__allocate_at_least(__alloc_, __target_capacity + 1);
+
+    // The Standard mandates shrink_to_fit() does not increase the capacity.
+    // With equal capacity keep the existing buffer. This avoids extra work
+    // due to swapping the elements.
+    if (__allocation.count - 1 > __target_capacity) {
+      __alloc_traits::deallocate(__alloc_, __allocation.ptr, __allocation.count);
+      __annotate_new(__size); // Undoes the __annotate_delete()
+      return;
+    }
+
+    __begin_lifetime(__allocation.ptr, __allocation.count);
+    traits_type::copy(std::__to_address(__allocation.ptr), data(), size() + 1);
+    __alloc_traits::deallocate(__alloc_, __get_long_pointer(), __get_long_cap());
+    __set_long_cap(__allocation.count);
+    __set_long_pointer(__allocation.ptr);
 #if _LIBCPP_HAS_EXCEPTIONS
-      } catch (...) {
-        return;
-      }
+  } catch (...) {
+    return;
+  }
 #endif // _LIBCPP_HAS_EXCEPTIONS
-    }
-    __begin_lifetime(__new_data, __target_capacity + 1);
-    __now_long = true;
-    __was_long = __is_long();
-    __p        = __get_pointer();
-  }
-  traits_type::copy(std::__to_address(__new_data), std::__to_address(__p), size() + 1);
-  if (__was_long)
-    __alloc_traits::deallocate(__alloc_, __p, __cap + 1);
-  if (__now_long) {
-    __set_long_cap(__target_capacity + 1);
-    __set_long_size(__sz);
-    __set_long_pointer(__new_data);
-  } else
-    __set_short_size(__sz);
-  __annotate_new(__sz);
 }
 
 template <class _CharT, class _Traits, class _Allocator>

Member

@ldionne ldionne left a comment


This review paid off! Looks like there are many things we can improve / fix in the existing code. I think we should split this into a few PRs:

  1. Introduce __scope_guard and simplify the __annotate stuff in the existing code.
  2. Fix the bug around the requirement that shrink_to_fit never increases capacity. This needs an additional test.
  3. Remove the unnecessary capacity checks inside reserve()
  4. Then, this patch.

@philnik777 philnik777 changed the title from "[libc++][NFC] Simplify the implementation of reserve() and shrink_to_fit()" to "[libc++] Simplify the implementation of reserve() and shrink_to_fit()" Nov 4, 2024
@philnik777 philnik777 force-pushed the simplify_string_shrink_extend branch 2 times, most recently from d28118d to eb17a44 Compare November 13, 2024 09:22
@philnik777 philnik777 force-pushed the simplify_string_shrink_extend branch from eb17a44 to 34f5969 Compare November 25, 2024 12:11
@ldionne
Member

ldionne commented Nov 26, 2024

I went over the whole code again after your rebases, and this LGTM. Thanks for the simplification.

@philnik777 philnik777 force-pushed the simplify_string_shrink_extend branch from 34f5969 to f3e4a9a Compare November 28, 2024 22:07
@philnik777 philnik777 merged commit d648eed into llvm:main Nov 28, 2024
9 of 10 checks passed
@philnik777 philnik777 deleted the simplify_string_shrink_extend branch November 28, 2024 22:07
d0k added a commit that referenced this pull request Nov 29, 2024
…to_fit() (#113453)"

This reverts commit d648eed. It breaks anything that relies on sized deallocation, e.g. asan and tcmalloc.
Member

@d0k d0k left a comment


Breaks with sized deallocation. See inline comment, reverted this in 59f57be

  __begin_lifetime(__allocation.ptr, __allocation.count);
  traits_type::copy(std::__to_address(__allocation.ptr), data(), __size + 1);
  if (__is_long())
    __alloc_traits::deallocate(__alloc_, __get_long_pointer(), __get_long_size() + 1);
Member


This should pass capacity, not size.
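
To illustrate why the distinction matters under sized deallocation, here is a minimal sketch with a checking allocator that remembers the element count of every allocation and counts calls whose `deallocate` size disagrees. `CheckingAllocator`, `ToyLongString`, `make`, and `grow` are hypothetical names for this example only; the toy `cap` field is assumed to store the full allocated count, including the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <new>

// Hypothetical allocator that records the size of each live allocation.
struct CheckingAllocator {
  std::map<char*, std::size_t> live;
  std::size_t mismatches = 0;

  char* allocate(std::size_t n) {
    char* p = static_cast<char*>(::operator new(n));
    live[p] = n;
    return p;
  }

  void deallocate(char* p, std::size_t n) {
    auto it = live.find(p);
    assert(it != live.end());
    if (it->second != n)
      ++mismatches; // a sized-deallocation-aware runtime would report this
    live.erase(it);
    ::operator delete(p);
  }
};

// Toy heap-only string: `cap` is the allocated count (assumption of this sketch).
struct ToyLongString {
  char* ptr;
  std::size_t size;
  std::size_t cap;
};

inline ToyLongString make(CheckingAllocator& a, const char* text, std::size_t cap) {
  std::size_t len = std::strlen(text);
  assert(cap >= len + 1);
  char* p = a.allocate(cap);
  std::memcpy(p, text, len + 1);
  return {p, len, cap};
}

// Moves the contents into a larger buffer. `pass_capacity` selects between
// freeing the old buffer with its capacity (correct) or with size + 1 (the bug).
inline void grow(CheckingAllocator& a, ToyLongString& s, std::size_t new_cap, bool pass_capacity) {
  assert(new_cap >= s.size + 1);
  char* fresh = a.allocate(new_cap);
  std::memcpy(fresh, s.ptr, s.size + 1);
  a.deallocate(s.ptr, pass_capacity ? s.cap : s.size + 1);
  s.ptr = fresh;
  s.cap = new_cap;
}
```

With a 5-character string in a 32-element buffer, `grow(a, s, 64, true)` leaves `mismatches` at zero, whereas a later `grow(a, s, 128, false)` frees a 64-element allocation with a size of 6 and is counted as a mismatch.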

philnik777 added a commit to philnik777/llvm-project that referenced this pull request Feb 5, 2025
…_to_fit() (llvm#113453)"

The capacity is now passed correctly.

This reverts commit 59f57be.
philnik777 added a commit to philnik777/llvm-project that referenced this pull request Feb 6, 2025
…_to_fit() (llvm#113453)"

philnik777 added a commit that referenced this pull request Feb 6, 2025
…_to_fit() (#113453)" (#125888)

The capacity is now passed correctly and a test for this path is added.

Since we changed the implementation of `reserve(size_type)` to only ever extend,
it doesn't make a ton of sense anymore to have `__shrink_or_extend`, since the code
paths of `reserve` and `shrink_to_fit` are now almost completely separate.

This patch splits up `__shrink_or_extend` so that the individual parts are in `reserve`
and `shrink_to_fit` depending on where they are needed.

This reverts commit 59f57be.
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…_to_fit() (llvm#113453)" (llvm#125888)
