-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[libc++] Reduce std::conjunction overhead #124259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
63da627
to
a8cb701
Compare
@llvm/pr-subscribers-libcxx Author: Nikolas Klauser (philnik777) ChangesThe old and new implementation of Full diff: https://github.com/llvm/llvm-project/pull/124259.diff 1 Files Affected:
diff --git a/libcxx/include/__type_traits/conjunction.h b/libcxx/include/__type_traits/conjunction.h
index ad9656acd47ecc..6b6717a50a468a 100644
--- a/libcxx/include/__type_traits/conjunction.h
+++ b/libcxx/include/__type_traits/conjunction.h
@@ -10,8 +10,6 @@
#define _LIBCPP___TYPE_TRAITS_CONJUNCTION_H
#include <__config>
-#include <__type_traits/conditional.h>
-#include <__type_traits/enable_if.h>
#include <__type_traits/integral_constant.h>
#include <__type_traits/is_same.h>
@@ -21,22 +19,29 @@
_LIBCPP_BEGIN_NAMESPACE_STD
-template <class...>
-using __expand_to_true _LIBCPP_NODEBUG = true_type;
+template <bool>
+struct _AndImpl;
-template <class... _Pred>
-__expand_to_true<__enable_if_t<_Pred::value>...> __and_helper(int);
+template <>
+struct _AndImpl<true> {
+ template <class _Res, class _First, class... _Rest>
+ using _Result _LIBCPP_NODEBUG =
+ typename _AndImpl<bool(_First::value) && sizeof...(_Rest) != 0>::template _Result<_First, _Rest...>;
+};
-template <class...>
-false_type __and_helper(...);
+template <>
+struct _AndImpl<false> {
+ template <class _Res, class...>
+ using _Result _LIBCPP_NODEBUG = _Res;
+};
// _And always performs lazy evaluation of its arguments.
//
// However, `_And<_Pred...>` itself will evaluate its result immediately (without having to
// be instantiated) since it is an alias, unlike `conjunction<_Pred...>`, which is a struct.
// If you want to defer the evaluation of `_And<_Pred...>` itself, use `_Lazy<_And, _Pred...>`.
-template <class... _Pred>
-using _And _LIBCPP_NODEBUG = decltype(std::__and_helper<_Pred...>(0));
+template <class... _Args>
+using _And _LIBCPP_NODEBUG = typename _AndImpl<sizeof...(_Args) != 0>::template _Result<true_type, _Args...>;
template <bool... _Preds>
struct __all_dummy;
@@ -46,22 +51,11 @@ struct __all : _IsSame<__all_dummy<_Pred...>, __all_dummy<((void)_Pred, true)...
#if _LIBCPP_STD_VER >= 17
-template <class...>
-struct _LIBCPP_NO_SPECIALIZATIONS conjunction : true_type {};
-
-_LIBCPP_DIAGNOSTIC_PUSH
-# if __has_warning("-Winvalid-specialization")
-_LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Winvalid-specialization")
-# endif
-template <class _Arg>
-struct conjunction<_Arg> : _Arg {};
-
-template <class _Arg, class... _Args>
-struct conjunction<_Arg, _Args...> : conditional_t<!bool(_Arg::value), _Arg, conjunction<_Args...>> {};
-_LIBCPP_DIAGNOSTIC_POP
+template <class... _Args>
+struct _LIBCPP_NO_SPECIALIZATIONS conjunction : _And<_Args...> {};
template <class... _Args>
-_LIBCPP_NO_SPECIALIZATIONS inline constexpr bool conjunction_v = conjunction<_Args...>::value;
+_LIBCPP_NO_SPECIALIZATIONS inline constexpr bool conjunction_v = _And<_Args...>::value;
#endif // _LIBCPP_STD_VER >= 17
|
Headsup: this is causing a clang seg fault. Working on a repro .. |
Probably the seg fault is due to going out of memory |
Our internal assessment of the situation is that this exposes a bug in the compiler but it will need time to get to the bottom of it. In the meantime we suggest to revert the patch till the problem is known and fixed @philnik777 what do you think ? |
@asmok-g I'd really like to have a reproducer before reverting. |
Indeed, please provide a reproducer. We're fine with reverting temporarily when we have a reproducer and a bug report that is being worked on, but without that we're essentially reverting because someone said so, which is not really workable for the project. |
Indeed. cvise is running so we'll see where it is tomorrow |
We might have overreduced the test, but ended up with the following code that makes the compiler exhaust the stack and SIGSEGV:
not yet sure which usage of |
That certainly looks overreduced. There is no recursion anchor (is it called that in english?), so we have an infinite loop. Interesting that there seems to be no depth limit on aliases. But that's most certainly not the actual problem, since we do have a recursion anchor in the actual implementation. I'd also be happy with a non-reduced case if you are able to shared it. (Assuming of course I can actually reproduce and verify that the problem is the new |
There's something we forgot to mention; Observation
Assessment of the internal clang team isThe problem is not in this PR. The PR just exposes the issue in clang. However, it's gonna take a long time to investigate. This is why we hope to revert in the meantime. Whatever repro (preprocessed file) we're going to provide won't prove this PR wrong. My reduction is very slow and is at 126k lines now, and we have another one at 50k lines. In any case, the problem will be unrelated to this PR. |
@philnik777 we have investigated further and found that this seems to boil down to Clang requiring more stack (likely deeper recursions) when instantiating In our code, we end up crashing with stack overflow when instantiating If you play around with the tuple sizes, you'll see that smaller numbers that don't fail result in much higher compile times with the new approach. The caveat is that there might've been more changes between Clang 19 and now, but we are pretty certain this commit is the root cause based on our rolling testing. Could you please take a look if compiling |
Note that I don't have any idea how we get from |
It turns out that the new implementation takes significantly more stack memory for some reason. This reverts commit 2696e4f.
@philnik777 Thanks for handling it. Note that this patch was also in |
It turns out that the new implementation takes significantly more stack memory for some reason. This reverts commit 2696e4f.
The old and new implementation of
_And
are very close in terms of performance according to my testing, but the new implementation can also be used to implementconjunction
, which make that ~50% faster.