Skip to content

[Clang] Propagate elide safe context through [[clang::coro_await_elidable_argument]] #108474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 18, 2024

Conversation

yuxuanchen1997
Copy link
Member

@yuxuanchen1997 yuxuanchen1997 commented Sep 13, 2024

Based on discussion in https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044/9, we are proposing a new attribute [[clang::coro_await_elidable_argument]]. [[clang::coro_await_elidable_argument]] is a function parameter attribute. It works in conjunction with [[clang::coro_await_elidable]] to propagate a safe elide context to a parameter or parameter pack if the function is called under a safe elide context.

To give a better definition, we introduce this new concept called a "safe elide context".

When a coroutine function returns a type attributed [[clang::coro_await_elidable]], a direct call expression therein that returns a prvalue of a type attributed [[clang::coro_await_elidable]] is said to be under a safe elide context if one of the following is true:

  • it is the immediate right-hand side operand to a co_await expression.
  • it is an argument to a [[clang::coro_await_elidable_argument]] parameter or parameter pack of another direct call expression under a safe elide context.

This attribute makes it possible to have code that composes coroutines, and the coroutines are still elidable.

  template <typename T>
  class [[clang::coro_await_elidable]] Task { ... };

  template <typename... T>
  class [[clang::coro_await_elidable]] WhenAll { ... };

  // `when_all` is a utility function that compose coroutines. It does not need
  // to be a coroutine to propagate.
  template <typename... T>
  WhenAll<T...> when_all([[clang::coro_await_elidable_argument]] Task<T> tasks...);

  Task<int> foo();
  Task<int> bar();
  Task<void> example1() {
    // `when_all``, `foo``, and `bar` are all elide safe because `when_all` is
    // under a safe elide context and propagated such contexts to foo and bar.
    co_await when_all(foo(), bar());
  }

@yuxuanchen1997 yuxuanchen1997 force-pushed the users/yuxuanchen1997/parm-attr-coro-must-await branch 2 times, most recently from b2cd6f3 to 73e3d27 Compare September 13, 2024 20:37
@yuxuanchen1997 yuxuanchen1997 force-pushed the users/yuxuanchen1997/parm-attr-coro-must-await branch from 73e3d27 to dc62399 Compare September 13, 2024 20:55
@yuxuanchen1997 yuxuanchen1997 marked this pull request as ready for review September 13, 2024 20:55
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" coroutines C++20 coroutines labels Sep 13, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 13, 2024

@llvm/pr-subscribers-coroutines

@llvm/pr-subscribers-clang

Author: Yuxuan Chen (yuxuanchen1997)

Changes

Based on discussion in https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044/9, we are proposing a new attribute [[clang::coro_must_await]]. [[clang::coro_must_await]] is a function parameter attribute. It works in conjunction with [[clang::coro_await_elidable]] to propagate a safe elide context to a parameter or parameter pack if the function is called under a safe elide context.

To give a better definition, we introduce this new concept called a "safe elide context".

When a coroutine function returns a type attributed [[clang::coro_await_elidable]], a direct call expression therein that returns a prvalue of a type attributed [[clang::coro_await_elidable]] is said to be under a safe elide context if one of the following is true:

  • it is the immediate right-hand side operand to a co_await expression.
  • it is an argument to a [[clang::coro_must_await]] parameter or parameter pack of another direct call expression under a safe elide context.

This attribute makes it possible to have code that composes coroutines, and the coroutines are still elidable.

  template &lt;typename T&gt;
  class [[clang::coro_await_elidable]] Task { ... };

  template &lt;typename... T&gt;
  class [[clang::coro_await_elidable]] WhenAll { ... };

  // `when_all` is a utility function that compose coroutines. It does not need
  // to be a coroutine to propagate.
  template &lt;typename... T&gt;
  WhenAll&lt;T...&gt; when_all([[clang::coro_must_await]] Task&lt;T&gt; tasks...);

  Task&lt;int&gt; foo();
  Task&lt;int&gt; bar();
  Task&lt;void&gt; example1() {
    // `when_all``, `foo``, and `bar` are all elide safe because `when_all` is
    // under a safe elide context and propagated such contexts to foo and bar.
    co_await when_all(foo(), bar());
  }

Full diff: https://github.com/llvm/llvm-project/pull/108474.diff

5 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+4-1)
  • (modified) clang/include/clang/Basic/Attr.td (+8)
  • (modified) clang/include/clang/Basic/AttrDocs.td (+69-8)
  • (modified) clang/lib/Sema/SemaCoroutine.cpp (+28-12)
  • (modified) clang/test/CodeGenCoroutines/coro-await-elidable.cpp (+31)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 3929a9fb599259..54b60ed92d8e4c 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -249,7 +249,10 @@ Attribute Changes in Clang
   (#GH106864)
 
 - Introduced a new attribute ``[[clang::coro_await_elidable]]`` on coroutine return types
-  to express elideability at call sites where the coroutine is co_awaited as a prvalue.
+  to express elideability at call sites where the coroutine is invoked under a safe elide context.
+
+- Introduced a new attribute ``[[clang::coro_must_await]]`` on function parameters to
+  propagate safe elide context if such function is also under a safe elide context.
 
 Improvements to Clang's diagnostics
 -----------------------------------
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 70fad60d4edbb5..40aa57115a75ff 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1258,6 +1258,14 @@ def CoroAwaitElidable : InheritableAttr {
   let SimpleHandler = 1;
 }
 
+def CoroMustAwait : InheritableAttr {
+  let Spellings = [Clang<"coro_must_await">];
+  let Subjects = SubjectList<[ParmVar]>;
+  let LangOpts = [CPlusPlus];
+  let Documentation = [CoroMustAwaitDoc];
+  let SimpleHandler = 1;
+}
+
 // OSObject-based attributes.
 def OSConsumed : InheritableParamAttr {
   let Spellings = [Clang<"os_consumed">];
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 546e5100b79dd9..e8bdda8a20be70 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -8261,12 +8261,19 @@ def CoroAwaitElidableDoc : Documentation {
 The ``[[clang::coro_await_elidable]]`` is a class attribute which can be applied
 to a coroutine return type.
 
-When a coroutine function that returns such a type calls another coroutine function,
-the compiler performs heap allocation elision when the call to the coroutine function
-is immediately co_awaited as a prvalue. In this case, the coroutine frame for the
-callee will be a local variable within the enclosing braces in the caller's stack
-frame. And the local variable, like other variables in coroutines, may be collected
-into the coroutine frame, which may be allocated on the heap.
+When a coroutine function returns such a type, a direct call expression therein
+that returns a prvalue of a type attributed ``[[clang::coro_await_elidable]]``
+is said to be under a safe elide context if one of the following is true:
+- it is the immediate right-hand side operand to a co_await expression.
+- it is an argument to a [[clang::coro_must_await]] parameter or parameter pack
+of another direct call expression under a safe elide context.
+
+Do note that the safe elide context applies only to the call expression itself,
+and the context does not transitively include any of its subexpressions unless
+exceptional rules of ``[[clang::coro_must_await]]`` apply.
+
+The compiler performs heap allocation elision on call expressions under a safe
+elide context, if the callee is a coroutine.
 
 Example:
 
@@ -8281,8 +8288,62 @@ Example:
     co_await t;
   }
 
-The behavior is undefined if the caller coroutine is destroyed earlier than the
-callee coroutine.
+Such elision replaces the heap allocated activation frame of the callee coroutine
+with a local variable within the enclosing braces in the caller's stack frame.
+The local variable, like other variables in coroutines, may be collected into the
+coroutine frame, which may be allocated on the heap. The behavior is undefined
+if the caller coroutine is destroyed earlier than the callee coroutine.
+
+}];
+}
+
+def CoroMustAwaitDoc : Documentation {
+  let Category = DocCatDecl;
+  let Content = [{
+
+The ``[[clang::coro_must_await]]`` is a function parameter attribute. It works
+in conjunction with ``[[clang::coro_await_elidable]]`` to propagate a safe elide
+context to a parameter or parameter pack if the function is called under a safe
+elide context.
+
+This is sometimes necessary on utility functions whose role is to compose or
+modify the behavior of a callee coroutine.
+
+Example:
+
+.. code-block:: c++
+
+  template <typename T>
+  class [[clang::coro_await_elidable]] Task { ... };
+
+  template <typename... T>
+  class [[clang::coro_await_elidable]] WhenAll { ... };
+
+  // `when_all` is a utility function that compose coroutines. It does not need
+  // to be a coroutine to propagate.
+  template <typename... T>
+  WhenAll<T...> when_all([[clang::coro_must_await]] Task<T> tasks...);
+
+  Task<int> foo();
+  Task<int> bar();
+  Task<void> example1() {
+    // `when_all``, `foo``, and `bar` are all elide safe because `when_all` is
+    // under a safe elide context and propagated such contexts to foo and bar.
+    co_await when_all(foo(), bar());
+  }
+
+  Task<void> example2() {
+    // `when_all` and `bar` are elide safe. `foo` is not elide safe.
+    auto f = foo();
+    co_await when_all(f, bar());
+  }
+
+
+  Task<void> example3() {
+    // None of the calls are elide safe.
+    auto t = when_all(foo(), bar());
+    co_await t;
+  }
 
 }];
 }
diff --git a/clang/lib/Sema/SemaCoroutine.cpp b/clang/lib/Sema/SemaCoroutine.cpp
index a574d56646f3a2..d0eaf469e327a2 100644
--- a/clang/lib/Sema/SemaCoroutine.cpp
+++ b/clang/lib/Sema/SemaCoroutine.cpp
@@ -849,12 +849,30 @@ static bool isAttributedCoroAwaitElidable(const QualType &QT) {
   return Record && Record->hasAttr<CoroAwaitElidableAttr>();
 }
 
-static bool isCoroAwaitElidableCall(Expr *Operand) {
-  if (!Operand->isPRValue()) {
-    return false;
+static void applySafeElideContext(Expr *Operand) {
+  auto *Call = dyn_cast<CallExpr>(Operand->IgnoreImplicit());
+  if (!Call || !Call->isPRValue())
+    return;
+
+  if (!isAttributedCoroAwaitElidable(Call->getType()))
+    return;
+
+  Call->setCoroElideSafe();
+
+  // Check parameter
+  auto *Fn = llvm::dyn_cast_if_present<FunctionDecl>(Call->getCalleeDecl());
+  if (!Fn) {
+    Call->dump();
+    return;
   }
 
-  return isAttributedCoroAwaitElidable(Operand->getType());
+  size_t ParmIdx = 0;
+  for (ParmVarDecl *PD : Fn->parameters()) {
+    if (PD->hasAttr<CoroMustAwaitAttr>())
+      applySafeElideContext(Call->getArg(ParmIdx));
+
+    ParmIdx++;
+  }
 }
 
 // Attempts to resolve and build a CoawaitExpr from "raw" inputs, bailing out to
@@ -880,14 +898,12 @@ ExprResult Sema::BuildUnresolvedCoawaitExpr(SourceLocation Loc, Expr *Operand,
   }
 
   auto *RD = Promise->getType()->getAsCXXRecordDecl();
-  bool AwaitElidable =
-      isCoroAwaitElidableCall(Operand) &&
-      isAttributedCoroAwaitElidable(
-          getCurFunctionDecl(/*AllowLambda=*/true)->getReturnType());
-
-  if (AwaitElidable)
-    if (auto *Call = dyn_cast<CallExpr>(Operand->IgnoreImplicit()))
-      Call->setCoroElideSafe();
+
+  bool CurFnAwaitElidable = isAttributedCoroAwaitElidable(
+      getCurFunctionDecl(/*AllowLambda=*/true)->getReturnType());
+
+  if (CurFnAwaitElidable)
+    applySafeElideContext(Operand);
 
   Expr *Transformed = Operand;
   if (lookupMember(*this, "await_transform", RD, Loc)) {
diff --git a/clang/test/CodeGenCoroutines/coro-await-elidable.cpp b/clang/test/CodeGenCoroutines/coro-await-elidable.cpp
index 8512995dfad45a..fe7631ba2d9189 100644
--- a/clang/test/CodeGenCoroutines/coro-await-elidable.cpp
+++ b/clang/test/CodeGenCoroutines/coro-await-elidable.cpp
@@ -84,4 +84,35 @@ Task<int> nonelidable() {
   co_return 1;
 }
 
+// CHECK-LABEL: define{{.*}} @_Z8addTasksO4TaskIiES1_{{.*}} {
+Task<int> addTasks([[clang::coro_must_await]] Task<int> &&t1, Task<int> &&t2) {
+  int i1 = co_await t1;
+  int i2 = co_await t2;
+  co_return i1 + i2;
+}
+
+// CHECK-LABEL: define{{.*}} @_Z10returnSamei{{.*}} {
+Task<int> returnSame(int i) {
+  co_return i;
+}
+
+// CHECK-LABEL: define{{.*}} @_Z21elidableWithMustAwaitv{{.*}} {
+Task<int> elidableWithMustAwait() {
+  // CHECK: call void @_Z10returnSamei(ptr {{.*}}, i32 noundef 2) #[[ELIDE_SAFE]]
+  // CHECK-NOT: call void @_Z10returnSamei(ptr {{.*}}, i32 noundef 3) #[[ELIDE_SAFE]]
+  co_return co_await addTasks(returnSame(2), returnSame(3));
+}
+
+template <typename... Args>
+Task<int> sumAll([[clang::coro_must_await]] Args && ... tasks);
+
+// CHECK-LABEL: define{{.*}} @_Z16elidableWithPackv{{.*}} {
+Task<int> elidableWithPack() {
+  // CHECK-NOT: call void @_Z10returnSamei(ptr {{.*}}, i32 noundef 1) #[[ELIDE_SAFE]]
+  // CHECK: call void @_Z10returnSamei(ptr {{.*}}, i32 noundef 2) #[[ELIDE_SAFE]]
+  // CHECK: call void @_Z10returnSamei(ptr {{.*}}, i32 noundef 3) #[[ELIDE_SAFE]]
+  auto t = returnSame(1);
+  co_return co_await sumAll(t, returnSame(2), returnSame(3));
+}
+
 // CHECK: attributes #[[ELIDE_SAFE]] = { coro_elide_safe }

@yuxuanchen1997 yuxuanchen1997 force-pushed the users/yuxuanchen1997/parm-attr-coro-must-await branch from dc62399 to 3b5c9cc Compare September 13, 2024 21:11
Copy link
Member

@vogelsgesang vogelsgesang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing this! Looking forward to using it 🙂

Call->setCoroElideSafe();

bool CurFnAwaitElidable = isAttributedCoroAwaitElidable(
getCurFunctionDecl(/*AllowLambda=*/true)->getReturnType());
Copy link
Member

@vogelsgesang vogelsgesang Sep 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we actually check if the return type of àwait_transform is marked as elideable, instead of the immediate argument to co_await? Or maybe we should check that both the immediate argument and the transformed awaitable are annotated?

(Not really related to this commit. I just noticed this as I was reviewing this code change here. Fell free to ship this PR without addressing this comment. In case this is actually something we want to change, we should probably do so in a separate PR, anyway...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the piece of code that performs this check. See applySafeElideContext.

But what you said is potentially a good point, we actually skipped checking two things: operator co_await on the callee return object, and the await_transform. As these operations may not be safe. The original intention was to make this a user's liability to ensure early caller destruction cannot happen in both customization points. It's not as granular as I'd like.

With this patch, we are actually making it possible. await_transform and operator co_await here are just function calls whose arguments can be attributed with [[clang::coro_must_await]].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see. So after this patch, we could actually check await_transform and also operator co_await and thereby get rid of the CurFnAwaitElidable check? Such that a ElideableTask coawaited within a NonElidableTask (as discussed in the other thread) would become elideable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So after this patch, we could actually check await_transform and also operator co_await and thereby get rid of the CurFnAwaitElidable check?

No, not yet. I am just saying that we can potentially check the entire expression built here including the CallerType::await_transform and CalleeType::operator co_await part. Right now we are just checking the immediate operand before any transformation customization points.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +8264 to +8267
When a coroutine function returns such a type, a direct call expression therein
that returns a prvalue of a type attributed ``[[clang::coro_await_elidable]]``
is said to be under a safe elide context if one of the following is true:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a coroutine function returns such a type, a direct call expression therein that returns a prvalue of a type attributed [[clang::coro_await_elidable]] [...]

If I am reading the code correctly, it is not actually a requirement that the containing / callee coroutine itself is also annotated as coro_await_elidable. Only the called coroutine needs to be annotated as coro_await_elidable.

I.e., we would also apply HALO for something like

  class [[clang::coro_await_elidable]] ElidableTask { ... };
  class NonElidableTask { ... };

  ElidableTask foo();
  NonElidableTask bar() {
    co_await foo(); // foo()'s coroutine frame on this line is elidable
  }

Or did I misread the code?

Assuming I understood code correctly, I would propose something like

A call to a coroutine function returning such a type is said to be safe-to-elide
if one of the following is true:
- it is the immediate right-hand side operand to a co_await expression.
- it is an argument to a [[clang::coro_must_await]] parameter or parameter pack
   of another safe-to-elide function call.

Do note that the safe-to-elide context applies only to the call expression itself,
and the context does not transitively include any of its subexpressions unless
``[[clang::coro_must_await]]`` is used to opt-in to transivitely applying safe-to-elide.

The compiler performs heap allocation elision on call expressions on safe-to-elide
calls, if the callee is a coroutine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am reading the code correctly, it is not actually a requirement that the containing / callee coroutine itself is also annotated as coro_await_elidable.

It is required. See CurFnAwaitElidable. Your example call to foo() of NonElidableTask won't have any elisions because we don't control NonElidableTask::Promise::await_transform.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I misread the code.

In general: Is checking if the callee coroutine is annotated actually necessary? To my understanding from the other thread, so far it was necessary because of await_transform and operator co_await. After we make this checking more fine-granular, are there any other reasons why we would still need to check if the callee coroutine is annotated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is checking if the callee coroutine is annotated actually necessary?

Depends on the new spec which I haven't thought about yet but most likely I'll say it's still necessary for simplicity. We can make await_transform and operator co_await propagate the safe elide context. Just having the awaiter type attributed, and asserting that a prvalue of the awaiter returned from operator co_await must satisfy the requirements may do the work. I haven't thought it through.

@yuxuanchen1997 yuxuanchen1997 force-pushed the users/yuxuanchen1997/parm-attr-coro-must-await branch from 3b5c9cc to a1783d3 Compare September 16, 2024 18:22
@yuxuanchen1997 yuxuanchen1997 changed the title [Clang] Propagate elide safe context through [[clang::coro_must_await]] [Clang] Propagate elide safe context through [[clang::coro_await_elidable_argument]] Sep 16, 2024
@yuxuanchen1997 yuxuanchen1997 self-assigned this Sep 16, 2024
@yuxuanchen1997 yuxuanchen1997 force-pushed the users/yuxuanchen1997/parm-attr-coro-must-await branch from a1783d3 to c32b36e Compare September 17, 2024 16:49
Copy link
Member

@vogelsgesang vogelsgesang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but at the same time: I am not experienced in the clang-frontend, so please wait for an approval of a more experienced clang-frontend contributor

@yuxuanchen1997 yuxuanchen1997 merged commit e1b40dc into main Sep 18, 2024
9 checks passed
@yuxuanchen1997 yuxuanchen1997 deleted the users/yuxuanchen1997/parm-attr-coro-must-await branch September 18, 2024 05:58
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 18, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-qemu running on sanitizer-buildbot3 while building clang at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/139/builds/3883

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
[72/77] Generating ScudoUnitTestsObjects.gtest-all.cc.mips64.o
[73/77] Generating ScudoCxxUnitTest-mips64-Test
[74/77] Generating ScudoCUnitTest-mips64-Test
[75/77] Generating ScudoUnitTestsObjects.combined_test.cpp.mips64.o
[76/77] Generating ScudoUnitTest-mips64-Test
[76/77] Running Scudo Standalone tests
llvm-lit: /home/b/sanitizer-x86_64-linux-qemu/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 157 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
TIMEOUT: ScudoStandalone-Unit :: ./ScudoUnitTest-mips64-Test/67/139 (157 of 157)
******************** TEST 'ScudoStandalone-Unit :: ./ScudoUnitTest-mips64-Test/67/139' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-x86_64-linux-qemu/build/llvm_build2_debug_mips64_qemu/lib/scudo/standalone/tests/./ScudoUnitTest-mips64-Test-ScudoStandalone-Unit-893406-67-139.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=139 GTEST_SHARD_INDEX=67 /home/b/sanitizer-x86_64-linux-qemu/build/qemu_build/qemu-mips64 -L /usr/mips64-linux-gnuabi64 /home/b/sanitizer-x86_64-linux-qemu/build/llvm_build2_debug_mips64_qemu/lib/scudo/standalone/tests/./ScudoUnitTest-mips64-Test
--

Note: This is test shard 68 of 139.
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from ScudoCombinedDeathTestBasicCombined13_TestConditionVariableConfig
[ RUN      ] ScudoCombinedDeathTestBasicCombined13_TestConditionVariableConfig.BasicCombined13
Stats: SizeClassAllocator64: 2M mapped (0M rss) in 33 allocations; remains 33; ReleaseToOsIntervalMs = 1000
  00 (    64): mapped:    256K popped:      26 pushed:       0 inuse:     26 total:    104 releases:      0 last released:      0K latest pushed bytes:      4K region: 0x55557683f000 (0x555576838000)
  24 (  8720): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:     59K region: 0x5555e6840000 (0x5555e6838000)
  25 ( 11664): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:     79K region: 0x55574683a000 (0x555746838000)
  26 ( 14224): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:     97K region: 0x5555b6840000 (0x5555b6838000)
  27 ( 16400): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:    112K region: 0x5556c6846000 (0x5556c6838000)
  28 ( 18448): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:    126K region: 0x5556a6848000 (0x5556a6838000)
  30 ( 29456): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:    201K region: 0x555616843000 (0x555616838000)
  32 ( 65552): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      3 releases:      0 last released:      0K latest pushed bytes:    128K region: 0x5556d6842000 (0x5556d6838000)
Stats: MapAllocator: allocated 65 times (6504K), freed 65 times (6504K), remains 0 (0K) max 0M, Fragmented 0K
Secondary Cache Disabled
Stats: Quarantine: batches: 0; bytes: 0 (user: 0); chunks: 0 (capacity: 0); 0% chunks used; 0% memory overhead
Quarantine limits: global: 0K; thread local: 0K
Stats: SharedTSDs: 4 available; total 8
  Shared TSD[0]:
    00 (    64): cached:   12 max:   26
    24 (  8720): cached:    1 max:    2
    25 ( 11664): cached:    1 max:    2
    26 ( 14224): cached:    1 max:    2
    27 ( 16400): cached:    1 max:    2
    28 ( 18448): cached:    1 max:    2
    30 ( 29456): cached:    1 max:    2
    32 ( 65552): cached:    1 max:    2
  Shared TSD[1]:
    No block is cached.
  Shared TSD[2]:
    No block is cached.
  Shared TSD[3]:
Step 24 (scudo debug_mips64_qemu) failure: scudo debug_mips64_qemu (failure)
...
[72/77] Generating ScudoUnitTestsObjects.gtest-all.cc.mips64.o
[73/77] Generating ScudoCxxUnitTest-mips64-Test
[74/77] Generating ScudoCUnitTest-mips64-Test
[75/77] Generating ScudoUnitTestsObjects.combined_test.cpp.mips64.o
[76/77] Generating ScudoUnitTest-mips64-Test
[76/77] Running Scudo Standalone tests
llvm-lit: /home/b/sanitizer-x86_64-linux-qemu/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 157 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
TIMEOUT: ScudoStandalone-Unit :: ./ScudoUnitTest-mips64-Test/67/139 (157 of 157)
******************** TEST 'ScudoStandalone-Unit :: ./ScudoUnitTest-mips64-Test/67/139' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-x86_64-linux-qemu/build/llvm_build2_debug_mips64_qemu/lib/scudo/standalone/tests/./ScudoUnitTest-mips64-Test-ScudoStandalone-Unit-893406-67-139.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=139 GTEST_SHARD_INDEX=67 /home/b/sanitizer-x86_64-linux-qemu/build/qemu_build/qemu-mips64 -L /usr/mips64-linux-gnuabi64 /home/b/sanitizer-x86_64-linux-qemu/build/llvm_build2_debug_mips64_qemu/lib/scudo/standalone/tests/./ScudoUnitTest-mips64-Test
--

Note: This is test shard 68 of 139.
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from ScudoCombinedDeathTestBasicCombined13_TestConditionVariableConfig
[ RUN      ] ScudoCombinedDeathTestBasicCombined13_TestConditionVariableConfig.BasicCombined13
Stats: SizeClassAllocator64: 2M mapped (0M rss) in 33 allocations; remains 33; ReleaseToOsIntervalMs = 1000
  00 (    64): mapped:    256K popped:      26 pushed:       0 inuse:     26 total:    104 releases:      0 last released:      0K latest pushed bytes:      4K region: 0x55557683f000 (0x555576838000)
  24 (  8720): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:     59K region: 0x5555e6840000 (0x5555e6838000)
  25 ( 11664): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:     79K region: 0x55574683a000 (0x555746838000)
  26 ( 14224): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:     97K region: 0x5555b6840000 (0x5555b6838000)
  27 ( 16400): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:    112K region: 0x5556c6846000 (0x5556c6838000)
  28 ( 18448): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:    126K region: 0x5556a6848000 (0x5556a6838000)
  30 ( 29456): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      8 releases:      0 last released:      0K latest pushed bytes:    201K region: 0x555616843000 (0x555616838000)
  32 ( 65552): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      3 releases:      0 last released:      0K latest pushed bytes:    128K region: 0x5556d6842000 (0x5556d6838000)
Stats: MapAllocator: allocated 65 times (6504K), freed 65 times (6504K), remains 0 (0K) max 0M, Fragmented 0K
Secondary Cache Disabled
Stats: Quarantine: batches: 0; bytes: 0 (user: 0); chunks: 0 (capacity: 0); 0% chunks used; 0% memory overhead
Quarantine limits: global: 0K; thread local: 0K
Stats: SharedTSDs: 4 available; total 8
  Shared TSD[0]:
    00 (    64): cached:   12 max:   26
    24 (  8720): cached:    1 max:    2
    25 ( 11664): cached:    1 max:    2
    26 ( 14224): cached:    1 max:    2
    27 ( 16400): cached:    1 max:    2
    28 ( 18448): cached:    1 max:    2
    30 ( 29456): cached:    1 max:    2
    32 ( 65552): cached:    1 max:    2
  Shared TSD[1]:
    No block is cached.
  Shared TSD[2]:
    No block is cached.
  Shared TSD[3]:

tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
facebook-github-bot pushed a commit to facebook/folly that referenced this pull request Oct 14, 2024
Summary:
# Context
Recent upstream LLVM development gives us this opportunity to elide heap allocations for coroutine function calls. Our folly::coro's heap allocation costs us about $4M every year fleet wide. This patch potentially gives us a decent chunk of capacity savings.

This diff adds two attributes to folly:
## [[clang::coro_await_elidable]]
This attribute is introduced by llvm/llvm-project#99282.
This attribute is designed to make the following case always apply HALO
```
class [[clang::coro_await_elidable]] Task { ... };
Task<void> foo();
Task<void> bar() {
  co_await foo();
}
```
This requires 1) bar is a coroutine returns a type attributed as such; and 2) `foo()` to be immediately co_awaited.

Note: if the example above had `bar()` call itself recursively, the elide won't happen because we cannot determine the necessary frame size for potentially unbound recursion. Same goes for mutually recursive functions.

This won't work on the fan out case for the same reason. This optimization is off limits if we can't statically determine the necessary frame size for the outer most coroutine.

## [[clang::coro_await_elidable_argument]]

This attribute is introduced by llvm/llvm-project#108474.
The attribute propagates a "safe elide context" as defined in the summary of the PR.

The motivation of this attribute is to support eliding foo in the following case:
```
class [[clang::coro_await_elidable]] Task { ... };
Task<void> foo();
Task<void> bar() {
  co_await co_nothrow(foo());
}
```
`co_nothrow` itself is not a coroutine. `foo()` in this case is not immediately `co_await`ed. Hence not elidable. However, if we add this "coro_await_elidable_argument" attribute to co_nothrow's parameter (or parameter pack), we allow the compiler to treat its arguments as safe to elide as well as long as both co_nothrow() is awaited as a prvalue, and `foo()` is immediately passed to `co_nothrow`'s param as a prvalue.

This can also be useful for composition of senders in the future.

## Why is this safe to elide?
Generally, we depend on the fact that `folly::coro::Task` does not provide the user a way to destroy the caller while the callee is in a running or leave it in a resumeable state. And `co_nothrow`'s awaitable object doesn't try to destroy caller either.

## How many more cases are elided after this change?
IG is the top coroutine user according to this [strobelight](https://fburl.com/strobelight/dgx7cyzv). With this change, we found 3009 out of 4292 coroutine calls are elidable. Among the 1283 non-elidable cases, only 2 had a profile hotness greater than zero.

Reviewed By: yfeldblum

Differential Revision: D58956574

fbshipit-source-id: da8be360a03a34dec7111004a1af44e20d163839
facebook-github-bot pushed a commit to facebook/hhvm that referenced this pull request Oct 14, 2024
Summary:
# Context
Recent upstream LLVM development gives us this opportunity to elide heap allocations for coroutine function calls. Our folly::coro's heap allocation costs us about $4M every year fleet wide. This patch potentially gives us a decent chunk of capacity savings.

This diff adds two attributes to folly:
## [[clang::coro_await_elidable]]
This attribute is introduced by llvm/llvm-project#99282.
This attribute is designed to make the following case always apply HALO
```
class [[clang::coro_await_elidable]] Task { ... };
Task<void> foo();
Task<void> bar() {
  co_await foo();
}
```
This requires 1) bar is a coroutine returns a type attributed as such; and 2) `foo()` to be immediately co_awaited.

Note: if the example above had `bar()` call itself recursively, the elide won't happen because we cannot determine the necessary frame size for potentially unbound recursion. Same goes for mutually recursive functions.

This won't work on the fan out case for the same reason. This optimization is off limits if we can't statically determine the necessary frame size for the outer most coroutine.

## [[clang::coro_await_elidable_argument]]

This attribute is introduced by llvm/llvm-project#108474.
The attribute propagates a "safe elide context" as defined in the summary of the PR.

The motivation of this attribute is to support eliding foo in the following case:
```
class [[clang::coro_await_elidable]] Task { ... };
Task<void> foo();
Task<void> bar() {
  co_await co_nothrow(foo());
}
```
`co_nothrow` itself is not a coroutine. `foo()` in this case is not immediately `co_await`ed. Hence not elidable. However, if we add this "coro_await_elidable_argument" attribute to co_nothrow's parameter (or parameter pack), we allow the compiler to treat its arguments as safe to elide as well as long as both co_nothrow() is awaited as a prvalue, and `foo()` is immediately passed to `co_nothrow`'s param as a prvalue.

This can also be useful for composition of senders in the future.

## Why is this safe to elide?
Generally, we depend on the fact that `folly::coro::Task` does not provide the user a way to destroy the caller while the callee is in a running or leave it in a resumeable state. And `co_nothrow`'s awaitable object doesn't try to destroy caller either.

## How many more cases are elided after this change?
IG is the top coroutine user according to this [strobelight](https://fburl.com/strobelight/dgx7cyzv). With this change, we found 3009 out of 4292 coroutine calls are elidable. Among the 1283 non-elidable cases, only 2 had a profile hotness greater than zero.

Reviewed By: yfeldblum

Differential Revision: D58956574

fbshipit-source-id: da8be360a03a34dec7111004a1af44e20d163839
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category coroutines C++20 coroutines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants