-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[C2y] Add documentation to conform to WG14 N3262; NFC #98146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This paper is about whether a copy of a va_list object which was not produced by calling va_copy is valid to use or not. While this may work on some targets, we explicitly document it as undefined behavior for all targets so there's not confusion as to when it's valid or not. It's not a burden for a user to use va_copy explicitly.
The paper in question can be found here: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3262.pdf |
@llvm/pr-subscribers-clang Author: Aaron Ballman (AaronBallman) ChangesThis paper is about whether a copy of a va_list object which was not produced by calling va_copy is valid to use or not. While this may work on some targets, we explicitly document it as undefined behavior for all targets so there's not confusion as to when it's valid or not. It's not a burden for a user to use va_copy explicitly. Full diff: https://github.com/llvm/llvm-project/pull/98146.diff 2 Files Affected:
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index d9439d49a2e29..90ba6d8e29987 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -3977,7 +3977,9 @@ standard library ``<stdarg.h>`` header:
* ``__builtin_va_list``
-A predefined typedef for the target-specific ``va_list`` type.
+A predefined typedef for the target-specific ``va_list`` type. It is undefined
+behavior to use a copy of a value of this type unless the copy was produced by
+calling ``__builtin_va_copy``.
* ``void __builtin_va_start(__builtin_va_list list, <parameter-name>)``
diff --git a/clang/test/C/C2y/n3262.c b/clang/test/C/C2y/n3262.c
new file mode 100644
index 0000000000000..3ff2062d88dde
--- /dev/null
+++ b/clang/test/C/C2y/n3262.c
@@ -0,0 +1,20 @@
+// RUN: %clang_cc1 -verify -std=c2y -Wall -pedantic %s
+// expected-no-diagnostics
+
+/* WG14 N3262: Yes
+ * Usability of a byte-wise copy of va_list
+ *
+ * NB: Clang explicitly documents this as being undefined behavior. A
+ * diagnostic is produced for some targets but not for others for assignment or
+ * initialization, but no diagnostic is possible to produce for use with memcpy
+ * in the general case, nor with a manual bytewise copy via a for loop.
+ *
+ * Therefore, nothing is tested in this file; it serves as a reminder that we
+ * validated our documentation against the paper. See
+ * clang/docs/LanguageExtensions.rst for more details.
+ *
+ * FIXME: it would be nice to add ubsan support for recognizing when an invalid
+ * copy is made and diagnosing on copy (or on use of the copied va_list).
+ */
+
+int main() {}
|
As far as I know, clang doesn't currently support any targets where va_copy is not just a memcpy. The most common implementation is just a void pointer; the more complicated implementations involve multiple memory buffers, but those buffers are not actually part of the va_list, so va_copy is still just a trivial copy. But if we don't want worry about it, I guess this is fine. |
I figured we want the behavior to be well-defined or undefined consistently rather than making users think about it on a per-target basis, and it wasn't clear to me that every target (including future targets we don't yet support) would be able to cope with a byte-wise copy, and since |
I also have trouble imagining why a target would ever want to make I wouldn't want to lock us out of supporting a target that required a non-trivial |
It's more about correctness than optimization opportunities -- do we want to promise users that the behavior will always work in all circumstances on a bytewise copy? What about code like this:
(The situation I'm specifically worried about is when the
it's possible that this works fine, but it's also possible that such an ABI would have different pointer values for different va_list objects within the same function.)
I'm not super happy about "this is well-defined if you happen to be on a target that makes it well-defined" because there's basically no reasonable way for a user to find that information out themselves. I think we need to provide explicit documentation for which targets are doing what if we're not using a blanket rule, but I don't feel comfortable making those assertions for the various ABIs on my own. |
Every
Well, if we're going to document that this is allowed, we should document what targets it's allowed on. I'm not arguing that we shouldn't document it. |
But do separate va_list objects within the same function always have the same pointer values? I was thinking the "saved state" pointers might actually point to different memory regions to support different restoration points, but I could be way off base there.
I had the unpleasant experience of trying to figure out what targets it's allowed on. We have five links to ABI resources regarding va_list in
Given the ease of calling Alternatively, we could document that it's supported on all targets, test whatever codegen Clang emits, assume all the targets support this, and then document any unsupported targets discovered in the wild as explicitly unsupported. I'm a bit less comfortable with that, but when both the codegen code owners tell me they it's very unlikely this won't work on any targets we care about, I can certainly trust that expertise! |
The pointer values in a specific
I have no objection to requiring |
Ah! Thank you for the explanation, that makes a lot of sense.
Thank you for the clarification. So you're happy with the current state of the PR, or are there some changes you'd like to see made? |
Oh, I completely spaced on this before, but of course there are constraints on As a result, the standard is clearly requiring that The ownership of the value must be understood in C terms, not in terms of a C++-like object model. A How precisely we want to document this is an interesting question, but I do think it has to be more subtle than "primitive copies are UB", because we clearly can't make |
Good catch! That said, a lot of these definitions of
Hmmm, yeah, we don't want to make
? |
Ah right, I'd forgotten that some ABIs use that array trick to get it to pass by reference, and you're right that that makes it ill-formed to simply assign around. I like your idea of specifically making it UB to copy with |
Perhaps also update llvm-project/clang/www/c_status.html Lines 1283 to 1285 in 345861b
|
Oh what the heck, I thought I already did that! :-D But yes, I'll update that when landing, thank you! |
This paper is about whether a copy of a va_list object which was not produced by calling va_copy is valid to use or not. While this may work on some targets, we explicitly document it as undefined behavior for all targets so there's not confusion as to when it's valid or not. It's not a burden for a user to use va_copy explicitly.
This paper is about whether a copy of a va_list object which was not produced by calling va_copy is valid to use or not. While this may work on some targets, we explicitly document it as undefined behavior for all targets so there's not confusion as to when it's valid or not. It's not a burden for a user to use va_copy explicitly.