Skip to content

[C23] Fix compound literals within function prototype #132097

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 20, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,10 @@ C23 Feature Support
better diagnostic behavior for the ``va_start()`` macro in C23 and later.
This also updates the definition of ``va_start()`` in ``<stdarg.h>`` to use
the new builtin. Fixes #GH124031.
- Implemented `WG14 N2819 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2819.pdf>`_
which clarified that a compound literal used within a function prototype is
treated as if the compound literal were within the body rather than at file
scope.

Non-comprehensive list of changes in this release
-------------------------------------------------
Expand Down
20 changes: 13 additions & 7 deletions clang/lib/Sema/SemaExpr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7141,7 +7141,13 @@ Sema::BuildCompoundLiteralExpr(SourceLocation LParenLoc, TypeSourceInfo *TInfo,
return ExprError();
LiteralExpr = Result.get();

bool isFileScope = !CurContext->isFunctionOrMethod();
// We treat the compound literal as being at file scope if it's not in a
// function or method body, or within the function's prototype scope. This
// means the following compound literal is not at file scope:
// void func(char *para[(int [1]){ 0 }[0]);
const Scope *S = getCurScope();
bool IsFileScope = !CurContext->isFunctionOrMethod() &&
(!S || !S->isFunctionPrototypeScope());

// In C, compound literals are l-values for some reason.
// For GCC compatibility, in C++, file-scope array compound literals with
Expand All @@ -7162,20 +7168,20 @@ Sema::BuildCompoundLiteralExpr(SourceLocation LParenLoc, TypeSourceInfo *TInfo,
// FIXME: GCC supports compound literals of reference type, which should
// obviously have a value kind derived from the kind of reference involved.
ExprValueKind VK =
(getLangOpts().CPlusPlus && !(isFileScope && literalType->isArrayType()))
(getLangOpts().CPlusPlus && !(IsFileScope && literalType->isArrayType()))
? VK_PRValue
: VK_LValue;

if (isFileScope)
if (IsFileScope)
if (auto ILE = dyn_cast<InitListExpr>(LiteralExpr))
for (unsigned i = 0, j = ILE->getNumInits(); i != j; i++) {
Expr *Init = ILE->getInit(i);
ILE->setInit(i, ConstantExpr::Create(Context, Init));
}

auto *E = new (Context) CompoundLiteralExpr(LParenLoc, TInfo, literalType,
VK, LiteralExpr, isFileScope);
if (isFileScope) {
auto *E = new (Context) CompoundLiteralExpr(LParenLoc, TInfo, literalType, VK,
LiteralExpr, IsFileScope);
if (IsFileScope) {
if (!LiteralExpr->isTypeDependent() &&
!LiteralExpr->isValueDependent() &&
!literalType->isDependentType()) // C99 6.5.2.5p3
Expand All @@ -7191,7 +7197,7 @@ Sema::BuildCompoundLiteralExpr(SourceLocation LParenLoc, TypeSourceInfo *TInfo,
return ExprError();
}

if (!isFileScope && !getLangOpts().CPlusPlus) {
if (!IsFileScope && !getLangOpts().CPlusPlus) {
// Compound literals that have automatic storage duration are destroyed at
// the end of the scope in C; in C++, they're just temporaries.

Expand Down
33 changes: 28 additions & 5 deletions clang/test/AST/ByteCode/literals.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -849,13 +849,11 @@ namespace CompoundLiterals {
}
static_assert(get5() == 5, "");

constexpr int get6(int f = (int[]){1,2,6}[2]) { // ref-note {{subexpression not valid in a constant expression}} \
// ref-note {{declared here}}
constexpr int get6(int f = (int[]){1,2,6}[2]) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also worth testing a few more interactions with C++ constexpr:

constexpr int* f(int*a=(int[]){1,2,3}) {return a;}
constinit int* a1 = f(); // error: pointer points to local variable
static_assert(f()[0] == 1); // Should pass.

constexpr int f2(int *x, int (*y)[*(x=(int[]){1,2,3})]) {
  return x[0];
}
constexpr int g = f2(0,0); // Should evaluate to 1?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll add more tests.

constinit int* a1 = f(); // error: pointer points to local variable

Yes, this should fail (and it does).

constexpr int f2(int *x, int (*y)[*(x=(int[]){1,2,3})]) {

You are a bad person who should feel bad. :-D I do think that should evaluate to 1 (but only when calling f2(0)), but that's a pile of VLA code which we don't always handle well, especially in C++. I think a non-VLA example that's similar would be:

  constexpr int f2(int *x =(int[]){1,2,3}) {
    return x[0];
  }
  constexpr int g = f2(); // Should evaluate to 1?
  static_assert(g == 1, "");

which does behave how you'd expect.

The VLA example says object backing the pointer x will be destroyed at the end of the full-expression which may be a C++'ism not impacted by the C wording?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the VLA example, there are a few different weird interactions here... but the primary silly thing I'm doing is abusing VLAs to modify argument values. So, for example:

constexpr int f2(int x, int (*y)[(x=3)]) { return x; }
static_assert(f2(0,0)==3);

On top of that, we have to consider the question of the lifetime of compound literals defined inside the array bound.

But maybe we should just reject this sort of thing in C++ more aggressively, like gcc does, so we don't have to figure out all these weird cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the VLA example, there are a few different weird interactions here... but the primary silly thing I'm doing is abusing VLAs to modify argument values.

Yeah, which we already seem to get wrong: https://godbolt.org/z/Y4xEGWhM3

On top of that, we have to consider the question of the lifetime of compound literals defined inside the array bound.

I think that's what object backing the pointer x will be destroyed at the end of the full-expression is about in your previous VLA example. There is an ExprWithCleanups node for the assignment operator and that is a FullExpr. So I think we do get that correct -- I'll add it as a test case with comments explaining why it's rejected.

But maybe we should just reject this sort of thing in C++ more aggressively, like gcc does, so we don't have to figure out all these weird cases.

I tend to agree we should be far more restrictive with VLAs in C++ than we are today. But I also think that's a different PR than this one. WDYT?

return f;
}
static_assert(get6(6) == 6, "");
// FIXME: Who's right here?
static_assert(get6() == 6, ""); // ref-error {{not an integral constant expression}}
static_assert(get6() == 6, "");

constexpr int x = (int){3};
static_assert(x == 3, "");
Expand All @@ -875,8 +873,33 @@ namespace CompoundLiterals {
return m;
}
static_assert(get3() == 3, "");

constexpr int *f(int *a=(int[]){1,2,3}) { return a; } // both-note {{temporary created here}}
constinit int *a1 = f(); // both-error {{variable does not have a constant initializer}} \
both-note {{required by 'constinit' specifier here}} \
both-note {{pointer to subobject of temporary is not a constant expression}}
static_assert(f()[0] == 1); // Ok
#endif
};

constexpr int f2(int *x =(int[]){1,2,3}) {
return x[0];
}
constexpr int g = f2(); // Should evaluate to 1?
static_assert(g == 1, "");

// This example should be rejected because the lifetime of the compound
// literal assigned into x is that of the full expression, which is the
// parenthesized assignment operator. So the return statement is using a
// dangling pointer. FIXME: the note saying it's a read of a dereferenced
// null pointer suggests we're doing something odd during constant expression
// evaluation: I think it's still taking 'x' as being null from the call to
// f3() rather than tracking the assignment happening in the VLA.
constexpr int f3(int *x, int (*y)[*(x=(int[]){1,2,3})]) { // both-warning {{object backing the pointer x will be destroyed at the end of the full-expression}}
return x[0]; // both-note {{read of dereferenced null pointer is not allowed in a constant expression}}
}
constexpr int h = f3(0,0); // both-error {{constexpr variable 'h' must be initialized by a constant expression}} \
both-note {{in call to 'f3(nullptr, nullptr)'}}
}

namespace TypeTraits {
static_assert(__is_trivial(int), "");
Expand Down
17 changes: 9 additions & 8 deletions clang/test/C/C23/n2819.c
Original file line number Diff line number Diff line change
@@ -1,14 +1,15 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 3
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
// RUN: %clang_cc1 -triple=x86_64 -emit-llvm -o - -std=c23 %s | FileCheck %s

/* WG14 N2819: No
* Disambiguate the storage class of some compound literals
*/

int *escaped;

// CHECK-LABEL: define dso_local i32 @f(
// CHECK-SAME: ptr noundef [[PTR:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[PTR_ADDR:%.*]] = alloca ptr, align 8
// CHECK-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
Expand All @@ -19,23 +20,23 @@ int f(int *ptr) { escaped = ptr; return 1; }

// CHECK-LABEL: define dso_local i32 @g(
// CHECK-SAME: ptr noundef [[PARA:%.*]]) #[[ATTR0]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[PARA_ADDR:%.*]] = alloca ptr, align 8
// CHECK-NEXT: [[DOTCOMPOUNDLITERAL:%.*]] = alloca [27 x i32], align 4
// CHECK-NEXT: store ptr [[PARA]], ptr [[PARA_ADDR]], align 8
// CHECK-NEXT: [[CALL:%.*]] = call i32 @f(ptr noundef @.compoundliteral)
// CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 4 [[DOTCOMPOUNDLITERAL]], i8 0, i64 108, i1 false)
// CHECK-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [27 x i32], ptr [[DOTCOMPOUNDLITERAL]], i64 0, i64 0
// CHECK-NEXT: [[CALL:%.*]] = call i32 @f(ptr noundef [[ARRAYDECAY]])
// CHECK-NEXT: [[TMP0:%.*]] = zext i32 [[CALL]] to i64
// CHECK-NEXT: ret i32 0
//
// FIXME: notice the we are using the global .compoundliteral object, not
// allocating a new object on each call to g(). That's what was clarified by
// N2819.
int g(char *para [f(( int [27]) { 0 })]) {
return 0;
}

// CHECK-LABEL: define dso_local i32 @main(
// CHECK-SAME: ) #[[ATTR0]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK-NEXT: store i32 0, ptr [[RETVAL]], align 4
// CHECK-NEXT: [[CALL:%.*]] = call i32 @g(ptr noundef null)
Expand Down
2 changes: 1 addition & 1 deletion clang/www/c_status.html
Original file line number Diff line number Diff line change
Expand Up @@ -715,7 +715,7 @@ <h2 id="c2x">C23 implementation status</h2>
<tr>
<td>Disambiguate the storage class of some compound literals</td>
<td><a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2819.pdf">N2819</a></td>
<td class="none" align="center">No</td>
<td class="unreleased" align="center">Clang 21</td>
</tr>
<tr>
<td>Add annotations for unreachable control flow v2</td>
Expand Down
Loading