Skip to content

[llvm][DebugInfo][clang] Finalize all declaration subprograms in DIBuilder::finalize() #139914

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dzhidzhoev
Copy link
Member

@dzhidzhoev dzhidzhoev commented May 14, 2025

DIBuilder began tracking definition subprograms and finalizing them in DIBuilder::finalize() in eb1bb4e.
Currently, finalizeSubprogram() attaches local variables, imported entities, and labels to the retainedNodes: field of a corresponding subprogram.

After 75819ae, the definition and some declaration subprograms are finalized in DIBuilder::finalize():
AllSubprograms holds references to definition subprograms.
AllRetainTypes holds references to declaration subprograms.
For DISubprogram elements of both variables, finalizeSubprogram() was called there.

However, retainTypes() is not necessarily called for every declaration subprogram (as in 40a3fcb).

DIBuilder clients may also want to attach DILocalVariables to declaration subprograms, for example, in 58bdf8f.

Thus, the finalizeSubprogram() function is called for all definition subprograms in DIBuilder::finalize() because they are stored in the AllSubprograms by the CreateFunction(isDefinition: true) call. But for the declaration subprograms, it should be called manually.

I think this feature of the DIBuilder interface is somewhat unclear, and AllSubprograms could just be used for holding and finalizing all DISubprograms.

…ilder::finalize()

DIBuilder began tracking definition subprograms and finalizing them
in DIBuilder::finalize() in eb1bb4e.
Currently, finalizeSubprogram() attaches local variables and labels
to the `retainedNodes:` field of a corresponding subprogram.

After 75819ae, the definition and some declaration subprograms were
finalized in DIBuilder::finalize():
`AllSubprograms` held references to definition subprograms.
`RetainValues` held references to declaration subprograms.
For DISubprogram elements of both members, finalizeSubprogram()
was called there.

However, retainTypes() is not necessarily called for every declaration
subprogram (as in 40a3fcb).

DIBuilder clients may also want to attach DILocalVariables to declaration
subprograms, for example, in 58bdf8f.

Thus, the finalizeSubprogram() function is called for all definition
subprograms in DIBuilder::finalize() because they are stored
in the `AllSubprograms` during the `CreateFunction(isDefinition: true)` call.
But for the definition subprograms, it should be called manually.

I think this feature of the DIBuilder interface is somewhat unclear,
and `AllSubprograms` could just be used for holding and finalizing
all DISubprograms.
@dzhidzhoev dzhidzhoev self-assigned this May 14, 2025
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. debuginfo llvm:ir labels May 14, 2025
@llvmbot
Copy link
Member

llvmbot commented May 14, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Vladislav Dzhidzhoev (dzhidzhoev)

Changes

DIBuilder began tracking definition subprograms and finalizing them in DIBuilder::finalize() in eb1bb4e.
Currently, finalizeSubprogram() attaches local variables and labels to the retainedNodes: field of a corresponding subprogram.

After 75819ae, the definition and some declaration subprograms are finalized in DIBuilder::finalize():
AllSubprograms holds references to definition subprograms.
RetainValues holds references to declaration subprograms.
For DISubprogram elements of both variables, finalizeSubprogram() was called there.

However, retainTypes() is not necessarily called for every declaration subprogram (as in 40a3fcb).

DIBuilder clients may also want to attach DILocalVariables to declaration subprograms, for example, in 58bdf8f.

Thus, the finalizeSubprogram() function is called for all definition subprograms in DIBuilder::finalize() because they are stored in the AllSubprograms during the CreateFunction(isDefinition: true) call. But for the declaration subprograms, it should be called manually.

I think this feature of the DIBuilder interface is somewhat unclear, and AllSubprograms could just be used for holding and finalizing all DISubprograms.


Full diff: https://github.com/llvm/llvm-project/pull/139914.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+1-3)
  • (modified) llvm/lib/IR/DIBuilder.cpp (+2-4)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 2a11eebf1b682..bac113289f507 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -4611,7 +4611,7 @@ void CGDebugInfo::EmitFunctionDecl(GlobalDecl GD, SourceLocation Loc,
 
   // Preserve btf_decl_tag attributes for parameters of extern functions
   // for BPF target. The parameters created in this loop are attached as
-  // DISubprogram's retainedNodes in the subsequent finalizeSubprogram call.
+  // DISubprogram's retainedNodes in the DIBuilder::finalize() call.
   if (IsDeclForCallSite && CGM.getTarget().getTriple().isBPF()) {
     if (auto *FD = dyn_cast<FunctionDecl>(D)) {
       llvm::DITypeRefArray ParamTypes = STy->getTypeArray();
@@ -4628,8 +4628,6 @@ void CGDebugInfo::EmitFunctionDecl(GlobalDecl GD, SourceLocation Loc,
 
   if (IsDeclForCallSite)
     Fn->setSubprogram(SP);
-
-  DBuilder.finalizeSubprogram(SP);
 }
 
 void CGDebugInfo::EmitFuncDeclForCallSite(llvm::CallBase *CallOrInvoke,
diff --git a/llvm/lib/IR/DIBuilder.cpp b/llvm/lib/IR/DIBuilder.cpp
index 90da9f3acfe57..35e61caf9a241 100644
--- a/llvm/lib/IR/DIBuilder.cpp
+++ b/llvm/lib/IR/DIBuilder.cpp
@@ -940,8 +940,7 @@ DISubprogram *DIBuilder::createFunction(
       SPFlags, IsDefinition ? CUNode : nullptr, TParams, Decl, nullptr,
       ThrownTypes, Annotations, TargetFuncName);
 
-  if (IsDefinition)
-    AllSubprograms.push_back(Node);
+  AllSubprograms.push_back(Node);
   trackIfUnresolved(Node);
   return Node;
 }
@@ -978,8 +977,7 @@ DISubprogram *DIBuilder::createMethod(
       Flags, SPFlags, IsDefinition ? CUNode : nullptr, TParams, nullptr,
       nullptr, ThrownTypes);
 
-  if (IsDefinition)
-    AllSubprograms.push_back(SP);
+  AllSubprograms.push_back(SP);
   trackIfUnresolved(SP);
   return SP;
 }

@dzhidzhoev
Copy link
Member Author

Gentle ping

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose what this is encoding is a (subtle?) assumption about how definition-subprograms are modelled: they're created once and are immutable. Wheras declaration-functions can be forwarded declared and then have extra information added to them when a definition is emitted. DIBuilder enforces this by refusing to let you re-set the retainedNodes of definition subprograms.

If there's no performance cost as a result of this patch, then it seems fine to go in to me., but is there a functional reason in a later patch that makes it necessary? Other folks might believe it's valuable to preserve the assumption about how debug-info is constructed within DIBuilder (CC @adrian-prantl @dwblaikie ).

@dwblaikie
Copy link
Collaborator

yeah, can't say I have any recollection of why function declaration would be in a "retained types" list - probably incidental/not especially intentional (we didn't make declarations for functions, except member functions, until "recently" when call site debug info was added). I guess it's worth a go to add them to all the subprograms.

@dzhidzhoev
Copy link
Member Author

dzhidzhoev commented May 22, 2025

I suppose what this is encoding is a (subtle?) assumption about how definition-subprograms are modelled: they're created once and are immutable. Wheras declaration-functions can be forwarded declared and then have extra information added to them when a definition is emitted.

It is possible.

By the way, I've noticed that (declaration) SPs created with the DIBuilder::CreateMethod call in CGDebugInfo::CreateCXXMemberFunction are not finalized anywhere in clang (at least it's true for get_a/get_b in clang/test/CodeGenCXX/debug-info-local-types.cpp from #119001). I think clang expected DIBuilder to finalize everything before https://reviews.llvm.org/D33704, where it turned out that clang needed some subprograms to be finalized before the finalization of the whole CGDebugInfo instance. It seems to me that manual finalizeSubprogram() calls were added sporadically, not by following declaration-vs-definition logic.

If there's no performance cost as a result of this patch, then it seems fine to go in to me.

Performance cost in terms of compilation time?

but is there a functional reason in a later patch that makes it necessary?

I've touched it briefly here #119001 (comment). I've noticed that the mentioned PR doesn't call finalizeSubprogram() for the created declaration SP. With the call added, the test output changes: declaration DISubprograms have their DICompositeTypes in the retainedNodes field.
We can't just call finalizeSubprogram() right after CreateSubprogram(), as we return from the function before the local types are created. And we can't attach a declaration subprogram to the corresponding Clang AST declaration, as it may not exist (if I get it right).
We could start tracking these declarations in CGDebugInfo class, but I'm curious why not to do that on the level of DIBuilder for all SPs :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category debuginfo llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants