Skip to content

[DWARF] Emit a minimal line-table for totally empty functions #107267

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Sep 9, 2024

Conversation

jmorse
Copy link
Member

@jmorse jmorse commented Sep 4, 2024

(Ignore base two patches, this PR is about the tip-most commit, b72309c).

If a function has no source locations in it at all, emitInitialLocDirective won't fire at all, and no line table will be produced for the function. This seems pretty reasonable given that there's no data to produce; on the other hand given that there's a DISubprogram available, we can throw the end user a bone and cover the function with the scope-line source-location. This means if there's a crash then they can localise to that function, or put a breakpoint on that function. The new test, and the modified XCOFF test, both exercise this code path.

Does this scenario actually happen in real-life? I don't have a real example of it, but almost anything can come out of heavy LTOing. At the very least this will provide a safety-net, even if it occurs vanishingly infrequently.

Seemingly this goes back to fd07a2a in 2015 -- I anticipate that back
then the metadata layout was radically different. But nowadays at least, we
can just directly look up the subprogram.
Currently, we identify the end of the prologue as being "the instruction
that first has *this* DebugLoc". It works well enough, but I feel
identifying a position in a function is best communicated by a
MachineInstr. Plus, I've got some patches coming that depend upon this.
In degenerate but legal inputs, we can have functions that have no source
locations at all -- all the DebugLocs attached to instructions are empty.
LLVM didn't produce any source location for the function; with this patch
it will at least emit the function-scope source location. Demonstrated by
empty-line-info.ll

The XCOFF test modified has similar symptoms -- with this patch, the size
of the ".dwline" section grows a bit, thus shifting some of the file
internal offsets, which I've updated.
@llvmbot
Copy link
Member

llvmbot commented Sep 4, 2024

@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-powerpc

@llvm/pr-subscribers-debuginfo

Author: Jeremy Morse (jmorse)

Changes

(Ignore base two patches, this PR is about the tip-most commit, b72309c).

If a function has no source locations in it at all, emitInitialLocDirective won't fire at all, and no line table will be produced for the function. This seems pretty reasonable given that there's no data to produce; on the other hand given that there's a DISubprogram available, we can throw the end user a bone and cover the function with the scope-line source-location. This means if there's a crash then they can localise to that function, or put a breakpoint on that function. The new test, and the modified XCOFF test, both exercise this code path.

Does this scenario actually happen in real-life? I don't have a real example of it, but almost anything can come out of heavy LTOing. At the very least this will provide a safety-net, even if it occurs vanishingly infrequently.


Full diff: https://github.com/llvm/llvm-project/pull/107267.diff

5 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/DebugHandlerBase.h (+1-1)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+26-26)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h (+4-2)
  • (modified) llvm/test/CodeGen/PowerPC/aix-xcoff-exception-section-debug.ll (+4-4)
  • (added) llvm/test/DebugInfo/X86/empty-line-info.ll (+46)
diff --git a/llvm/include/llvm/CodeGen/DebugHandlerBase.h b/llvm/include/llvm/CodeGen/DebugHandlerBase.h
index 85046c200ff9b1..d39e7e68cb2558 100644
--- a/llvm/include/llvm/CodeGen/DebugHandlerBase.h
+++ b/llvm/include/llvm/CodeGen/DebugHandlerBase.h
@@ -74,7 +74,7 @@ class DebugHandlerBase {
 
   /// This location indicates end of function prologue and beginning of
   /// function body.
-  DebugLoc PrologEndLoc;
+  const MachineInstr *PrologEndLoc;
 
   /// This block includes epilogue instructions.
   const MachineBasicBlock *EpilogBeginBlock = nullptr;
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index f111b4aea06f1b..e9649f9ff81658 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2114,9 +2114,9 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
   // (The new location might be an explicit line 0, which we do emit.)
   if (DL.getLine() == 0 && LastAsmLine == 0)
     return;
-  if (DL == PrologEndLoc) {
+  if (MI == PrologEndLoc) {
     Flags |= DWARF2_FLAG_PROLOGUE_END | DWARF2_FLAG_IS_STMT;
-    PrologEndLoc = DebugLoc();
+    PrologEndLoc = nullptr;
   }
   // If the line changed, we call that a new statement; unless we went to
   // line 0 and came back, in which case it is not a new statement.
@@ -2132,10 +2132,11 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
     PrevInstLoc = DL;
 }
 
-static std::pair<DebugLoc, bool> findPrologueEndLoc(const MachineFunction *MF) {
+static std::pair<const MachineInstr *, bool>
+findPrologueEndLoc(const MachineFunction *MF) {
   // First known non-DBG_VALUE and non-frame setup location marks
   // the beginning of the function body.
-  DebugLoc LineZeroLoc;
+  const MachineInstr *LineZeroLoc = nullptr;
   const Function &F = MF->getFunction();
 
   // Some instructions may be inserted into prologue after this function. Must
@@ -2152,9 +2153,9 @@ static std::pair<DebugLoc, bool> findPrologueEndLoc(const MachineFunction *MF) {
           // meaningful breakpoint. If none is found, return the first
           // location after the frame setup.
           if (MI.getDebugLoc().getLine())
-            return std::make_pair(MI.getDebugLoc(), IsEmptyPrologue);
+            return std::make_pair(&MI, IsEmptyPrologue);
 
-          LineZeroLoc = MI.getDebugLoc();
+          LineZeroLoc = &MI;
         }
         IsEmptyPrologue = false;
       }
@@ -2185,30 +2186,29 @@ static void recordSourceLine(AsmPrinter &Asm, unsigned Line, unsigned Col,
                                          Discriminator, Fn);
 }
 
-DebugLoc DwarfDebug::emitInitialLocDirective(const MachineFunction &MF,
-                                             unsigned CUID) {
-  std::pair<DebugLoc, bool> PrologEnd = findPrologueEndLoc(&MF);
-  DebugLoc PrologEndLoc = PrologEnd.first;
+const MachineInstr *
+DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) {
+  std::pair<const MachineInstr *, bool> PrologEnd = findPrologueEndLoc(&MF);
+  const MachineInstr *PrologEndLoc = PrologEnd.first;
   bool IsEmptyPrologue = PrologEnd.second;
 
-  // Get beginning of function.
-  if (PrologEndLoc) {
-    // If the prolog is empty, no need to generate scope line for the proc.
-    if (IsEmptyPrologue)
+  // If the prolog is empty, no need to generate scope line for the proc.
+  if (IsEmptyPrologue)
+    // In degenerate cases, we can have functions with no source locations
+    // at all. These want a scope line, to avoid a totally empty function.
+    // Thus, only skip scope line if there's location to place prologue_end.
+    if (PrologEndLoc)
       return PrologEndLoc;
 
-    // Ensure the compile unit is created if the function is called before
-    // beginFunction().
-    (void)getOrCreateDwarfCompileUnit(
-        MF.getFunction().getSubprogram()->getUnit());
-    // We'd like to list the prologue as "not statements" but GDB behaves
-    // poorly if we do that. Revisit this with caution/GDB (7.5+) testing.
-    const DISubprogram *SP = PrologEndLoc->getInlinedAtScope()->getSubprogram();
-    ::recordSourceLine(*Asm, SP->getScopeLine(), 0, SP, DWARF2_FLAG_IS_STMT,
-                       CUID, getDwarfVersion(), getUnits());
-    return PrologEndLoc;
-  }
-  return DebugLoc();
+  // Ensure the compile unit is created if the function is called before
+  // beginFunction().
+  DISubprogram *SP = MF.getFunction().getSubprogram();
+  (void)getOrCreateDwarfCompileUnit(SP->getUnit());
+  // We'd like to list the prologue as "not statements" but GDB behaves
+  // poorly if we do that. Revisit this with caution/GDB (7.5+) testing.
+  ::recordSourceLine(*Asm, SP->getScopeLine(), 0, SP, DWARF2_FLAG_IS_STMT,
+                     CUID, getDwarfVersion(), getUnits());
+  return PrologEndLoc;
 }
 
 // Gather pre-function debug information.  Assumes being called immediately
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
index 6e379396ea079e..19f5b677bb8d06 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
@@ -724,8 +724,10 @@ class DwarfDebug : public DebugHandlerBase {
   /// Emit all Dwarf sections that should come after the content.
   void endModule() override;
 
-  /// Emits inital debug location directive.
-  DebugLoc emitInitialLocDirective(const MachineFunction &MF, unsigned CUID);
+  /// Emits inital debug location directive. Returns instruction at which
+  /// the function prologue ends.
+  const MachineInstr *emitInitialLocDirective(const MachineFunction &MF,
+                                              unsigned CUID);
 
   /// Process beginning of an instruction.
   void beginInstruction(const MachineInstr *MI) override;
diff --git a/llvm/test/CodeGen/PowerPC/aix-xcoff-exception-section-debug.ll b/llvm/test/CodeGen/PowerPC/aix-xcoff-exception-section-debug.ll
index bc6e9f35019828..9c5d560e27f91e 100644
--- a/llvm/test/CodeGen/PowerPC/aix-xcoff-exception-section-debug.ll
+++ b/llvm/test/CodeGen/PowerPC/aix-xcoff-exception-section-debug.ll
@@ -40,7 +40,7 @@ define dso_local void @test__trap_annotation_debug(i32 %a) !dbg !4 {
 ; SYMS32-NEXT:      NumberOfAuxEntries: 2
 ; SYMS32-NEXT:      Function Auxiliary Entry {
 ; SYMS32-NEXT:        Index: [[#IND+1]]
-; SYMS32-NEXT:        OffsetToExceptionTable: 0x2A8
+; SYMS32-NEXT:        OffsetToExceptionTable: 0x2B8
 ; SYMS32-NEXT:        SizeOfFunction: 0xC
 ; SYMS32-NEXT:        PointerToLineNum: 0x0
 ; SYMS32-NEXT:        SymbolIndexOfNextBeyond: [[#IND+3]] 
@@ -67,7 +67,7 @@ define dso_local void @test__trap_annotation_debug(i32 %a) !dbg !4 {
 ; SYMS32-NEXT:      NumberOfAuxEntries: 2
 ; SYMS32-NEXT:      Function Auxiliary Entry {
 ; SYMS32-NEXT:        Index: [[#IND+4]]
-; SYMS32-NEXT:        OffsetToExceptionTable: 0x2B4
+; SYMS32-NEXT:        OffsetToExceptionTable: 0x2C4
 ; SYMS32-NEXT:        SizeOfFunction: 0x34
 ; SYMS32-NEXT:        PointerToLineNum: 0x0
 ; SYMS32-NEXT:        SymbolIndexOfNextBeyond: [[#IND+6]]
@@ -93,7 +93,7 @@ define dso_local void @test__trap_annotation_debug(i32 %a) !dbg !4 {
 ; SYMS64-NEXT:      NumberOfAuxEntries: 3
 ; SYMS64-NEXT:      Exception Auxiliary Entry {
 ; SYMS64-NEXT:        Index: [[#IND+1]]
-; SYMS64-NEXT:        OffsetToExceptionTable: 0x398
+; SYMS64-NEXT:        OffsetToExceptionTable: 0x3AC
 ; SYMS64-NEXT:        SizeOfFunction: 0x18
 ; SYMS64-NEXT:        SymbolIndexOfNextBeyond: [[#IND+4]]
 ; SYMS64-NEXT:        Auxiliary Type: AUX_EXCEPT (0xFF)
@@ -126,7 +126,7 @@ define dso_local void @test__trap_annotation_debug(i32 %a) !dbg !4 {
 ; SYMS64-NEXT:      NumberOfAuxEntries: 3
 ; SYMS64-NEXT:      Exception Auxiliary Entry {
 ; SYMS64-NEXT:        Index: [[#IND+5]]
-; SYMS64-NEXT:        OffsetToExceptionTable: 0x3AC
+; SYMS64-NEXT:        OffsetToExceptionTable: 0x3C0
 ; SYMS64-NEXT:        SizeOfFunction: 0x68
 ; SYMS64-NEXT:        SymbolIndexOfNextBeyond: [[#IND+8]]
 ; SYMS64-NEXT:        Auxiliary Type: AUX_EXCEPT (0xFF)
diff --git a/llvm/test/DebugInfo/X86/empty-line-info.ll b/llvm/test/DebugInfo/X86/empty-line-info.ll
new file mode 100644
index 00000000000000..68cf08f18ffeb8
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/empty-line-info.ll
@@ -0,0 +1,46 @@
+; RUN: llc %s -o - -mtriple=x86_64-unknown-linux-gnu | FileCheck %s
+
+;; Test that, even though there are no source locations attached to the foo
+;; function, we still give it the start-of-function source location of the
+;; definition line. Otherwise, this function would have no entry in the
+;; line table at all.
+
+; CHECK-LABEL: foo:
+; CHECK-NEXT:   .Lfunc_begin0:
+; CHECK-NEXT:   .file   0 "." "foobar.c"
+; CHECK-NEXT:   .loc    0 1 0
+
+define dso_local noundef i32 @foo(ptr nocapture noundef writeonly %bar) local_unnamed_addr !dbg !10 {
+entry:
+  store i32 0, ptr %bar, align 4
+  ret i32 0
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8}
+!llvm.ident = !{!9}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "foobar.c", directory: ".")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 8, !"PIC Level", i32 2}
+!6 = !{i32 7, !"PIE Level", i32 2}
+!7 = !{i32 7, !"uwtable", i32 2}
+!8 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+!9 = !{!"clang"}
+!10 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !11, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !15)
+!11 = !DISubroutineType(types: !12)
+!12 = !{!13, !14}
+!13 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!14 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !13, size: 64)
+!15 = !{!16}
+!16 = !DILocalVariable(name: "bar", arg: 1, scope: !10, file: !1, line: 1, type: !14)
+!17 = !DILocation(line: 0, scope: !10)
+!18 = !DILocation(line: 2, column: 8, scope: !10)
+!19 = !{!20, !20, i64 0}
+!20 = !{!"int", !21, i64 0}
+!21 = !{!"omnipotent char", !22, i64 0}
+!22 = !{!"Simple C/C++ TBAA"}
+!23 = !DILocation(line: 3, column: 3, scope: !10)

Copy link

github-actions bot commented Sep 4, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 11040560ba30381ed47c3089a2562a41b00dbb4b fd7fc3a2cd79a332d1c6b26958840b855af8f28f --extensions cpp -- llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 148b620c2b..65afbdf81d 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2205,8 +2205,8 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) {
   (void)getOrCreateDwarfCompileUnit(SP->getUnit());
   // We'd like to list the prologue as "not statements" but GDB behaves
   // poorly if we do that. Revisit this with caution/GDB (7.5+) testing.
-  ::recordSourceLine(*Asm, SP->getScopeLine(), 0, SP, DWARF2_FLAG_IS_STMT,
-                     CUID, getDwarfVersion(), getUnits());
+  ::recordSourceLine(*Asm, SP->getScopeLine(), 0, SP, DWARF2_FLAG_IS_STMT, CUID,
+                     getDwarfVersion(), getUnits());
   return PrologEndLoc;
 }
 

@dwblaikie
Copy link
Collaborator

Yeah, I guess - I don't mind the general concept, fwiw.

Copy link
Collaborator

@chenzheng1030 chenzheng1030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to XCOFF case and the debug line table LGTM. Better to wait for another look : )

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too

Comment on lines 27 to 31
!4 = !{i32 1, !"wchar_size", i32 4}
!5 = !{i32 8, !"PIC Level", i32 2}
!6 = !{i32 7, !"PIE Level", i32 2}
!7 = !{i32 7, !"uwtable", i32 2}
!8 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we can remove these and their uses

Comment on lines 42 to 45
!19 = !{!20, !20, i64 0}
!20 = !{!"int", !21, i64 0}
!21 = !{!"omnipotent char", !22, i64 0}
!22 = !{!"Simple C/C++ TBAA"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can remove these too

@jmorse
Copy link
Member Author

jmorse commented Sep 5, 2024

Rebased, then discovered via another test that the scope-line is going to be terminated by the second basic block in the program. See the new additions to the test I'm adding, we'll by default put in an explicit line-zero location, specifically to avoid the scope-line running through the whole program.

I'd missed that this would happen; I still feel it'd be worthwhile to emit this scope line on the entry block, at the very least so that people can set a breakpoint on the function. (Re-requesting review in case this shifts peoples opinions).

@jmorse jmorse merged commit 7a930ce into llvm:main Sep 9, 2024
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants