Skip to content

Improve debug names index fetching global variables performance #70231

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 25, 2023

Conversation

jeffreytan81
Copy link
Contributor

While using dwarf5 .debug_names in internal large targets, we noticed a performance issue (around 10 seconds delay) while lldb-vscode tries to show scopes for a compile unit. Profiling shows the bottleneck is inside DebugNamesDWARFIndex::GetGlobalVariables which linearly search all index entries belongs to a compile unit.

This patch improves the performance by using the compile units list to filter first before checking index entries. This significantly improves the performance (drops from 10 seconds => under 1 second) in the split dwarf situation because each compile unit has its own named index.

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just rename the "has_match_cu" to "cu_matches" and this is good to go.

@@ -130,6 +130,17 @@ void DebugNamesDWARFIndex::GetGlobalVariables(
uint64_t cu_offset = cu.GetOffset();
bool found_entry_for_cu = false;
for (const DebugNames::NameIndex &ni: *m_debug_names_up) {
// Check if this name index contains an entry for the given CU.
bool has_match_cu = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename this maybe to "cu_matches".

@github-actions
Copy link

github-actions bot commented Oct 25, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@jeffreytan81 jeffreytan81 marked this pull request as ready for review October 25, 2023 18:45
@llvmbot llvmbot added the lldb label Oct 25, 2023
@jeffreytan81 jeffreytan81 merged commit d4e6e40 into llvm:main Oct 25, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 25, 2023

@llvm/pr-subscribers-lldb

Author: None (jeffreytan81)

Changes

While using dwarf5 .debug_names in internal large targets, we noticed a performance issue (around 10 seconds delay) while lldb-vscode tries to show scopes for a compile unit. Profiling shows the bottleneck is inside DebugNamesDWARFIndex::GetGlobalVariables which linearly search all index entries belongs to a compile unit.

This patch improves the performance by using the compile units list to filter first before checking index entries. This significantly improves the performance (drops from 10 seconds => under 1 second) in the split dwarf situation because each compile unit has its own named index.


Full diff: https://github.com/llvm/llvm-project/pull/70231.diff

1 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp (+13-2)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 292ea2806c59dc7..4fc3866a3b608fd 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -129,8 +129,19 @@ void DebugNamesDWARFIndex::GetGlobalVariables(
     DWARFUnit &cu, llvm::function_ref<bool(DWARFDIE die)> callback) {
   uint64_t cu_offset = cu.GetOffset();
   bool found_entry_for_cu = false;
-  for (const DebugNames::NameIndex &ni: *m_debug_names_up) {
-    for (DebugNames::NameTableEntry nte: ni) {
+  for (const DebugNames::NameIndex &ni : *m_debug_names_up) {
+    // Check if this name index contains an entry for the given CU.
+    bool cu_matches = false;
+    for (uint32_t i = 0; i < ni.getCUCount(); ++i) {
+      if (ni.getCUOffset(i) == cu_offset) {
+        cu_matches = true;
+        break;
+      }
+    }
+    if (!cu_matches)
+      continue;
+
+    for (DebugNames::NameTableEntry nte : ni) {
       uint64_t entry_offset = nte.getEntryOffset();
       llvm::Expected<DebugNames::Entry> entry_or = ni.getEntry(&entry_offset);
       for (; entry_or; entry_or = ni.getEntry(&entry_offset)) {

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch looks fine to me in idea, but next time please leave your review up for longer so others have time to take a look as well.

Comment on lines +134 to +142
bool cu_matches = false;
for (uint32_t i = 0; i < ni.getCUCount(); ++i) {
if (ni.getCUOffset(i) == cu_offset) {
cu_matches = true;
break;
}
}
if (!cu_matches)
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be great if we could use some kind of find_if instead of manually walking the CU Offsets.

Comment on lines +135 to +136
for (uint32_t i = 0; i < ni.getCUCount(); ++i) {
if (ni.getCUOffset(i) == cu_offset) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If NameIndex exposed the CUOffsets as a range (which seems pretty easy/reasonable for it to do - ah, because it requires potentially applying relocations it'd probably require a custom iterator - maybe a mapped iterator would be adequate & easy to do) then this could be written as:

if (llvm::none_of(ni.CUOffsets(), [&](uint64_t off) { return off == cu_offset; }))
  continue;

I /think/ CUOffsets() would look something roughly like this:

auto CUOffsets() const {
  assert(TU < Hdr.CompUnitCount);
  const unsigned SectionOffsetSize = dwarf::getDwarfOffsetByteSize(Hdr.Format);
  uint64_t Offset = CUsBase + SectionOffsetSize * CU;
  auto R = /* Guess we need some sort of generator to produce the values 
    [CUsBase, CUsBase+SectionOffsetSize*Hdr.CompUnitCount) in SectionOffsetSize increments... 
    some enhancement to llvm::seq that takes a stride size would be suitable */
  return llvm::map_range(llvm::seq(CUsBase, 
      [&](uint64_t Offset) {
        return Section.AccelSection.getRelocatedValue(SectionOffsetSize, &Offset);
      });
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, also, if you kept the result (more like a llvm::find_if as @bulbazord was suggesting, rather than my llvm::none_of here) of this search, you could save a small amount of time (no need to indirect through the index and reapply relocations to get the CU offset) by using getCUInedx() and comparing that to the index you would've found in this search - down on line 139/150

@jeffreytan81
Copy link
Contributor Author

Sounds good. Will keep the PR alive at least half day or one day before merging in future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants