-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Improve debug names index fetching global variables performance #70231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve debug names index fetching global variables performance #70231
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just rename the "has_match_cu" to "cu_matches" and this is good to go.
@@ -130,6 +130,17 @@ void DebugNamesDWARFIndex::GetGlobalVariables( | |||
uint64_t cu_offset = cu.GetOffset(); | |||
bool found_entry_for_cu = false; | |||
for (const DebugNames::NameIndex &ni: *m_debug_names_up) { | |||
// Check if this name index contains an entry for the given CU. | |||
bool has_match_cu = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this maybe to "cu_matches".
✅ With the latest revision this PR passed the C/C++ code formatter. |
@llvm/pr-subscribers-lldb Author: None (jeffreytan81) ChangesWhile using dwarf5 This patch improves the performance by using the compile units list to filter first before checking index entries. This significantly improves the performance (drops from 10 seconds => under 1 second) in the split dwarf situation because each compile unit has its own named index. Full diff: https://github.com/llvm/llvm-project/pull/70231.diff 1 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 292ea2806c59dc7..4fc3866a3b608fd 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -129,8 +129,19 @@ void DebugNamesDWARFIndex::GetGlobalVariables(
DWARFUnit &cu, llvm::function_ref<bool(DWARFDIE die)> callback) {
uint64_t cu_offset = cu.GetOffset();
bool found_entry_for_cu = false;
- for (const DebugNames::NameIndex &ni: *m_debug_names_up) {
- for (DebugNames::NameTableEntry nte: ni) {
+ for (const DebugNames::NameIndex &ni : *m_debug_names_up) {
+ // Check if this name index contains an entry for the given CU.
+ bool cu_matches = false;
+ for (uint32_t i = 0; i < ni.getCUCount(); ++i) {
+ if (ni.getCUOffset(i) == cu_offset) {
+ cu_matches = true;
+ break;
+ }
+ }
+ if (!cu_matches)
+ continue;
+
+ for (DebugNames::NameTableEntry nte : ni) {
uint64_t entry_offset = nte.getEntryOffset();
llvm::Expected<DebugNames::Entry> entry_or = ni.getEntry(&entry_offset);
for (; entry_or; entry_or = ni.getEntry(&entry_offset)) {
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch looks fine to me in idea, but next time please leave your review up for longer so others have time to take a look as well.
bool cu_matches = false; | ||
for (uint32_t i = 0; i < ni.getCUCount(); ++i) { | ||
if (ni.getCUOffset(i) == cu_offset) { | ||
cu_matches = true; | ||
break; | ||
} | ||
} | ||
if (!cu_matches) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be great if we could use some kind of find_if
instead of manually walking the CU Offsets.
for (uint32_t i = 0; i < ni.getCUCount(); ++i) { | ||
if (ni.getCUOffset(i) == cu_offset) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If NameIndex exposed the CUOffsets as a range (which seems pretty easy/reasonable for it to do - ah, because it requires potentially applying relocations it'd probably require a custom iterator - maybe a mapped iterator would be adequate & easy to do) then this could be written as:
if (llvm::none_of(ni.CUOffsets(), [&](uint64_t off) { return off == cu_offset; }))
continue;
I /think/ CUOffsets()
would look something roughly like this:
auto CUOffsets() const {
assert(TU < Hdr.CompUnitCount);
const unsigned SectionOffsetSize = dwarf::getDwarfOffsetByteSize(Hdr.Format);
uint64_t Offset = CUsBase + SectionOffsetSize * CU;
auto R = /* Guess we need some sort of generator to produce the values
[CUsBase, CUsBase+SectionOffsetSize*Hdr.CompUnitCount) in SectionOffsetSize increments...
some enhancement to llvm::seq that takes a stride size would be suitable */
return llvm::map_range(llvm::seq(CUsBase,
[&](uint64_t Offset) {
return Section.AccelSection.getRelocatedValue(SectionOffsetSize, &Offset);
});
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, also, if you kept the result (more like a llvm::find_if
as @bulbazord was suggesting, rather than my llvm::none_of
here) of this search, you could save a small amount of time (no need to indirect through the index and reapply relocations to get the CU offset) by using getCUInedx()
and comparing that to the index you would've found in this search - down on line 139/150
Sounds good. Will keep the PR alive at least half day or one day before merging in future. |
While using dwarf5
.debug_names
in internal large targets, we noticed a performance issue (around 10 seconds delay) whilelldb-vscode
tries to showscopes
for a compile unit. Profiling shows the bottleneck is insideDebugNamesDWARFIndex::GetGlobalVariables
which linearly search all index entries belongs to a compile unit.This patch improves the performance by using the compile units list to filter first before checking index entries. This significantly improves the performance (drops from 10 seconds => under 1 second) in the split dwarf situation because each compile unit has its own named index.