-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[lldb] Fix inline function resolution for discontinuous functions #116777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The problem here is the assumption that the entire function will be placed in a single section. This will ~never be the case for a discontinuous function, as the point of splitting the function is to let the linker group parts of the function according to their "hotness". The fix is to change the offset computation to use file addresses instead.
@llvm/pr-subscribers-lldb Author: Pavel Labath (labath) ChangesThe problem here is the assumption that the entire function will be placed in a single section. This will ~never be the case for a discontinuous function, as the point of splitting the function is to let the linker group parts of the function according to their "hotness". The fix is to change the offset computation to use file addresses instead. Full diff: https://github.com/llvm/llvm-project/pull/116777.diff 3 Files Affected:
diff --git a/lldb/source/Symbol/Block.cpp b/lldb/source/Symbol/Block.cpp
index 5c7772a6db780d..8746a25e3fad5a 100644
--- a/lldb/source/Symbol/Block.cpp
+++ b/lldb/source/Symbol/Block.cpp
@@ -252,19 +252,20 @@ bool Block::GetRangeContainingAddress(const Address &addr,
Function *function = CalculateSymbolContextFunction();
if (function) {
const AddressRange &func_range = function->GetAddressRange();
- if (addr.GetSection() == func_range.GetBaseAddress().GetSection()) {
- const addr_t addr_offset = addr.GetOffset();
- const addr_t func_offset = func_range.GetBaseAddress().GetOffset();
- if (addr_offset >= func_offset &&
- addr_offset < func_offset + func_range.GetByteSize()) {
- addr_t offset = addr_offset - func_offset;
+ if (addr.GetModule() == func_range.GetBaseAddress().GetModule()) {
+ const addr_t file_addr = addr.GetFileAddress();
+ const addr_t func_file_addr =
+ func_range.GetBaseAddress().GetFileAddress();
+ if (file_addr >= func_file_addr &&
+ file_addr < func_file_addr + func_range.GetByteSize()) {
+ addr_t offset = file_addr - func_file_addr;
const Range *range_ptr = m_ranges.FindEntryThatContains(offset);
if (range_ptr) {
- range.GetBaseAddress() = func_range.GetBaseAddress();
- range.GetBaseAddress().SetOffset(func_offset +
- range_ptr->GetRangeBase());
+ range.GetBaseAddress() =
+ Address(func_file_addr + range_ptr->GetRangeBase(),
+ addr.GetModule()->GetSectionList());
range.SetByteSize(range_ptr->GetByteSize());
return true;
}
diff --git a/lldb/source/Symbol/SymbolContext.cpp b/lldb/source/Symbol/SymbolContext.cpp
index de083e81206e2a..1a291ca3c0ea7a 100644
--- a/lldb/source/Symbol/SymbolContext.cpp
+++ b/lldb/source/Symbol/SymbolContext.cpp
@@ -102,10 +102,11 @@ bool SymbolContext::DumpStopContext(
s->PutCStringColorHighlighted(name.GetStringRef(), settings);
}
- if (addr.IsValid()) {
+ if (addr_t file_addr = addr.GetFileAddress();
+ file_addr != LLDB_INVALID_ADDRESS) {
const addr_t function_offset =
- addr.GetOffset() -
- function->GetAddressRange().GetBaseAddress().GetOffset();
+ file_addr -
+ function->GetAddressRange().GetBaseAddress().GetFileAddress();
if (!show_function_name) {
// Print +offset even if offset is 0
dumped_something = true;
@@ -126,7 +127,8 @@ bool SymbolContext::DumpStopContext(
lldb_private::AddressRange block_range;
if (inlined_block->GetRangeContainingAddress(addr, block_range)) {
const addr_t inlined_function_offset =
- addr.GetOffset() - block_range.GetBaseAddress().GetOffset();
+ addr.GetFileAddress() -
+ block_range.GetBaseAddress().GetFileAddress();
if (inlined_function_offset) {
s->Printf(" + %" PRIu64, inlined_function_offset);
}
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s b/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s
new file mode 100644
index 00000000000000..399f4e4db5b2f1
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s
@@ -0,0 +1,167 @@
+## Test that inline function resolution works when the function has been split
+## into multiple discontinuous parts (and those parts are placed in different
+## sections)
+
+# RUN: llvm-mc -triple x86_64-pc-linux -filetype=obj %s -o %t
+# RUN: %lldb %t -o "image lookup -v -n look_me_up" -o exit | FileCheck %s
+
+# CHECK: 1 match found in {{.*}}
+# CHECK: Summary: {{.*}}`foo + 6 [inlined] foo_inl + 1
+# CHECK-NEXT: {{.*}}`foo + 5
+# CHECK: Blocks: id = {{.*}}, ranges = [0x00000000-0x00000003)[0x00000004-0x00000008)
+# CHECK-NEXT: id = {{.*}}, ranges = [0x00000001-0x00000002)[0x00000005-0x00000007), name = "foo_inl"
+
+ .text
+
+ .type foo,@function
+foo:
+ nop
+.Lfoo_inl:
+ nop
+.Lfoo_inl_end:
+ nop
+.Lfoo_end:
+ .size foo, .Lfoo_end-foo
+
+bar:
+ nop
+.Lbar_end:
+ .size bar, .Lbar_end-bar
+
+ .section .text.__part1,"ax",@progbits
+foo.__part.1:
+ nop
+.Lfoo_inl.__part.1:
+ nop
+ .type look_me_up,@function
+look_me_up:
+ nop
+.Lfoo_inl.__part.1_end:
+ nop
+.Lfoo.__part.1_end:
+ .size foo.__part.1, .Lfoo.__part.1_end-foo.__part.1
+
+
+ .section .debug_abbrev,"",@progbits
+ .byte 1 # Abbreviation Code
+ .byte 17 # DW_TAG_compile_unit
+ .byte 1 # DW_CHILDREN_yes
+ .byte 37 # DW_AT_producer
+ .byte 8 # DW_FORM_string
+ .byte 19 # DW_AT_language
+ .byte 5 # DW_FORM_data2
+ .byte 17 # DW_AT_low_pc
+ .byte 1 # DW_FORM_addr
+ .byte 85 # DW_AT_ranges
+ .byte 35 # DW_FORM_rnglistx
+ .byte 116 # DW_AT_rnglists_base
+ .byte 23 # DW_FORM_sec_offset
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 2 # Abbreviation Code
+ .byte 46 # DW_TAG_subprogram
+ .byte 0 # DW_CHILDREN_no
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 3 # Abbreviation Code
+ .byte 46 # DW_TAG_subprogram
+ .byte 1 # DW_CHILDREN_yes
+ .byte 85 # DW_AT_ranges
+ .byte 35 # DW_FORM_rnglistx
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 4 # Abbreviation Code
+ .byte 29 # DW_TAG_inlined_subroutine
+ .byte 0 # DW_CHILDREN_no
+ .byte 85 # DW_AT_ranges
+ .byte 35 # DW_FORM_rnglistx
+ .byte 49 # DW_AT_abstract_origin
+ .byte 19 # DW_FORM_ref4
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 0 # EOM(3)
+
+ .section .debug_info,"",@progbits
+.Lcu_begin0:
+ .long .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
+.Ldebug_info_start0:
+ .short 5 # DWARF version number
+ .byte 1 # DWARF Unit Type
+ .byte 8 # Address Size (in bytes)
+ .long .debug_abbrev # Offset Into Abbrev. Section
+ .byte 1 # Abbrev DW_TAG_compile_unit
+ .asciz "Hand-written DWARF" # DW_AT_producer
+ .short 29 # DW_AT_language
+ .quad 0 # DW_AT_low_pc
+ .byte 1 # DW_AT_ranges
+ .long .Lrnglists_table_base0 # DW_AT_rnglists_base
+
+ .byte 3 # Abbrev DW_TAG_subprogram
+ .byte 2 # DW_AT_ranges
+ .asciz "bar" # DW_AT_name
+ .byte 0 # End Of Children Mark
+
+.Lfoo_inl_die:
+ .byte 2 # Abbrev DW_TAG_subprogram
+ .asciz "foo_inl" # DW_AT_name
+
+ .byte 3 # Abbrev DW_TAG_subprogram
+ .byte 0 # DW_AT_ranges
+ .asciz "foo" # DW_AT_name
+ .byte 4 # Abbrev DW_TAG_inlined_subroutine
+ .byte 3 # DW_AT_ranges
+ .long .Lfoo_inl_die-.Lcu_begin0 # DW_AT_abstract_origin
+ .byte 0 # End Of Children Mark
+
+ .byte 0 # End Of Children Mark
+.Ldebug_info_end0:
+
+ .section .debug_rnglists,"",@progbits
+ .long .Ldebug_list_header_end0-.Ldebug_list_header_start0 # Length
+.Ldebug_list_header_start0:
+ .short 5 # Version
+ .byte 8 # Address size
+ .byte 0 # Segment selector size
+ .long 4 # Offset entry count
+.Lrnglists_table_base0:
+ .long .Ldebug_ranges0-.Lrnglists_table_base0
+ .long .Ldebug_ranges1-.Lrnglists_table_base0
+ .long .Ldebug_ranges2-.Lrnglists_table_base0
+ .long .Ldebug_ranges3-.Lrnglists_table_base0
+.Ldebug_ranges0:
+ .byte 6 # DW_RLE_start_end
+ .quad foo
+ .quad .Lfoo_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.1
+ .quad .Lfoo.__part.1_end
+ .byte 0 # DW_RLE_end_of_list
+.Ldebug_ranges1:
+ .byte 6 # DW_RLE_start_end
+ .quad bar
+ .quad .Lbar_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.1
+ .quad .Lfoo.__part.1_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo
+ .quad .Lfoo_end
+ .byte 0 # DW_RLE_end_of_list
+.Ldebug_ranges2:
+ .byte 6 # DW_RLE_start_end
+ .quad bar
+ .quad .Lbar_end
+ .byte 0 # DW_RLE_end_of_list
+.Ldebug_ranges3:
+ .byte 6 # DW_RLE_start_end
+ .quad .Lfoo_inl
+ .quad .Lfoo_inl_end
+ .byte 6 # DW_RLE_start_end
+ .quad .Lfoo_inl.__part.1
+ .quad .Lfoo_inl.__part.1_end
+ .byte 0 # DW_RLE_end_of_list
+.Ldebug_list_header_end0:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, I read the patch quickly yesterday without reading the description and was a little confused. Read the description and seeing the test case, I see what this is doing. Looks good to me. I didn't realize the discontiguous regions would be split into different sections with this codegen pattern.
Yeah, in my previous patches, the code was all in one section as it wasn't relevant for the test, but the way this really works is that every part of the function gets its own subsection (on MachO, I guess you'd use subsections for that). That way the linker doesn't need to do anything special -- for all it cares, the individual chunks are just separate functions (with strange calling conventions). |
…nctions This is a followup to llvm#122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in llvm#116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).
…nctions (#124931) This is a followup to #122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in #116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).
…nctions (llvm#124931) This is a followup to llvm#122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in llvm#116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).
…nctions (llvm#124931) This is a followup to llvm#122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in llvm#116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).
…nctions (llvm#124931) This is a followup to llvm#122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in llvm#116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).
Continuing the theme from llvm#116777 and llvm#124931, this patch ensures we compute the correct address when a functions is spread across multiple sections. Due to this, it's not sufficient to adjust the offset in the section+offset pair (Address::Slide). We must actually slide the file offset and then recompute the section using the result.
#137955) Continuing the theme from #116777 and #124931, this patch ensures we compute the correct address when a functions is spread across multiple sections. Due to this, it's not sufficient to adjust the offset in the section+offset pair (Address::Slide). We must actually slide the file offset and then recompute the section using the result. I found this out due to a failure to disassemble some parts of the function, so I'm testing with that, although it's likely there are other things that were broken due to this.
llvm#137955) Continuing the theme from llvm#116777 and llvm#124931, this patch ensures we compute the correct address when a functions is spread across multiple sections. Due to this, it's not sufficient to adjust the offset in the section+offset pair (Address::Slide). We must actually slide the file offset and then recompute the section using the result. I found this out due to a failure to disassemble some parts of the function, so I'm testing with that, although it's likely there are other things that were broken due to this.
The problem here is the assumption that the entire function will be placed in a single section. This will ~never be the case for a discontinuous function, as the point of splitting the function is to let the linker group parts of the function according to their "hotness".
The fix is to change the offset computation to use file addresses instead.