[lldb] Fix inline function resolution for discontinuous functions #116777

labath · 2024-11-19T10:01:58Z

The problem here is the assumption that the entire function will be placed in a single section. This will ~never be the case for a discontinuous function, as the point of splitting the function is to let the linker group parts of the function according to their "hotness".

The fix is to change the offset computation to use file addresses instead.

The problem here is the assumption that the entire function will be placed in a single section. This will ~never be the case for a discontinuous function, as the point of splitting the function is to let the linker group parts of the function according to their "hotness". The fix is to change the offset computation to use file addresses instead.

llvmbot · 2024-11-19T10:02:37Z

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

The problem here is the assumption that the entire function will be placed in a single section. This will ~never be the case for a discontinuous function, as the point of splitting the function is to let the linker group parts of the function according to their "hotness".

The fix is to change the offset computation to use file addresses instead.

Full diff: https://github.com/llvm/llvm-project/pull/116777.diff

3 Files Affected:

(modified) lldb/source/Symbol/Block.cpp (+10-9)
(modified) lldb/source/Symbol/SymbolContext.cpp (+6-4)
(added) lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s (+167)

diff --git a/lldb/source/Symbol/Block.cpp b/lldb/source/Symbol/Block.cpp
index 5c7772a6db780d..8746a25e3fad5a 100644
--- a/lldb/source/Symbol/Block.cpp
+++ b/lldb/source/Symbol/Block.cpp
@@ -252,19 +252,20 @@ bool Block::GetRangeContainingAddress(const Address &addr,
   Function *function = CalculateSymbolContextFunction();
   if (function) {
     const AddressRange &func_range = function->GetAddressRange();
-    if (addr.GetSection() == func_range.GetBaseAddress().GetSection()) {
-      const addr_t addr_offset = addr.GetOffset();
-      const addr_t func_offset = func_range.GetBaseAddress().GetOffset();
-      if (addr_offset >= func_offset &&
-          addr_offset < func_offset + func_range.GetByteSize()) {
-        addr_t offset = addr_offset - func_offset;
+    if (addr.GetModule() == func_range.GetBaseAddress().GetModule()) {
+      const addr_t file_addr = addr.GetFileAddress();
+      const addr_t func_file_addr =
+          func_range.GetBaseAddress().GetFileAddress();
+      if (file_addr >= func_file_addr &&
+          file_addr < func_file_addr + func_range.GetByteSize()) {
+        addr_t offset = file_addr - func_file_addr;
 
         const Range *range_ptr = m_ranges.FindEntryThatContains(offset);
 
         if (range_ptr) {
-          range.GetBaseAddress() = func_range.GetBaseAddress();
-          range.GetBaseAddress().SetOffset(func_offset +
-                                           range_ptr->GetRangeBase());
+          range.GetBaseAddress() =
+              Address(func_file_addr + range_ptr->GetRangeBase(),
+                      addr.GetModule()->GetSectionList());
           range.SetByteSize(range_ptr->GetByteSize());
           return true;
         }
diff --git a/lldb/source/Symbol/SymbolContext.cpp b/lldb/source/Symbol/SymbolContext.cpp
index de083e81206e2a..1a291ca3c0ea7a 100644
--- a/lldb/source/Symbol/SymbolContext.cpp
+++ b/lldb/source/Symbol/SymbolContext.cpp
@@ -102,10 +102,11 @@ bool SymbolContext::DumpStopContext(
         s->PutCStringColorHighlighted(name.GetStringRef(), settings);
     }
 
-    if (addr.IsValid()) {
+    if (addr_t file_addr = addr.GetFileAddress();
+        file_addr != LLDB_INVALID_ADDRESS) {
       const addr_t function_offset =
-          addr.GetOffset() -
-          function->GetAddressRange().GetBaseAddress().GetOffset();
+          file_addr -
+          function->GetAddressRange().GetBaseAddress().GetFileAddress();
       if (!show_function_name) {
         // Print +offset even if offset is 0
         dumped_something = true;
@@ -126,7 +127,8 @@ bool SymbolContext::DumpStopContext(
       lldb_private::AddressRange block_range;
       if (inlined_block->GetRangeContainingAddress(addr, block_range)) {
         const addr_t inlined_function_offset =
-            addr.GetOffset() - block_range.GetBaseAddress().GetOffset();
+            addr.GetFileAddress() -
+            block_range.GetBaseAddress().GetFileAddress();
         if (inlined_function_offset) {
           s->Printf(" + %" PRIu64, inlined_function_offset);
         }
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s b/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s
new file mode 100644
index 00000000000000..399f4e4db5b2f1
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s
@@ -0,0 +1,167 @@
+## Test that inline function resolution works when the function has been split
+## into multiple discontinuous parts (and those parts are placed in different
+## sections)
+
+# RUN: llvm-mc -triple x86_64-pc-linux -filetype=obj %s -o %t
+# RUN: %lldb %t -o "image lookup -v -n look_me_up" -o exit | FileCheck %s
+
+# CHECK:      1 match found in {{.*}}
+# CHECK:      Summary: {{.*}}`foo + 6 [inlined] foo_inl + 1
+# CHECK-NEXT:          {{.*}}`foo + 5
+# CHECK:      Blocks: id = {{.*}}, ranges = [0x00000000-0x00000003)[0x00000004-0x00000008)
+# CHECK-NEXT:         id = {{.*}}, ranges = [0x00000001-0x00000002)[0x00000005-0x00000007), name = "foo_inl"
+
+        .text
+
+        .type   foo,@function
+foo:
+        nop
+.Lfoo_inl:
+        nop
+.Lfoo_inl_end:
+        nop
+.Lfoo_end:
+        .size   foo, .Lfoo_end-foo
+
+bar:
+        nop
+.Lbar_end:
+        .size   bar, .Lbar_end-bar
+
+        .section        .text.__part1,"ax",@progbits
+foo.__part.1:
+        nop
+.Lfoo_inl.__part.1:
+        nop
+        .type   look_me_up,@function
+look_me_up:
+        nop
+.Lfoo_inl.__part.1_end:
+        nop
+.Lfoo.__part.1_end:
+        .size   foo.__part.1, .Lfoo.__part.1_end-foo.__part.1
+
+
+        .section        .debug_abbrev,"",@progbits
+        .byte   1                               # Abbreviation Code
+        .byte   17                              # DW_TAG_compile_unit
+        .byte   1                               # DW_CHILDREN_yes
+        .byte   37                              # DW_AT_producer
+        .byte   8                               # DW_FORM_string
+        .byte   19                              # DW_AT_language
+        .byte   5                               # DW_FORM_data2
+        .byte   17                              # DW_AT_low_pc
+        .byte   1                               # DW_FORM_addr
+        .byte   85                              # DW_AT_ranges
+        .byte   35                              # DW_FORM_rnglistx
+        .byte   116                             # DW_AT_rnglists_base
+        .byte   23                              # DW_FORM_sec_offset
+        .byte   0                               # EOM(1)
+        .byte   0                               # EOM(2)
+        .byte   2                               # Abbreviation Code
+        .byte   46                              # DW_TAG_subprogram
+        .byte   0                               # DW_CHILDREN_no
+        .byte   3                               # DW_AT_name
+        .byte   8                               # DW_FORM_string
+        .byte   0                               # EOM(1)
+        .byte   0                               # EOM(2)
+        .byte   3                               # Abbreviation Code
+        .byte   46                              # DW_TAG_subprogram
+        .byte   1                               # DW_CHILDREN_yes
+        .byte   85                              # DW_AT_ranges
+        .byte   35                              # DW_FORM_rnglistx
+        .byte   3                               # DW_AT_name
+        .byte   8                               # DW_FORM_string
+        .byte   0                               # EOM(1)
+        .byte   0                               # EOM(2)
+        .byte   4                               # Abbreviation Code
+        .byte   29                              # DW_TAG_inlined_subroutine
+        .byte   0                               # DW_CHILDREN_no
+        .byte   85                              # DW_AT_ranges
+        .byte   35                              # DW_FORM_rnglistx
+        .byte   49                              # DW_AT_abstract_origin
+        .byte   19                              # DW_FORM_ref4
+        .byte   0                               # EOM(1)
+        .byte   0                               # EOM(2)
+        .byte   0                               # EOM(3)
+
+        .section        .debug_info,"",@progbits
+.Lcu_begin0:
+        .long   .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
+.Ldebug_info_start0:
+        .short  5                               # DWARF version number
+        .byte   1                               # DWARF Unit Type
+        .byte   8                               # Address Size (in bytes)
+        .long   .debug_abbrev                   # Offset Into Abbrev. Section
+        .byte   1                               # Abbrev DW_TAG_compile_unit
+        .asciz  "Hand-written DWARF"            # DW_AT_producer
+        .short  29                              # DW_AT_language
+        .quad   0                               # DW_AT_low_pc
+        .byte   1                               # DW_AT_ranges
+        .long   .Lrnglists_table_base0          # DW_AT_rnglists_base
+
+        .byte   3                               # Abbrev DW_TAG_subprogram
+        .byte   2                               # DW_AT_ranges
+        .asciz  "bar"                           # DW_AT_name
+        .byte   0                               # End Of Children Mark
+
+.Lfoo_inl_die:
+        .byte   2                               # Abbrev DW_TAG_subprogram
+        .asciz  "foo_inl"                       # DW_AT_name
+
+        .byte   3                               # Abbrev DW_TAG_subprogram
+        .byte   0                               # DW_AT_ranges
+        .asciz  "foo"                           # DW_AT_name
+        .byte   4                               # Abbrev DW_TAG_inlined_subroutine
+        .byte   3                               # DW_AT_ranges
+        .long   .Lfoo_inl_die-.Lcu_begin0       # DW_AT_abstract_origin
+        .byte   0                               # End Of Children Mark
+
+        .byte   0                               # End Of Children Mark
+.Ldebug_info_end0:
+
+        .section        .debug_rnglists,"",@progbits
+        .long   .Ldebug_list_header_end0-.Ldebug_list_header_start0 # Length
+.Ldebug_list_header_start0:
+        .short  5                               # Version
+        .byte   8                               # Address size
+        .byte   0                               # Segment selector size
+        .long   4                               # Offset entry count
+.Lrnglists_table_base0:
+        .long   .Ldebug_ranges0-.Lrnglists_table_base0
+        .long   .Ldebug_ranges1-.Lrnglists_table_base0
+        .long   .Ldebug_ranges2-.Lrnglists_table_base0
+        .long   .Ldebug_ranges3-.Lrnglists_table_base0
+.Ldebug_ranges0:
+        .byte   6                               # DW_RLE_start_end
+        .quad   foo
+        .quad   .Lfoo_end
+        .byte   6                               # DW_RLE_start_end
+        .quad   foo.__part.1
+        .quad   .Lfoo.__part.1_end
+        .byte   0                               # DW_RLE_end_of_list
+.Ldebug_ranges1:
+        .byte   6                               # DW_RLE_start_end
+        .quad   bar
+        .quad   .Lbar_end
+        .byte   6                               # DW_RLE_start_end
+        .quad   foo.__part.1
+        .quad   .Lfoo.__part.1_end
+        .byte   6                               # DW_RLE_start_end
+        .quad   foo
+        .quad   .Lfoo_end
+        .byte   0                               # DW_RLE_end_of_list
+.Ldebug_ranges2:
+        .byte   6                               # DW_RLE_start_end
+        .quad   bar
+        .quad   .Lbar_end
+        .byte   0                               # DW_RLE_end_of_list
+.Ldebug_ranges3:
+        .byte   6                               # DW_RLE_start_end
+        .quad   .Lfoo_inl
+        .quad   .Lfoo_inl_end
+        .byte   6                               # DW_RLE_start_end
+        .quad   .Lfoo_inl.__part.1
+        .quad   .Lfoo_inl.__part.1_end
+        .byte   0                               # DW_RLE_end_of_list
+.Ldebug_list_header_end0:

jasonmolenda

Sorry for the delay, I read the patch quickly yesterday without reading the description and was a little confused. Read the description and seeing the test case, I see what this is doing. Looks good to me. I didn't realize the discontiguous regions would be split into different sections with this codegen pattern.

labath · 2024-11-22T07:32:35Z

Yeah, in my previous patches, the code was all in one section as it wasn't relevant for the test, but the way this really works is that every part of the function gets its own subsection (on MachO, I guess you'd use subsections for that). That way the linker doesn't need to do anything special -- for all it cares, the individual chunks are just separate functions (with strange calling conventions).

…nctions This is a followup to llvm#122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in llvm#116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).

…nctions (#124931) This is a followup to #122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in #116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).

…nctions (llvm#124931) This is a followup to llvm#122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in llvm#116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).

Continuing the theme from llvm#116777 and llvm#124931, this patch ensures we compute the correct address when a functions is spread across multiple sections. Due to this, it's not sufficient to adjust the offset in the section+offset pair (Address::Slide). We must actually slide the file offset and then recompute the section using the result.

#137955) Continuing the theme from #116777 and #124931, this patch ensures we compute the correct address when a functions is spread across multiple sections. Due to this, it's not sufficient to adjust the offset in the section+offset pair (Address::Slide). We must actually slide the file offset and then recompute the section using the result. I found this out due to a failure to disassemble some parts of the function, so I'm testing with that, although it's likely there are other things that were broken due to this.

llvm#137955) Continuing the theme from llvm#116777 and llvm#124931, this patch ensures we compute the correct address when a functions is spread across multiple sections. Due to this, it's not sufficient to adjust the offset in the section+offset pair (Address::Slide). We must actually slide the file offset and then recompute the section using the result. I found this out due to a failure to disassemble some parts of the function, so I'm testing with that, although it's likely there are other things that were broken due to this.

labath requested a review from jasonmolenda November 19, 2024 10:01

labath requested a review from JDevlieghere as a code owner November 19, 2024 10:01

llvmbot added the lldb label Nov 19, 2024

jasonmolenda approved these changes Nov 20, 2024

View reviewed changes

labath merged commit d71fa33 into llvm:main Nov 22, 2024
9 checks passed

labath deleted the unwind-inline branch November 22, 2024 07:32

labath mentioned this pull request Jan 29, 2025

[lldb] Fix Block::GetRangeIndexContainingAddress for discontinuous functions #124931

Merged

labath mentioned this pull request Apr 30, 2025

[lldb] Fix block address resolution for functions in multiple sections #137955

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[lldb] Fix inline function resolution for discontinuous functions #116777

[lldb] Fix inline function resolution for discontinuous functions #116777

Uh oh!

labath commented Nov 19, 2024

Uh oh!

llvmbot commented Nov 19, 2024

Uh oh!

jasonmolenda left a comment

Uh oh!

labath commented Nov 22, 2024

Uh oh!

Uh oh!

Uh oh!

[lldb] Fix inline function resolution for discontinuous functions #116777

[lldb] Fix inline function resolution for discontinuous functions #116777

Uh oh!

Conversation

labath commented Nov 19, 2024

Uh oh!

llvmbot commented Nov 19, 2024

Uh oh!

jasonmolenda left a comment

Choose a reason for hiding this comment

Uh oh!

labath commented Nov 22, 2024

Uh oh!

Uh oh!

Uh oh!