Skip to content

[lldb] Correctly resolve (discontinuous) function offsets when disassembling #126925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 13, 2025

Conversation

labath
Copy link
Collaborator

@labath labath commented Feb 12, 2025

We need to iterate through the all symbol context ranges returned by (since #126505) SymbolContext::GetAddressRange. This also includes a fix to print the function offsets as signed values.

I've also wanted to check that the addresses which are in the middle of the function do not resolve to the function, but that's not entirely the case right now. This appears to be a separate issue though, so I've just left a TODO for now.

…embling

We need to iterate through the all symbol context ranges returned by
(since llvm#126505) SymbolContext::GetAddressRange. This also includes a fix
to print the function offsets as signed values.

I've also wanted to check that the addresses which are in the middle of
the function do *not* resolve to the function, but that's not entirely
the case right now. This appears to be a separate issue though, so I've
just left a TODO for now.
@llvmbot
Copy link
Member

llvmbot commented Feb 12, 2025

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

We need to iterate through the all symbol context ranges returned by (since #126505) SymbolContext::GetAddressRange. This also includes a fix to print the function offsets as signed values.

I've also wanted to check that the addresses which are in the middle of the function do not resolve to the function, but that's not entirely the case right now. This appears to be a separate issue though, so I've just left a TODO for now.


Full diff: https://github.com/llvm/llvm-project/pull/126925.diff

3 Files Affected:

  • (modified) lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp (+8-4)
  • (modified) lldb/source/Symbol/SymbolContext.cpp (+9-5)
  • (modified) lldb/test/Shell/Commands/command-disassemble.s (+8-5)
diff --git a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
index 76f2db086476f..a77155f6bf41e 100644
--- a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
+++ b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
@@ -31,6 +31,7 @@
 
 #include "lldb/Core/Address.h"
 #include "lldb/Core/Module.h"
+#include "lldb/Symbol/Function.h"
 #include "lldb/Symbol/SymbolContext.h"
 #include "lldb/Target/ExecutionContext.h"
 #include "lldb/Target/Process.h"
@@ -1806,10 +1807,13 @@ const char *DisassemblerLLVMC::SymbolLookup(uint64_t value, uint64_t *type_ptr,
         bool format_omitting_current_func_name = false;
         if (sym_ctx.symbol || sym_ctx.function) {
           AddressRange range;
-          if (sym_ctx.GetAddressRange(resolve_scope, 0, false, range) &&
-              range.GetBaseAddress().IsValid() &&
-              range.ContainsLoadAddress(value_so_addr, target)) {
-            format_omitting_current_func_name = true;
+          for (uint32_t idx = 0;
+               sym_ctx.GetAddressRange(resolve_scope, idx, false, range);
+               ++idx) {
+            if (range.ContainsLoadAddress(value_so_addr, target)) {
+              format_omitting_current_func_name = true;
+              break;
+            }
           }
         }
 
diff --git a/lldb/source/Symbol/SymbolContext.cpp b/lldb/source/Symbol/SymbolContext.cpp
index 4725df52ff559..183947a694363 100644
--- a/lldb/source/Symbol/SymbolContext.cpp
+++ b/lldb/source/Symbol/SymbolContext.cpp
@@ -104,15 +104,19 @@ bool SymbolContext::DumpStopContext(
 
     if (addr_t file_addr = addr.GetFileAddress();
         file_addr != LLDB_INVALID_ADDRESS) {
-      const addr_t function_offset =
-          file_addr - function->GetAddress().GetFileAddress();
+      // Avoiding signed arithmetic due to UB in -INT_MAX.
+      const char sign =
+          file_addr >= function->GetAddress().GetFileAddress() ? '+' : '-';
+      addr_t offset = file_addr - function->GetAddress().GetFileAddress();
+      if (sign == '-')
+        offset = -offset;
       if (!show_function_name) {
         // Print +offset even if offset is 0
         dumped_something = true;
-        s->Printf("+%" PRIu64 ">", function_offset);
-      } else if (function_offset) {
+        s->Format("{0}{1}>", sign, offset);
+      } else if (offset) {
         dumped_something = true;
-        s->Printf(" + %" PRIu64, function_offset);
+        s->Format(" {0} {1}", sign, offset);
       }
     }
 
diff --git a/lldb/test/Shell/Commands/command-disassemble.s b/lldb/test/Shell/Commands/command-disassemble.s
index 951d96cefd4b9..eb84a9ce39d4a 100644
--- a/lldb/test/Shell/Commands/command-disassemble.s
+++ b/lldb/test/Shell/Commands/command-disassemble.s
@@ -84,7 +84,7 @@
 # CHECK-NEXT: command-disassemble.s.tmp[0x2044] <+0>: int    $0x32
 # CHECK-NEXT: warning: Not disassembling a function because it is very large [0x0000000000002046-0x0000000000004046). To disassemble specify an instruction count limit, start/stop addresses or use the --force option.
 # CHECK-NEXT: (lldb) disassemble --name case3
-# CHECK-NEXT: error: Not disassembling a function because it is very large [0x0000000000006046-0x0000000000007046)[0x0000000000009046-0x000000000000a046). To disassemble specify an instruction count limit, start/stop addresses or use the --force option.
+# CHECK-NEXT: error: Not disassembling a function because it is very large [0x0000000000006046-0x0000000000007046)[0x0000000000009046-0x000000000000a050). To disassemble specify an instruction count limit, start/stop addresses or use the --force option.
 # CHECK-NEXT: Not disassembling a function because it is very large [0x0000000000004046-0x0000000000006046). To disassemble specify an instruction count limit, start/stop addresses or use the --force option.
 # CHECK-NEXT: (lldb) disassemble --name case3 --count 3
 # CHECK-NEXT: command-disassemble.s.tmp`n2::case3:
@@ -93,9 +93,10 @@
 # CHECK-NEXT: command-disassemble.s.tmp[0x604a] <-12284>: int    $0x2a
 # CHECK-EMPTY:
 # CHECK-NEXT: command-disassemble.s.tmp`n2::case3:
-# CHECK-NEXT: command-disassemble.s.tmp[0x9046] <+0>: int    $0x2a
-# CHECK-NEXT: command-disassemble.s.tmp[0x9048] <+2>: int    $0x2a
-# CHECK-NEXT: command-disassemble.s.tmp[0x904a] <+4>: int    $0x2a
+# CHECK-NEXT: command-disassemble.s.tmp[0x9046] <+0>: jmp 0x6046 ; <-12288>
+## FIXME: This should resolve to `middle_of_case3`
+# CHECK-NEXT: command-disassemble.s.tmp[0x904b] <+5>: jmp 0x7046 ; n2::case3 - 8192
+# CHECK-NEXT: command-disassemble.s.tmp[0x9050] <+10>: int    $0x2a
 # CHECK-EMPTY:
 # CHECK-NEXT: command-disassemble.s.tmp`n1::case3:
 # CHECK-NEXT: command-disassemble.s.tmp[0x4046] <+0>: int    $0x2a
@@ -171,12 +172,14 @@ _ZN2n15case3Ev:
         .endr
 .L_ZN2n25case3Ev.__part.1_end:
 
-.Lpadding:
+middle_of_case3:
         .rept 0x1000
         int $42
         .endr
 
 _ZN2n25case3Ev:
+        jmp .L_ZN2n25case3Ev.__part.1
+        jmp middle_of_case3
         .rept 0x800
         int $42
         .endr

Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks straightforward.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@labath labath merged commit 499d6da into llvm:main Feb 13, 2025
9 checks passed
@labath labath deleted the disasm branch February 13, 2025 10:30
flovent pushed a commit to flovent/llvm-project that referenced this pull request Feb 13, 2025
…embling (llvm#126925)

We need to iterate through the all symbol context ranges returned by
(since llvm#126505) SymbolContext::GetAddressRange. This also includes a fix
to print the function offsets as signed values.

I've also wanted to check that the addresses which are in the middle of
the function do *not* resolve to the function, but that's not entirely
the case right now. This appears to be a separate issue though, so I've
just left a TODO for now.
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025
…embling (llvm#126925)

We need to iterate through the all symbol context ranges returned by
(since llvm#126505) SymbolContext::GetAddressRange. This also includes a fix
to print the function offsets as signed values.

I've also wanted to check that the addresses which are in the middle of
the function do *not* resolve to the function, but that's not entirely
the case right now. This appears to be a separate issue though, so I've
just left a TODO for now.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…embling (llvm#126925)

We need to iterate through the all symbol context ranges returned by
(since llvm#126505) SymbolContext::GetAddressRange. This also includes a fix
to print the function offsets as signed values.

I've also wanted to check that the addresses which are in the middle of
the function do *not* resolve to the function, but that's not entirely
the case right now. This appears to be a separate issue though, so I've
just left a TODO for now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants