Description
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2322754
Fedora rawhide has a new version of binutils that ends up splitting NOTEs into separate LOAD segments, and one of those has an extra 0x10000 memory offset relative to its file offset. This isn't accounted for in __llvm_write_binary_ids, and on aarch64 it ends up trying to read between segments, which raises SIGSEGV. I believe the code is wrong regardless of arch though.
Steps to Reproduce:
echo 'int main() {}' >main.c
clang -fprofile-instr-generate -fcoverage-mapping main.c -o main
./main
With the old binutils, readelf -Wl would show program headers like this:
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R   0x8
  INTERP         0x00027c 0x000000000040027c 0x000000000040027c 0x00001b 0x00001b R   0x1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0093b8 0x0093b8 R E 0x10000
  LOAD           0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x006510 0x008788 RW  0x10000
  DYNAMIC        0x00fd38 0x000000000041fd38 0x000000000041fd38 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x000238 0x0000000000400238 0x0000000000400238 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x0078c4 0x00000000004078c4 0x00000000004078c4 0x000474 0x000474 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x000308 0x000308 R   0x1
After the update, the instrumented program crashes in __llvm_write_binary_ids. The headers look like:
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x000230 0x000230 R   0x8
  INTERP         0x000294 0x0000000000400294 0x0000000000400294 0x00001b 0x00001b R   0x1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0093f8 0x0093f8 R E 0x10000
  LOAD           0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x0087a8 0x0087a8 RW  0x10000
  DYNAMIC        0x00fd38 0x000000000041fd38 0x000000000041fd38 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x000270 0x0000000000400270 0x0000000000400270 0x000024 0x000024 R   0x4
  NOTE           0x018480 0x0000000000428480 0x0000000000428480 0x000020 0x000020 R   0x4
  GNU_EH_FRAME   0x007904 0x0000000000407904 0x0000000000407904 0x000474 0x000474 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x000308 0x000308 R   0x1
It crashes because the following line computes the address of the second note as "Note = 0x400000 + 0x018480 = 0x418480", which falls between the two LOAD segments, while the note is actually loaded at 0x428480.
llvm-project/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
Lines 218 to 219 in f54cdc5
That code is in the first branch of an if-else, which according to the comments is intended for inspecting files, while the else branch is for inspecting memory. However, the memsz == filesz condition that selects the first branch is met here even though we are inspecting memory. The else branch wouldn't compute the right address either, because ElfHeader + vaddr would double-count the 0x400000 base address. That would probably be right for PIE or SOs though.
llvm-project/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
Lines 227 to 228 in f54cdc5
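To make the arithmetic concrete, here is a small C sketch that plugs the values from the second readelf listing into both formulas. The function names are hypothetical and only mirror the two expressions described above (ElfHeader + offset and ElfHeader + vaddr); this is not the compiler-rt code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: what the "file" branch effectively computes,
 * i.e. address of the ELF header plus the note's file offset. */
static uintptr_t note_addr_file_branch(uintptr_t elf_header, uintptr_t p_offset) {
  return elf_header + p_offset;
}

/* Hypothetical helper: what the "memory" branch effectively computes,
 * i.e. address of the ELF header plus the note's virtual address. */
static uintptr_t note_addr_memory_branch(uintptr_t elf_header, uintptr_t p_vaddr) {
  return elf_header + p_vaddr;
}

/* Returns nonzero if addr lies inside the segment [vaddr, vaddr + memsz). */
static int in_segment(uintptr_t addr, uintptr_t vaddr, uintptr_t memsz) {
  assert(memsz > 0);
  return addr >= vaddr && addr - vaddr < memsz;
}
```

With ElfHeader at 0x400000, the file branch gives 0x418480 for the second note, which in_segment rejects for both LOAD segments (0x400000 with size 0x0093f8, and 0x41fcf8 with size 0x0087a8). The memory branch gives 0x828480, which is also unmapped. The real address 0x428480 lies inside the second LOAD segment.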
I reproduced this using Fedora's clang build, but the code in question hasn't changed since 2021 in commit f261e25.
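For illustration only, one way to get an address that works for both non-PIE and PIE/SO cases is to derive a load bias from the LOAD segment that maps file offset 0 and add it to the note's vaddr. The sketch below is an assumption about a possible approach, not a proposed patch: SketchPhdr and both functions are hypothetical stand-ins, not the ElfW(Phdr) type or any compiler-rt API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for a program header entry. */
typedef struct {
  uint32_t type;   /* segment type, using the standard ELF values below */
  uint64_t offset; /* file offset of the segment */
  uint64_t vaddr;  /* link-time virtual address of the segment */
  uint64_t filesz; /* size of the segment in the file */
} SketchPhdr;

enum { SKETCH_PT_LOAD = 1, SKETCH_PT_NOTE = 4 }; /* same values as PT_LOAD, PT_NOTE */

/* Load bias = runtime address of the ELF header minus the link-time
 * vaddr of the LOAD segment that maps file offset 0. */
static uint64_t sketch_load_bias(uint64_t elf_header, const SketchPhdr *ph, size_t n) {
  assert(ph != NULL);
  for (size_t i = 0; i < n; i++) {
    if (ph[i].type == SKETCH_PT_LOAD && ph[i].offset == 0)
      return elf_header - ph[i].vaddr;
  }
  return 0; /* assumption: with no such segment, treat the image as unbiased */
}

/* Runtime address of the note at index note_index: bias + vaddr. */
static uint64_t sketch_note_addr(uint64_t elf_header, const SketchPhdr *ph,
                                 size_t n, size_t note_index) {
  assert(note_index < n && ph[note_index].type == SKETCH_PT_NOTE);
  return sketch_load_bias(elf_header, ph, n) + ph[note_index].vaddr;
}
```

For the non-PIE binary in this report the bias is 0, so the second note resolves to 0x428480. For a PIE whose first LOAD has vaddr 0, the bias equals the ElfHeader address, which matches the observation that ElfHeader + vaddr would be right in that case.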