
msan + libunwind fails to resolve stack pthread_cond_signal -> thread terminate #128621

Open
@grooverdan

This looks possibly related to #119437, but I can't tell for certain.

my ref: https://jira.mariadb.org/browse/MDBF-793

This is not minimized. When one thread calls pthread_cond_signal to wake a different thread blocked in pthread_cond_wait and then exits, the result is a SEGV with an unresolvable stack at the point at which the thread switches.

Removing the MSan-built libunwind library (so that the system libunwind is loaded instead) resulted in correct execution.

Info:

Debian clang version 20.1.0 (++20250221063039+dc1bd6a8fa6a-1~exp1~20250221183151.60)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-20/bin

code: https://github.com/MariaDB/server/blob/main/storage/maria/unittest/ma_pagecache_consist.c#L265
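
For readers without the MariaDB tree at hand, the signal-then-exit pattern described above can be sketched roughly as follows. This is an illustrative sketch only, not the linked test and not a confirmed reproducer: `LOCK_thread_count` appears in the gdb session below, but `COND_thread_count`, `thread_count`, `worker`, `run_workers`, and the use of detached threads are assumptions made for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* LOCK_thread_count is the name seen in the gdb session; the condition
   variable and counter names are assumed for this sketch. */
static pthread_mutex_t LOCK_thread_count = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t COND_thread_count = PTHREAD_COND_INITIALIZER;
static int thread_count = 0;

/* Hypothetical worker: report completion, wake the waiter, then exit. */
static void *worker(void *arg)
{
  (void) arg;
  pthread_mutex_lock(&LOCK_thread_count);
  thread_count--;
  /* Wake the thread blocked in pthread_cond_wait (main in this sketch). */
  pthread_cond_signal(&COND_thread_count);
  pthread_mutex_unlock(&LOCK_thread_count);
  /* Returning ends the thread; the crash in the report follows this point. */
  return NULL;
}

/* Start n detached workers and wait until all have signalled.
   Returns the number of workers still outstanding (0 on success). */
int run_workers(int n)
{
  pthread_attr_t attr;
  pthread_t tid;
  int i, remaining;

  pthread_attr_init(&attr);
  /* Detached threads are an assumption of this sketch. */
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

  pthread_mutex_lock(&LOCK_thread_count);
  for (i = 0; i < n; i++)
  {
    if (pthread_create(&tid, &attr, worker, NULL) == 0)
      thread_count++;
  }
  /* Loop guards against spurious wakeups. */
  while (thread_count > 0)
    pthread_cond_wait(&COND_thread_count, &LOCK_thread_count);
  remaining = thread_count;
  pthread_mutex_unlock(&LOCK_thread_count);

  pthread_attr_destroy(&attr);
  return remaining;
}
```

Calling `run_workers(6)` from a `main` and building with `clang -fsanitize=memory -fsanitize-memory-track-origins -g -pthread` would exercise the same sequence of calls. Whether this alone triggers the failure is untested, since the original test also runs the page cache code in each thread.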

Built and linked against the MSan-instrumented libraries:

buildbot@2f11790aa1e1:/build$ ldd ./storage/maria/unittest/ma_pagecache_consist_1kRD-t
	linux-vdso.so.1 (0x00007fb319378000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb313861000)
	libssl.so.3 => /msan-libs/libssl.so.3 (0x00007fb313753000)
	libcrypto.so.3 => /msan-libs/libcrypto.so.3 (0x00007fb3131c7000)
	libc++.so.1 => /msan-libs/libc++.so.1 (0x00007fb312f37000)
	libc++abi.so.1 => /msan-libs/libc++abi.so.1 (0x00007fb312e72000)
	libunwind.so.1 => /msan-libs/libunwind.so.1 (0x00007fb312e51000)
	libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fb312e40000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb312e20000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb312c3f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fb31937a000)

Straight run in gdb:

(gdb) r
Starting program: /build/storage/maria/unittest/ma_pagecache_consist_1kRD-t 
warning: Error disabling address space randomization: Function not implemented
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
1..6
[New Thread 0x7f993d2f96c0 (LWP 9545)]
[New Thread 0x7f993c7f46c0 (LWP 9546)]
[New Thread 0x7f993bcef6c0 (LWP 9547)]
[New Thread 0x7f993b1ea6c0 (LWP 9548)]
[New Thread 0x7f993a6e56c0 (LWP 9549)]
[New Thread 0x7f9939be06c0 (LWP 9550)]
ok 1 - reader5: done
[Thread 0x7f993d2f96c0 (LWP 9545) exited]
ok 2 - reader4: done
[Thread 0x7f993bcef6c0 (LWP 9547) exited]
ok 3 - reader1: done
[Thread 0x7f9939be06c0 (LWP 9550) exited]
ok 4 - reader3: done
[Thread 0x7f993b1ea6c0 (LWP 9548) exited]
ok 5 - reader2: done

Thread 6 "ma_pagecache_co" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f993a6e56c0 (LWP 9549)]
0x00007f994036f67c in _Unwind_Backtrace () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:134
134	/msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c: No such file or directory.
(gdb) bt
#0  0x00007f994036f67c in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:134
#1  0x000055562ed47bcb in __sanitizer::BufferedStackTrace::UnwindSlow(unsigned long, unsigned int) ()
#2  0x000055562ed4119d in __sanitizer::BufferedStackTrace::Unwind(unsigned int, unsigned long, unsigned long, void*, unsigned long, unsigned long, bool) ()
#3  0x000055562ed4ec95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
#4  0x000055562ed4e8bf in __msan::PrintWarningWithOrigin(unsigned long, unsigned long, unsigned int) ()
#5  0x000055562ed4f182 in __msan_warning_with_origin_noreturn ()
#6  0x00007f994035f565 in getReg () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:1353
#7  0x00007f994035fbef in setInfoBasedOnIPRegister ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:2561
#8  0x00007f994035b8fa in __unw_init_local () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/libunwind.cpp:91
#9  0x00007f994036f71d in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:137
#10 0x000055562ed47bcb in __sanitizer::BufferedStackTrace::UnwindSlow(unsigned long, unsigned int) ()
#11 0x000055562ed4119d in __sanitizer::BufferedStackTrace::Unwind(unsigned int, unsigned long, unsigned long, void*, unsigned long, unsigned long, bool) ()
#12 0x000055562ed4ec95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
#13 0x000055562ed4e8bf in __msan::PrintWarningWithOrigin(unsigned long, unsigned long, unsigned int) ()
#14 0x000055562ed4f182 in __msan_warning_with_origin_noreturn ()
#15 0x00007f994035f565 in getReg () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:1353
#16 0x00007f994035fbef in setInfoBasedOnIPRegister ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:2561
#17 0x00007f994035b8fa in __unw_init_local () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/libunwind.cpp:91
#18 0x00007f994036f71d in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:137
#19 0x000055562ed47bcb in __sanitizer::BufferedStackTrace::UnwindSlow(unsigned long, unsigned int) ()
#20 0x000055562ed4119d in __sanitizer::BufferedStackTrace::Unwind(unsigned int, unsigned long, unsigned long, void*, unsigned long, unsigned long, bool) ()
#21 0x000055562ed4ec95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
#22 0x000055562ed4e8bf in __msan::PrintWarningWithOrigin(unsigned long, unsigned long, unsigned int) ()
#23 0x000055562ed4f182 in __msan_warning_with_origin_noreturn ()
#24 0x00007f994035f565 in getReg () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:1353
#25 0x00007f994035fbef in setInfoBasedOnIPRegister ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:2561
#26 0x00007f994035b8fa in __unw_init_local () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/libunwind.cpp:91
#27 0x00007f994036f71d in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:137
#28 0x000055562ed47bcb in __sanitizer::BufferedStackTrace::UnwindSlow(unsigned long, unsigned int) ()
#29 0x000055562ed4119d in __sanitizer::BufferedStackTrace::Unwind(unsigned int, unsigned long, unsigned long, void*, unsigned long, unsigned long, bool) ()
#30 0x000055562ed4ec95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
#31 0x000055562ed4e8bf in __msan::PrintWarningWithOrigin(unsigned long, unsigned long, unsigned int) ()
#32 0x000055562ed4f182 in __msan_warning_with_origin_noreturn ()
#33 0x00007f994035f565 in getReg () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:1353
#34 0x00007f994035fbef in setInfoBasedOnIPRegister ()
....
#17067 0x000055562ed4ec95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
#17068 0x000055562ed4e8bf in __msan::PrintWarningWithOrigin(unsigned long, unsigned long, unsigned int) ()
#17069 0x000055562ed4f182 in __msan_warning_with_origin_noreturn ()
#17070 0x00007f994035f565 in getReg () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:1353
#17071 0x00007f994035fbef in setInfoBasedOnIPRegister ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:2561
#17072 0x00007f994035b8fa in __unw_init_local () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/libunwind.cpp:91
#17073 0x00007f994036f71d in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:137
#17074 0x000055562ed47bcb in __sanitizer::BufferedStackTrace::UnwindSlow(unsigned long, unsigned int) ()
#17075 0x000055562ed4119d in __sanitizer::BufferedStackTrace::Unwind(unsigned int, unsigned long, unsigned long, void*, unsigned long, unsigned long, bool) ()
#17076 0x000055562ed4ec95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
#17077 0x000055562ed55f2d in __interceptor_free ()
#17078 0x00007f9940e696df in _dl_deallocate_tls () from /lib64/ld-linux-x86-64.so.2
#17079 0x00007f99401cd29d in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#17080 0x00007f99401cd3ed in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#17081 0x00007f99401d0152 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#17082 0x00007f994025085c in ?? () from /lib/x86_64-linux-gnu/libc.so.6

Setting a breakpoint at the end of the thread function and single-stepping triggers the SEGV:

(gdb) r
Starting program: /build/storage/maria/unittest/ma_pagecache_consist_1kRD-t 
warning: Error disabling address space randomization: Function not implemented
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
1..6
[New Thread 0x7fa6d06f96c0 (LWP 9725)]
[New Thread 0x7fa6cfbf46c0 (LWP 9726)]
[New Thread 0x7fa6cf0ef6c0 (LWP 9727)]
[New Thread 0x7fa6ce5ea6c0 (LWP 9728)]
[New Thread 0x7fa6cdae56c0 (LWP 9729)]
[New Thread 0x7fa6ccfe06c0 (LWP 9730)]
ok 1 - reader1: done
[Switching to Thread 0x7fa6ccfe06c0 (LWP 9730)]

Thread 7 "ma_pagecache_co" hit Breakpoint 1, test_thread_reader (arg=0x701000000050) at /source/storage/maria/unittest/ma_pagecache_consist.c:284
284	  return 0;
(gdb) c
Continuing.
ok 2 - reader4: done
[Thread 0x7fa6ccfe06c0 (LWP 9730) exited]
[Switching to Thread 0x7fa6cf0ef6c0 (LWP 9727)]

Thread 4 "ma_pagecache_co" hit Breakpoint 1, test_thread_reader (arg=0x701000000020) at /source/storage/maria/unittest/ma_pagecache_consist.c:284
284	  return 0;
(gdb) c
Continuing.
[Thread 0x7fa6cf0ef6c0 (LWP 9727) exited]
ok 3 - reader5: done
[Switching to Thread 0x7fa6d06f96c0 (LWP 9725)]

Thread 2 "ma_pagecache_co" hit Breakpoint 1, test_thread_reader (arg=0x701000000000) at /source/storage/maria/unittest/ma_pagecache_consist.c:284
284	  return 0;
(gdb) c
Continuing.
[Thread 0x7fa6d06f96c0 (LWP 9725) exited]
ok 4 - reader2: done
ok 5 - reader3: done
[Switching to Thread 0x7fa6cdae56c0 (LWP 9729)]

Thread 6 "ma_pagecache_co" hit Breakpoint 1, test_thread_reader (arg=0x701000000040) at /source/storage/maria/unittest/ma_pagecache_consist.c:284
284	  return 0;
(gdb) thread apply all bt

Thread 6 (Thread 0x7fa6cdae56c0 (LWP 9729) "ma_pagecache_co"):
#0  test_thread_reader (arg=0x701000000040) at /source/storage/maria/unittest/ma_pagecache_consist.c:284
#1  0x00007fa6d344c1c4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fa6d34cc85c in ?? () from /lib/x86_64-linux-gnu/libc.so.6

Thread 5 (Thread 0x7fa6ce5ea6c0 (LWP 9728) "ma_pagecache_co"):
#0  test_thread_reader (arg=0x701000000030) at /source/storage/maria/unittest/ma_pagecache_consist.c:284
#1  0x00007fa6d344c1c4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fa6d34cc85c in ?? () from /lib/x86_64-linux-gnu/libc.so.6

Thread 3 (Thread 0x7fa6cfbf46c0 (LWP 9726) "ma_pagecache_co"):
#0  0x000055622e084d41 in check_page (buff=buff@entry=0x718000020000 "i", offset=1024, page_locked=page_locked@entry=1, page_no=page_no@entry=1, tag=tag@entry=1) at /source/storage/maria/unittest/ma_pagecache_consist.c:114
#1  0x000055622e085591 in writer (num=num@entry=1) at /source/storage/maria/unittest/ma_pagecache_consist.c:249
#2  0x000055622e0866cb in test_thread_writer (arg=0x701000000010) at /source/storage/maria/unittest/ma_pagecache_consist.c:296
#3  0x00007fa6d344c1c4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007fa6d34cc85c in ?? () from /lib/x86_64-linux-gnu/libc.so.6

Thread 1 (Thread 0x7fa6d33be740 (LWP 9722) "ma_pagecache_co"):
#0  0x00007fa6d3448f16 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fa6d344b5d8 in pthread_cond_wait () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x000055622e085f8f in main (argc=<optimized out>, argv=<optimized out>) at /source/storage/maria/unittest/ma_pagecache_consist.c:471
(gdb) p LOCK_thread_count
$1 = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 1, __kind = 3, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
  __size = '\000' <repeats 12 times>, "\001\000\000\000\003", '\000' <repeats 22 times>, __align = 0}
(gdb) watch -l LOCK_thread_count.__data.__owner 
Hardware watchpoint 2: -location LOCK_thread_count.__data.__owner
(gdb) s
[Switching to Thread 0x7fa6ce5ea6c0 (LWP 9728)]

Thread 5 "ma_pagecache_co" hit Breakpoint 1, test_thread_reader (arg=0x701000000030) at /source/storage/maria/unittest/ma_pagecache_consist.c:284
284	  return 0;
(gdb) s
[Thread 0x7fa6cdae56c0 (LWP 9729) exited]

Thread 5 "ma_pagecache_co" received signal SIGSEGV, Segmentation fault.
0x00007fa6d35eb67c in _Unwind_Backtrace () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:134
134	/msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c: No such file or directory.
(gdb) bt full
#0  0x00007fa6d35eb67c in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:134
No locals.
#1  0x000055622e00ebcb in __sanitizer::BufferedStackTrace::UnwindSlow(unsigned long, unsigned int) ()
No symbol table info available.
#2  0x000055622e00819d in __sanitizer::BufferedStackTrace::Unwind(unsigned int, unsigned long, unsigned long, void*, unsigned long, unsigned long, bool) ()
No symbol table info available.
#3  0x000055622e015c95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
No symbol table info available.
#4  0x000055622e0158bf in __msan::PrintWarningWithOrigin(unsigned long, unsigned long, unsigned int) ()
No symbol table info available.
#5  0x000055622e016182 in __msan_warning_with_origin_noreturn ()
No symbol table info available.
#6  0x00007fa6d35db565 in getReg () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:1353
No locals.
#7  0x00007fa6d35dbbef in setInfoBasedOnIPRegister ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:2561
No locals.
#8  0x00007fa6d35d78fa in __unw_init_local () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/libunwind.cpp:91
No locals.
#9  0x00007fa6d35eb71d in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:137
No locals.
#10 0x000055622e00ebcb in __sanitizer::BufferedStackTrace::UnwindSlow(unsigned long, unsigned int) ()
No symbol table info available.
#11 0x000055622e00819d in __sanitizer::BufferedStackTrace::Unwind(unsigned int, unsigned long, unsigned long, void*, unsigned long, unsigned long, bool) ()
No symbol table info available.
#12 0x000055622e015c95 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int) ()
No symbol table info available.
#13 0x000055622e0158bf in __msan::PrintWarningWithOrigin(unsigned long, unsigned long, unsigned int) ()
No symbol table info available.
#14 0x000055622e016182 in __msan_warning_with_origin_noreturn ()
No symbol table info available.
#15 0x00007fa6d35db565 in getReg () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:1353
No locals.
#16 0x00007fa6d35dbbef in setInfoBasedOnIPRegister ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindCursor.hpp:2561
No locals.
#17 0x00007fa6d35d78fa in __unw_init_local () at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/libunwind.cpp:91
No locals.
#18 0x00007fa6d35eb71d in _Unwind_Backtrace ()
    at /msan-build/llvm-toolchain-20-20.1.0~++20250221063039+dc1bd6a8fa6a/libunwind/src/UnwindLevel1-gcc-ext.c:137

After removing the MSan-built libunwind library and re-running, the test executes successfully (ldd now resolves libunwind.so.1 to the system copy):

rm  /msan-libs/libunwind.so.1

buildbot@2f11790aa1e1:/build$ ldd ./storage/maria/unittest/ma_pagecache_consist_1kRD-t
	linux-vdso.so.1 (0x00007fe94383c000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe93dd25000)
	libssl.so.3 => /msan-libs/libssl.so.3 (0x00007fe93dc17000)
	libcrypto.so.3 => /msan-libs/libcrypto.so.3 (0x00007fe93d68b000)
	libc++.so.1 => /msan-libs/libc++.so.1 (0x00007fe93d3fb000)
	libc++abi.so.1 => /msan-libs/libc++abi.so.1 (0x00007fe93d336000)
	libunwind.so.1 => /lib/x86_64-linux-gnu/libunwind.so.1 (0x00007fe93d326000)
	libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fe93d315000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe93d2f5000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe93d114000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fe94383e000)
buildbot@2f11790aa1e1:/build$ ./storage/maria/unittest/ma_pagecache_consist_1kRD-t
1..6
ok 1 - reader3: done
ok 2 - reader2: done
ok 3 - reader5: done
ok 4 - reader1: done
ok 5 - reader4: done
ok 6 - writer1: done
Test took 0.32 sec
