Description
Meta
Tested on Ubuntu 24.04 and Amazon Linux 2, x86_64.
Workaround to the production problem
In the Cargo.toml of the crate that builds your Python module, set:
[profile.release]
debug = 0
lto = false
This prevents the 1.84 crashes.
However, there is still UB going on even with this setting.
The Production Problem
The crate pyo3-log installs a bridge that makes the Rust log macros call into Python. This means that all calls to log::info! etc. will take the GIL.
Python has a stage during interpreter shutdown where attempts to take the GIL will cause a pthread_exit. Python 3.14 (still in alpha today, targeted for release by the end of this year) will change this in python/cpython#87135, but that will take some time to reach people.
This means that if you have a Python program that uses a Rust library and pyo3-log, and that library spawns a Rust thread that calls log::info! in a way that is not synchronized with interpreter exit, you'll get unpredictable crashes on 1.84.
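For contrast, here is a minimal sketch of what "synchronized with interpreter exit" could look like on the Rust side, using only std. The Worker type and its spawn/shutdown methods are hypothetical names made up for illustration, not part of pyo3-log or PyO3. The sketch assumes the Python side has some hook that calls shutdown() before finalization begins, and that the caller releases the GIL while joining, because a worker blocked inside a log call needs the GIL to finish.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical handle for a background thread that would otherwise log
/// through pyo3-log at arbitrary times.
pub struct Worker {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<u64>>,
}

impl Worker {
    /// Spawn the background thread. It polls the stop flag between units of work.
    pub fn spawn() -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            let mut iterations = 0u64;
            while !flag.load(Ordering::Acquire) {
                // In the real program, this is where a log macro would run
                // and take the GIL. Here we only count iterations.
                iterations += 1;
                thread::sleep(Duration::from_millis(1));
            }
            iterations
        });
        Worker { stop, handle: Some(handle) }
    }

    /// Ask the thread to stop and wait until it has exited.
    /// Returns the iteration count on the first call and None afterwards.
    /// Assumption: this is called before interpreter finalization starts,
    /// with the GIL released for the duration of the join.
    pub fn shutdown(&mut self) -> Option<u64> {
        self.stop.store(true, Ordering::Release);
        self.handle
            .take()
            .map(|h| h.join().expect("worker thread panicked"))
    }
}
```

Once shutdown() has returned, the thread no longer exists, so it cannot attempt to take the GIL during finalization. This only avoids the race for threads you control. It does not make the pthread_exit path itself well-defined.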
Minified Program
This program:
use std::ffi::c_void;

extern "C" {
    fn pthread_exit(retval: *const c_void);
}

fn main() {
    std::thread::spawn(|| {
        unsafe { pthread_exit(std::ptr::null()); }
    });
    std::thread::sleep(std::time::Duration::from_secs(1));
}
when compiled with the following options
rustc +1.84 d.rs -Cpanic=abort -Cdebuginfo=limited
crashes with this confusing error
thread '<unnamed>' panicked at core/src/panicking.rs:223:5:
panic in a function that cannot unwind
stack backtrace:
0: rust_begin_unwind
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/panicking.rs:665:5
1: core::panicking::panic_nounwind_fmt::runtime
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/panicking.rs:119:22
2: core::panicking::panic_nounwind_fmt
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/intrinsics/mod.rs:3535:9
3: core::panicking::panic_nounwind
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/panicking.rs:223:5
4: core::panicking::panic_cannot_unwind
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/panicking.rs:315:5
5: std::sys::pal::unix::thread::Thread::new::thread_start
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/sys/pal/unix/thread.rs:99:9
6: start_thread
7: clone
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread caused non-unwinding panic. aborting.
Aborted
This crash happens:
- Only on 1.84, not on 1.83
- Only when debuginfo is enabled, but even if the binary is stripped.
When using -C panic=unwind instead, you get this error on all versions of the compiler:
FATAL: exception not rethrown
Aborted (core dumped)
I have seen the claim on Zulip (https://rust-lang.zulipchat.com/#narrow/channel/122651-general/topic/pthread_exit.20from.20a.20Rust-spawned.20thread) that this is undefined behavior, but I'd rather not break pyo3-log.