Description
I am able to reliably reproduce a crash on x86_64-unknown-linux-gnu when building in --release mode with 1.54.0. The same code and build configuration works properly on 1.53.0.
I'm not a great assembly-level debugger, but I believe the compiler is confused about register/variable aliasing. This effectively corrupts a variable on the stack, which in turn causes a segfault.
Rust 1.54.0 Analysis
We're about to return from _PyArgv_AsWstrList (from the CPython 3.9 internals):
105 in Python/preconfig.c
0x0000555555ae1456 <+54>: mov %r14,%rdi
0x0000555555ae1459 <+57>: call 0x555555ad9020 <_PyWideStringList_Copy>
0x0000555555ae145e <+62>: test %eax,%eax
0x0000555555ae1460 <+64>: js 0x555555ae14dc <_PyArgv_AsWstrList+188>
106 in Python/preconfig.c
107 in Python/preconfig.c
108 in Python/preconfig.c
109 in Python/preconfig.c
=> 0x0000555555ae1462 <+66>: xorps %xmm0,%xmm0
0x0000555555ae1465 <+69>: movups %xmm0,0x10(%r15)
0x0000555555ae146a <+74>: movups %xmm0,(%r15)
110 in Python/preconfig.c
0x0000555555ae146e <+78>: mov %r15,%rax
0x0000555555ae1471 <+81>: add $0x20,%rsp
0x0000555555ae1475 <+85>: pop %rbx
0x0000555555ae1476 <+86>: pop %r12
0x0000555555ae1478 <+88>: pop %r13
0x0000555555ae147a <+90>: pop %r14
0x0000555555ae147c <+92>: pop %r15
0x0000555555ae147e <+94>: ret
(gdb) info registers
rax 0x5555559e8ae0 93824997034720
rbx 0x7fffffffc530 140737488340272
rcx 0x7ffff7808008 140737345781768
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x1 0x1
rsp 0x7fffffffc4c0 0x7fffffffc4c0
r8 0x1 1
r9 0x0 0
r10 0x0 0
r11 0x7ffff7809000 140737345785856
r12 0x7fffffffc4d8 140737488340184
r13 0x1 1
r14 0x7fffffffc848 140737488341064
r15 0x7fffffffc600 140737488340480
rip 0x555555ae1462 0x555555ae1462 <_PyArgv_AsWstrList+66>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
And here are the last few stack frames:
(gdb) info frame 0
Stack frame at 0x7fffffffc510:
rip = 0x555555ae1462 in _PyArgv_AsWstrList (Python/preconfig.c:109); saved rip = 0x5555558733f2
called by frame at 0x7fffffffc570
source language c.
Arglist at 0x7fffffffc4b8, args: args=0x7fffffffc530, list=0x7fffffffc848
Locals at 0x7fffffffc4b8, Previous frame's sp is 0x7fffffffc510
Saved registers:
rbx at 0x7fffffffc4e0, r12 at 0x7fffffffc4e8, r13 at 0x7fffffffc4f0, r14 at 0x7fffffffc4f8, r15 at 0x7fffffffc500, rip at 0x7fffffffc508
(gdb) info frame 1
Stack frame at 0x7fffffffc570:
rip = 0x5555558733f2 in _PyConfig_SetPyArgv (./Python/initconfig.c:2448); saved rip = 0x555555a4c92c
inlined into frame 2, caller of frame at 0x7fffffffc510
source language c.
Arglist at unknown address.
Locals at unknown address, Previous frame's sp is 0x7fffffffc510
Saved registers:
rbx at 0x7fffffffc4e0, r12 at 0x7fffffffc4e8, r13 at 0x7fffffffc4f0, r14 at 0x7fffffffc4f8, r15 at 0x7fffffffc500, rip at 0x7fffffffc508
(gdb) info frame 2
Stack frame at 0x7fffffffc570:
rip = 0x5555558733f2 in PyConfig_SetBytesArgv (./Python/initconfig.c:2462); saved rip = 0x555555a4c92c
called by frame at 0x7fffffffc650, caller of frame at 0x7fffffffc570
source language c.
Arglist at 0x7fffffffc508, args: config=<optimized out>, argc=<optimized out>, argv=<optimized out>
Locals at 0x7fffffffc508, Previous frame's sp is 0x7fffffffc570
Saved registers:
rbx at 0x7fffffffc550, r14 at 0x7fffffffc558, r15 at 0x7fffffffc560, rip at 0x7fffffffc568
(gdb) info frame 3
Stack frame at 0x7fffffffc650:
rip = 0x555555a4c92c in pyembed::interpreter_config::set_argv (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:215);
saved rip = 0x555555a04e1d
called by frame at 0x7fffffffcb20, caller of frame at 0x7fffffffc570
source language rust.
Arglist at 0x7fffffffc570, args: config=0x0, args=...
Locals at 0x7fffffffc570, Previous frame's sp is 0x7fffffffc650
Saved registers:
rbx at 0x7fffffffc618, rbp at 0x7fffffffc640, r12 at 0x7fffffffc620, r13 at 0x7fffffffc628, r14 at 0x7fffffffc630, r15 at 0x7fffffffc638,
rip at 0x7fffffffc648
(gdb) info frame 4
Stack frame at 0x7fffffffcb20:
rip = 0x555555a04e1d in pyembed::interpreter_config::{{impl}}::try_into (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592);
saved rip = 0x555555a49736
called by frame at 0x7fffffffd9e0, caller of frame at 0x7fffffffc650
source language rust.
Arglist at 0x7fffffffc648, args: self=0x7fffffffcff0
Locals at 0x7fffffffc648, Previous frame's sp is 0x7fffffffcb20
Saved registers:
rbx at 0x7fffffffcaf8, r12 at 0x7fffffffcb00, r14 at 0x7fffffffcb08, r15 at 0x7fffffffcb10, rip at 0x7fffffffcb18
(gdb) info frame 5
Stack frame at 0x7fffffffd9e0:
rip = 0x555555a49736 in pyembed::interpreter::MainPythonInterpreter::init (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter.rs:169);
saved rip = 0x5555559dde78
inlined into frame 6, caller of frame at 0x7fffffffcb20
source language rust.
Arglist at unknown address.
Locals at unknown address, Previous frame's sp is 0x7fffffffcb20
Saved registers:
rbx at 0x7fffffffcaf8, r12 at 0x7fffffffcb00, r14 at 0x7fffffffcb08, r15 at 0x7fffffffcb10, rip at 0x7fffffffcb18
We eventually crash in frame 4. So let's look at its state:
(gdb) f 4
#4 0x0000555555a04e1d in pyembed::interpreter_config::{{impl}}::try_into (self=0x7fffffffcff0)
at /home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592
592 set_argv(&mut config, argv)?;
(gdb) info registers
rax 0x5555559e8ae0 93824997034720
rbx 0x7fffffffcff0 140737488343024
rcx 0x7ffff7808008 140737345781768
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x7fffffffd308 0x7fffffffd308
rsp 0x7fffffffc650 0x7fffffffc650
r8 0x1 1
r9 0x0 0
r10 0x0 0
r11 0x7ffff7809000 140737345785856
r12 0x7ffff7d9ff40 140737351647040
r13 0x0 0
r14 0x7fffffffcc70 140737488342128
r15 0x7fffffffc970 140737488341360
rip 0x555555a04e1d 0x555555a04e1d <pyembed::interpreter_config::{{impl}}::try_into+173>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
Here are the important details:
The instructions we're about to execute are:
0x0000555555ae1462 <+66>: xorps %xmm0,%xmm0
0x0000555555ae1465 <+69>: movups %xmm0,0x10(%r15)
0x0000555555ae146a <+74>: movups %xmm0,(%r15)
self in frame 4 is 0x7fffffffcff0, which appears to be inhabiting rbx.
The value of r15 is 0x7fffffffc600. This memory address is close to frame 4's arglist and locals (0x7fffffffc648). The value returned by this function is propagating through the intermediate frames to frame 4, which operates on it. So the compiler likely optimized things to a variable write directly into frame 4's locals space. Cool.
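One way to sanity-check which frame an address belongs to is to compare it against the "Stack frame at" values gdb printed above. The sketch below is not part of the original investigation. It assumes the stack grows downward and that each frame spans from its callee's "Stack frame at" value up to its own; the `Frame` struct and `owning_frame` helper are hypothetical names used only for illustration.

```rust
/// A frame as gdb reports it. `cfa` is this frame's "Stack frame at" value;
/// `callee_cfa` is the "Stack frame at" value of the frame it called.
/// (Hypothetical helper type, not from the original report.)
struct Frame {
    name: &'static str,
    callee_cfa: u64,
    cfa: u64,
}

/// Return the name of the frame whose [callee_cfa, cfa) range contains `addr`.
fn owning_frame(frames: &[Frame], addr: u64) -> Option<&'static str> {
    frames
        .iter()
        .find(|f| f.callee_cfa <= addr && addr < f.cfa)
        .map(|f| f.name)
}

/// Frame boundaries copied from the 1.54.0 `info frame` output.
/// Frame 1 is inlined into frame 2, so frame 2's callee is frame 0.
fn frames_1_54() -> Vec<Frame> {
    vec![
        Frame { name: "frame 2 (PyConfig_SetBytesArgv)", callee_cfa: 0x7fffffffc510, cfa: 0x7fffffffc570 },
        Frame { name: "frame 3 (set_argv)", callee_cfa: 0x7fffffffc570, cfa: 0x7fffffffc650 },
        Frame { name: "frame 4 (try_into)", callee_cfa: 0x7fffffffc650, cfa: 0x7fffffffcb20 },
    ]
}

fn main() {
    let r15 = 0x7fffffffc600u64;
    match owning_frame(&frames_1_54(), r15) {
        Some(name) => println!("{:#x} lies in {}", r15, name),
        None => println!("{:#x} is outside the listed frames", r15),
    }
}
```

Under those assumptions the address falls inside frame 3's region, just below the start of frame 4, which fits with the saved-register slots that gdb lists for frame 3.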
Before we execute that movups %xmm0,0x10(%r15), here is the memory we're operating on:
0x7fffffffc600: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffc610: 0x00000000 0x00000000 0xffffcff0 0x00007fff
0x7fffffffc620: 0xf7d9ff40 0x00007fff 0x00000000 0x00000000
0x7fffffffc630: 0xffffcc70 0x00007fff 0xffffc970 0x00007fff
0x7fffffffc640: 0xffffd308 0x00007fff 0x55a04e1d 0x00005555
0x7fffffffc650: 0xffffc8d0 0x00007fff 0x00000000 0x00000000
0x7fffffffc660: 0x00000003 0x00000001 0x00000000 0x00000000
And after:
0x7fffffffc600: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffc610: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffc620: 0xf7d9ff40 0x00007fff 0x00000000 0x00000000
0x7fffffffc630: 0xffffcc70 0x00007fff 0xffffc970 0x00007fff
0x7fffffffc640: 0xffffd308 0x00007fff 0x55a04e1d 0x00005555
0x7fffffffc650: 0xffffc8d0 0x00007fff 0x00000000 0x00000000
0x7fffffffc660: 0x00000003 0x00000001 0x00000000 0x00000000
The change there is:
0x7fffffffc600: 0x00000000 0x00000000 0x00000000 0x00000000
-0x7fffffffc610: 0x00000000 0x00000000 0xffffcff0 0x00007fff
+0x7fffffffc610: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffc620: 0xf7d9ff40 0x00007fff 0x00000000 0x00000000
So the 16 bytes at 0x7fffffffc610-0x7fffffffc61f got cleared out. Makes sense: that's what the instructions told it to do.
However, 0x7fffffffc618 (the rbx slot listed under "Saved registers" for frame 3) contained the saved register value (0x7fffffffcff0) for rbx, holding the address of self. This value is now zeroed.
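The overlap can be checked mechanically. The following is a minimal sketch, assuming each saved-register slot is 8 bytes wide and that the two movups instructions store 16 bytes each at r15 and r15+0x10. The slot addresses are copied from the `info frame 3` output; `Range` and `clobbered_slots` are hypothetical names for illustration.

```rust
/// Half-open byte range [start, start + len).
#[derive(Clone, Copy)]
struct Range {
    start: u64,
    len: u64,
}

impl Range {
    fn overlaps(self, other: Range) -> bool {
        self.start < other.start + other.len && other.start < self.start + self.len
    }
}

/// Saved-register slots of frame 3 as printed by `info frame 3` (1.54.0 build).
/// Each slot is assumed to be 8 bytes wide.
const FRAME3_SLOTS: [(&str, u64); 7] = [
    ("rbx", 0x7fffffffc618),
    ("r12", 0x7fffffffc620),
    ("r13", 0x7fffffffc628),
    ("r14", 0x7fffffffc630),
    ("r15", 0x7fffffffc638),
    ("rbp", 0x7fffffffc640),
    ("rip", 0x7fffffffc648),
];

/// Names of the slots touched by the two 16-byte stores at
/// `base` and `base + 0x10`.
fn clobbered_slots(base: u64) -> Vec<&'static str> {
    let stores = [
        Range { start: base, len: 16 },
        Range { start: base + 0x10, len: 16 },
    ];
    FRAME3_SLOTS
        .iter()
        .filter(|(_, addr)| {
            let slot = Range { start: *addr, len: 8 };
            stores.iter().any(|s| s.overlaps(slot))
        })
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    let r15 = 0x7fffffffc600u64;
    println!("clobbered: {:?}", clobbered_slots(r15));
}
```

With r15 = 0x7fffffffc600, only the rbx slot falls inside the 32 bytes being zeroed; the r12 slot at 0x7fffffffc620 starts exactly where the second store ends.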
We confirm this from GDB:
(gdb) f 4
#4 0x0000555555a04e1d in pyembed::interpreter_config::{{impl}}::try_into (self=0x0)
at /home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592
592 set_argv(&mut config, argv)?;
(gdb) info registers
rax 0x5555559e8ae0 93824997034720
rbx 0x0 0
rcx 0x7ffff7808008 140737345781768
rdx 0x0 0
Note that rbx is 0x0 and self=0x0.
Several instructions later, we return to frame 4 and execute assembly corresponding to the Rust code if self.exe.is_none() {. Since self is NULL, we get a segfault.
Rust 1.53.0 Analysis
Compiling the same source code with Rust 1.53.0, things look similar:
105 in Python/preconfig.c
0x0000555555ae0886 <+54>: mov %r14,%rdi
0x0000555555ae0889 <+57>: call 0x555555ad8450 <_PyWideStringList_Copy>
0x0000555555ae088e <+62>: test %eax,%eax
0x0000555555ae0890 <+64>: js 0x555555ae090c <_PyArgv_AsWstrList+188>
106 in Python/preconfig.c
107 in Python/preconfig.c
108 in Python/preconfig.c
109 in Python/preconfig.c
=> 0x0000555555ae0892 <+66>: xorps %xmm0,%xmm0
0x0000555555ae0895 <+69>: movups %xmm0,0x10(%r15)
0x0000555555ae089a <+74>: movups %xmm0,(%r15)
110 in Python/preconfig.c
0x0000555555ae089e <+78>: mov %r15,%rax
0x0000555555ae08a1 <+81>: add $0x20,%rsp
0x0000555555ae08a5 <+85>: pop %rbx
0x0000555555ae08a6 <+86>: pop %r12
0x0000555555ae08a8 <+88>: pop %r13
0x0000555555ae08aa <+90>: pop %r14
0x0000555555ae08ac <+92>: pop %r15
0x0000555555ae08ae <+94>: ret
(gdb) info frame 0
Stack frame at 0x7fffffffc410:
rip = 0x555555ae0892 in _PyArgv_AsWstrList (Python/preconfig.c:109); saved rip = 0x5555558724a2
called by frame at 0x7fffffffc470
source language c.
Arglist at 0x7fffffffc3b8, args: args=0x7fffffffc430, list=0x7fffffffc8e0
Locals at 0x7fffffffc3b8, Previous frame's sp is 0x7fffffffc410
Saved registers:
rbx at 0x7fffffffc3e0, r12 at 0x7fffffffc3e8, r13 at 0x7fffffffc3f0, r14 at 0x7fffffffc3f8, r15 at 0x7fffffffc400, rip at 0x7fffffffc408
(gdb) info frame 1
Stack frame at 0x7fffffffc470:
rip = 0x5555558724a2 in _PyConfig_SetPyArgv (./Python/initconfig.c:2448); saved rip = 0x555555a42a6c
inlined into frame 2, caller of frame at 0x7fffffffc410
source language c.
Arglist at unknown address.
Locals at unknown address, Previous frame's sp is 0x7fffffffc410
Saved registers:
rbx at 0x7fffffffc3e0, r12 at 0x7fffffffc3e8, r13 at 0x7fffffffc3f0, r14 at 0x7fffffffc3f8, r15 at 0x7fffffffc400, rip at 0x7fffffffc408
(gdb) info frame 2
Stack frame at 0x7fffffffc470:
rip = 0x5555558724a2 in PyConfig_SetBytesArgv (./Python/initconfig.c:2462); saved rip = 0x555555a42a6c
called by frame at 0x7fffffffc550, caller of frame at 0x7fffffffc470
source language c.
Arglist at 0x7fffffffc408, args: config=<optimized out>, argc=<optimized out>, argv=<optimized out>
Locals at 0x7fffffffc408, Previous frame's sp is 0x7fffffffc470
Saved registers:
rbx at 0x7fffffffc450, r14 at 0x7fffffffc458, r15 at 0x7fffffffc460, rip at 0x7fffffffc468
(gdb) info frame 3
Stack frame at 0x7fffffffc550:
rip = 0x555555a42a6c in pyembed::interpreter_config::set_argv (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:215);
saved rip = 0x5555559fe846
called by frame at 0x7fffffffca30, caller of frame at 0x7fffffffc470
source language rust.
Arglist at 0x7fffffffc470, args: config=0x0, args=...
Locals at 0x7fffffffc470, Previous frame's sp is 0x7fffffffc550
Saved registers:
rbx at 0x7fffffffc518, rbp at 0x7fffffffc540, r12 at 0x7fffffffc520, r13 at 0x7fffffffc528, r14 at 0x7fffffffc530, r15 at 0x7fffffffc538,
rip at 0x7fffffffc548
(gdb) info frame 4
Stack frame at 0x7fffffffca30:
rip = 0x5555559fe846 in pyembed::interpreter_config::{{impl}}::try_into (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592);
saved rip = 0x555555a36692
called by frame at 0x7fffffffd9e0, caller of frame at 0x7fffffffc550
source language rust.
Arglist at 0x7fffffffc548, args: self=0x7fffffffcff0
Locals at 0x7fffffffc548, Previous frame's sp is 0x7fffffffca30
Saved registers:
rbx at 0x7fffffffca08, r12 at 0x7fffffffca10, r14 at 0x7fffffffca18, r15 at 0x7fffffffca20, rip at 0x7fffffffca28
(gdb) info registers
rax 0x5555559e70a0 93824997028000
rbx 0x7fffffffc430 140737488340016
rcx 0x7ffff7808008 140737345781768
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x1 0x1
rsp 0x7fffffffc3c0 0x7fffffffc3c0
r8 0x1 1
r9 0x0 0
r10 0x0 0
r11 0x7ffff7809000 140737345785856
r12 0x7fffffffc3d8 140737488339928
r13 0x1 1
r14 0x7fffffffc8e0 140737488341216
r15 0x7fffffffc500 140737488340224
rip 0x555555ae0892 0x555555ae0892 <_PyArgv_AsWstrList+66>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) frame 4
#4 0x00005555559fe846 in pyembed::interpreter_config::{{impl}}::try_into (self=0x7fffffffcff0)
at /home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592
592 set_argv(&mut config, argv)?;
(gdb) info registers
rax 0x5555559e70a0 93824997028000
rbx 0x0 0
rcx 0x7ffff7808008 140737345781768
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x7fffffffd308 0x7fffffffd308
rsp 0x7fffffffc550 0x7fffffffc550
r8 0x1 1
r9 0x0 0
r10 0x0 0
r11 0x7ffff7809000 140737345785856
r12 0x7fffffffcff0 140737488343024
r13 0x0 0
r14 0x7fffffffcb00 140737488341760
r15 0x7fffffffcff0 140737488343024
rip 0x5555559fe846 0x5555559fe846 <pyembed::interpreter_config::{{impl}}::try_into+166>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
This time r15 is 0x7fffffffc500.
And frame 4's arglist is at 0x7fffffffc548. The offsets are the same here (0x48 in both builds). So the assembly changes the same relative bytes.
But the bytes that get zeroed are already zero, so the movups is effectively a no-op:
0x7fffffffc4f0: 0x00000001 0x00000000 0x00000001 0x00000000
0x7fffffffc500: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffc510: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffc520: 0xffffcff0 0x00007fff 0x00000000 0x00000000
0x7fffffffc530: 0xffffcb00 0x00007fff 0xffffcff0 0x00007fff
0x7fffffffc540: 0xffffd308 0x00007fff 0x559fe846 0x00005555
0x7fffffffc550: 0x00000038 0x00000000 0x00000000 0x00000000
However, this version does not crash: self is not overwritten because its address isn't stored in rbx. The address of self (0x7fffffffcff0) is stored in r12 and r15. (I'm unsure how the memory address for self is calculated here.)
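To make the comparison between the two toolchains concrete, here is a small sketch that tabulates the values observed above. It assumes the same 32-byte zeroing at r15 in both builds; the `Observation` struct and its methods are hypothetical and only restate numbers already shown in the gdb output.

```rust
/// Values read off the gdb sessions above (hypothetical helper type).
struct Observation {
    rustc: &'static str,
    r15: u64,             // destination pointer at preconfig.c:109
    frame3_rbx_slot: u64, // "rbx at ..." from `info frame 3`
    frame4_rbx: u64,      // rbx shown in frame 4 before the stores run
    self_addr: u64,       // `self` as reported for frame 4
}

impl Observation {
    /// Offset of frame 3's saved-rbx slot from the destination pointer.
    fn slot_offset(&self) -> u64 {
        self.frame3_rbx_slot - self.r15
    }

    /// The two movups stores zero 0x20 bytes starting at r15.
    fn slot_is_zeroed(&self) -> bool {
        self.slot_offset() < 0x20
    }

    /// `self` is lost only if the zeroed slot was the one carrying it.
    fn loses_self(&self) -> bool {
        self.slot_is_zeroed() && self.frame4_rbx == self.self_addr
    }
}

fn observations() -> [Observation; 2] {
    [
        Observation {
            rustc: "1.54.0",
            r15: 0x7fffffffc600,
            frame3_rbx_slot: 0x7fffffffc618,
            frame4_rbx: 0x7fffffffcff0,
            self_addr: 0x7fffffffcff0,
        },
        Observation {
            rustc: "1.53.0",
            r15: 0x7fffffffc500,
            frame3_rbx_slot: 0x7fffffffc518,
            frame4_rbx: 0x0,
            self_addr: 0x7fffffffcff0,
        },
    ]
}

fn main() {
    for o in observations().iter() {
        println!(
            "{}: slot offset {:#x}, zeroed: {}, loses self: {}",
            o.rustc,
            o.slot_offset(),
            o.slot_is_zeroed(),
            o.loses_self()
        );
    }
}
```

In both builds the saved-rbx slot sits 0x18 bytes past r15 and is therefore zeroed; the difference is only that 1.54.0 happens to keep self in rbx at that point, while 1.53.0 has rbx equal to zero already.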
Hypothesis
This smells like a compiler optimization bug. Rust 1.54.0 seems to be emitting assembly that aliases 2 variables/registers to the same memory address.
Steps to Reproduce
git clone https://github.com/indygreg/PyOxidizer.git
cd PyOxidizer
git checkout rust-crash
cargo run --bin pyoxidizer -- init-rust-project ~/tmp/crash
cd ~/tmp/crash
cat >> Cargo.toml <<EOF
[profile.release]
debug = true
EOF
cargo run --release --features allocator-jemalloc
In a debugger, try setting a breakpoint at preconfig.c:109. This should stop just before self gets overwritten.
The Rust function call from the crashing frame is https://github.com/indygreg/PyOxidizer/blob/0ca3236bac944b63ea8506f273b064167e25f47b/pyembed/src/interpreter_config.rs#L592. This calls into another Rust function which calls into CPython C APIs. The crash occurs at https://github.com/indygreg/PyOxidizer/blob/0ca3236bac944b63ea8506f273b064167e25f47b/pyembed/src/interpreter_config.rs#L595 when dereferencing a NULL self.