Rust 1.54.0 optimized compilation overwrites stack variable, causing segfault #87947

Closed
@indygreg

I am able to reliably reproduce a crash on x86_64-unknown-linux-gnu when building in --release mode with 1.54.0. The same code and build configuration work properly on 1.53.0.

I'm not a great assembly-level debugger, but I believe the compiler is confused about register/variable aliasing and this effectively leads to corruption of a variable on the stack, which leads to a segfault.

Rust 1.54.0 Analysis

We're about to return from _PyArgv_AsWstrList (from the CPython 3.9 internals):

105     in Python/preconfig.c
   0x0000555555ae1456 <+54>:    mov    %r14,%rdi
   0x0000555555ae1459 <+57>:    call   0x555555ad9020 <_PyWideStringList_Copy>
   0x0000555555ae145e <+62>:    test   %eax,%eax
   0x0000555555ae1460 <+64>:    js     0x555555ae14dc <_PyArgv_AsWstrList+188>

106     in Python/preconfig.c
107     in Python/preconfig.c
108     in Python/preconfig.c
109     in Python/preconfig.c
=> 0x0000555555ae1462 <+66>:    xorps  %xmm0,%xmm0
   0x0000555555ae1465 <+69>:    movups %xmm0,0x10(%r15)
   0x0000555555ae146a <+74>:    movups %xmm0,(%r15)

110     in Python/preconfig.c
   0x0000555555ae146e <+78>:    mov    %r15,%rax
   0x0000555555ae1471 <+81>:    add    $0x20,%rsp
   0x0000555555ae1475 <+85>:    pop    %rbx
   0x0000555555ae1476 <+86>:    pop    %r12
   0x0000555555ae1478 <+88>:    pop    %r13
   0x0000555555ae147a <+90>:    pop    %r14
   0x0000555555ae147c <+92>:    pop    %r15
   0x0000555555ae147e <+94>:    ret

(gdb) info registers
rax            0x5555559e8ae0      93824997034720
rbx            0x7fffffffc530      140737488340272
rcx            0x7ffff7808008      140737345781768
rdx            0x0                 0
rsi            0x0                 0
rdi            0x0                 0
rbp            0x1                 0x1
rsp            0x7fffffffc4c0      0x7fffffffc4c0
r8             0x1                 1
r9             0x0                 0
r10            0x0                 0
r11            0x7ffff7809000      140737345785856
r12            0x7fffffffc4d8      140737488340184
r13            0x1                 1
r14            0x7fffffffc848      140737488341064
r15            0x7fffffffc600      140737488340480
rip            0x555555ae1462      0x555555ae1462 <_PyArgv_AsWstrList+66>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

And here are the last few stack frames.

(gdb) info frame 0
Stack frame at 0x7fffffffc510:
 rip = 0x555555ae1462 in _PyArgv_AsWstrList (Python/preconfig.c:109); saved rip = 0x5555558733f2
 called by frame at 0x7fffffffc570
 source language c.
 Arglist at 0x7fffffffc4b8, args: args=0x7fffffffc530, list=0x7fffffffc848
 Locals at 0x7fffffffc4b8, Previous frame's sp is 0x7fffffffc510
 Saved registers:
  rbx at 0x7fffffffc4e0, r12 at 0x7fffffffc4e8, r13 at 0x7fffffffc4f0, r14 at 0x7fffffffc4f8, r15 at 0x7fffffffc500, rip at 0x7fffffffc508

(gdb) info frame 1
Stack frame at 0x7fffffffc570:
 rip = 0x5555558733f2 in _PyConfig_SetPyArgv (./Python/initconfig.c:2448); saved rip = 0x555555a4c92c
 inlined into frame 2, caller of frame at 0x7fffffffc510
 source language c.
 Arglist at unknown address.
 Locals at unknown address, Previous frame's sp is 0x7fffffffc510
 Saved registers:
  rbx at 0x7fffffffc4e0, r12 at 0x7fffffffc4e8, r13 at 0x7fffffffc4f0, r14 at 0x7fffffffc4f8, r15 at 0x7fffffffc500, rip at 0x7fffffffc508

(gdb) info frame 2
Stack frame at 0x7fffffffc570:
 rip = 0x5555558733f2 in PyConfig_SetBytesArgv (./Python/initconfig.c:2462); saved rip = 0x555555a4c92c
 called by frame at 0x7fffffffc650, caller of frame at 0x7fffffffc570
 source language c.
 Arglist at 0x7fffffffc508, args: config=<optimized out>, argc=<optimized out>, argv=<optimized out>
 Locals at 0x7fffffffc508, Previous frame's sp is 0x7fffffffc570
 Saved registers:
  rbx at 0x7fffffffc550, r14 at 0x7fffffffc558, r15 at 0x7fffffffc560, rip at 0x7fffffffc568

(gdb) info frame 3
Stack frame at 0x7fffffffc650:
 rip = 0x555555a4c92c in pyembed::interpreter_config::set_argv (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:215);
    saved rip = 0x555555a04e1d
 called by frame at 0x7fffffffcb20, caller of frame at 0x7fffffffc570
 source language rust.
 Arglist at 0x7fffffffc570, args: config=0x0, args=...
 Locals at 0x7fffffffc570, Previous frame's sp is 0x7fffffffc650
 Saved registers:
  rbx at 0x7fffffffc618, rbp at 0x7fffffffc640, r12 at 0x7fffffffc620, r13 at 0x7fffffffc628, r14 at 0x7fffffffc630, r15 at 0x7fffffffc638,
  rip at 0x7fffffffc648

(gdb) info frame 4
Stack frame at 0x7fffffffcb20:
 rip = 0x555555a04e1d in pyembed::interpreter_config::{{impl}}::try_into (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592);
    saved rip = 0x555555a49736
 called by frame at 0x7fffffffd9e0, caller of frame at 0x7fffffffc650
 source language rust.
 Arglist at 0x7fffffffc648, args: self=0x7fffffffcff0
 Locals at 0x7fffffffc648, Previous frame's sp is 0x7fffffffcb20
 Saved registers:
  rbx at 0x7fffffffcaf8, r12 at 0x7fffffffcb00, r14 at 0x7fffffffcb08, r15 at 0x7fffffffcb10, rip at 0x7fffffffcb18

(gdb) info frame 5
Stack frame at 0x7fffffffd9e0:
 rip = 0x555555a49736 in pyembed::interpreter::MainPythonInterpreter::init (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter.rs:169);
    saved rip = 0x5555559dde78
 inlined into frame 6, caller of frame at 0x7fffffffcb20
 source language rust.
 Arglist at unknown address.
 Locals at unknown address, Previous frame's sp is 0x7fffffffcb20
 Saved registers:
  rbx at 0x7fffffffcaf8, r12 at 0x7fffffffcb00, r14 at 0x7fffffffcb08, r15 at 0x7fffffffcb10, rip at 0x7fffffffcb18

We eventually crash in frame 4. So let's look at its state:

(gdb) f 4
#4  0x0000555555a04e1d in pyembed::interpreter_config::{{impl}}::try_into (self=0x7fffffffcff0)
    at /home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592
592                 set_argv(&mut config, argv)?;
(gdb) info registers
rax            0x5555559e8ae0      93824997034720
rbx            0x7fffffffcff0      140737488343024
rcx            0x7ffff7808008      140737345781768
rdx            0x0                 0
rsi            0x0                 0
rdi            0x0                 0
rbp            0x7fffffffd308      0x7fffffffd308
rsp            0x7fffffffc650      0x7fffffffc650
r8             0x1                 1
r9             0x0                 0
r10            0x0                 0
r11            0x7ffff7809000      140737345785856
r12            0x7ffff7d9ff40      140737351647040
r13            0x0                 0
r14            0x7fffffffcc70      140737488342128
r15            0x7fffffffc970      140737488341360
rip            0x555555a04e1d      0x555555a04e1d <pyembed::interpreter_config::{{impl}}::try_into+173>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

Here are the important details:

The instructions we're about to execute are:

0x0000555555ae1462 <+66>:    xorps  %xmm0,%xmm0
0x0000555555ae1465 <+69>:    movups %xmm0,0x10(%r15)
0x0000555555ae146a <+74>:    movups %xmm0,(%r15)

self in frame 4 is 0x7fffffffcff0, which appears to be inhabiting rbx.

The value of r15 is 0x7fffffffc600. This memory address is close to frame 4's arglist and locals (0x7fffffffc648). The value returned by this function is propagating through the intermediate frames to frame 4, which operates on it. So the compiler likely optimized things to a variable write directly into frame 4's locals space. Cool.
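The three instructions above look like a struct being returned through a caller-supplied pointer: r15 holds a destination address, 32 bytes are zeroed there, and r15 is then copied to rax as the return value. Here is a minimal Rust sketch of that pattern. The Status struct, its fields, and write_ok are hypothetical names chosen for illustration; they are not taken from CPython or PyOxidizer, and the 32-byte size assumes a 64-bit target.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a 32-byte C status struct (layout assumed
/// for a 64-bit target; not copied from any real header).
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct Status {
    kind: i32,          // 4 bytes, followed by 4 bytes of padding
    func: *const u8,    // 8 bytes
    err_msg: *const u8, // 8 bytes
    exitcode: i32,      // 4 bytes, followed by 4 bytes of padding
}

/// Zero-fills the caller-provided slot and returns its address, the way
/// a function returning a large struct writes through a hidden pointer.
fn write_ok(out: &mut MaybeUninit<Status>) -> *mut Status {
    // All-zero is a valid Status: zero integers and null pointers.
    *out = MaybeUninit::zeroed();
    out.as_mut_ptr()
}
```

The point of the sketch is that the callee trusts the pointer it was handed. If that pointer is placed so that the 32-byte destination overlaps something else the caller still needs, the zeroing store destroys it.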

Before we execute that movups %xmm0,0x10(%r15), here is the memory we're operating on:

0x7fffffffc600: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffffffc610: 0x00000000      0x00000000      0xffffcff0      0x00007fff
0x7fffffffc620: 0xf7d9ff40      0x00007fff      0x00000000      0x00000000
0x7fffffffc630: 0xffffcc70      0x00007fff      0xffffc970      0x00007fff
0x7fffffffc640: 0xffffd308      0x00007fff      0x55a04e1d      0x00005555
0x7fffffffc650: 0xffffc8d0      0x00007fff      0x00000000      0x00000000
0x7fffffffc660: 0x00000003      0x00000001      0x00000000      0x00000000

And after:

0x7fffffffc600: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffffffc610: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffffffc620: 0xf7d9ff40      0x00007fff      0x00000000      0x00000000
0x7fffffffc630: 0xffffcc70      0x00007fff      0xffffc970      0x00007fff
0x7fffffffc640: 0xffffd308      0x00007fff      0x55a04e1d      0x00005555
0x7fffffffc650: 0xffffc8d0      0x00007fff      0x00000000      0x00000000
0x7fffffffc660: 0x00000003      0x00000001      0x00000000      0x00000000

The change there is:

 0x7fffffffc600: 0x00000000      0x00000000      0x00000000      0x00000000
-0x7fffffffc610: 0x00000000      0x00000000      0xffffcff0      0x00007fff
+0x7fffffffc610: 0x00000000      0x00000000      0x00000000      0x00000000
 0x7fffffffc620: 0xf7d9ff40      0x00007fff      0x00000000      0x00000000
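To read the dump: gdb printed the memory as 32-bit words, so on little-endian x86_64 each 64-bit value spans two adjacent columns, low word first. A small sketch of that decoding follows; the helper names are made up for illustration.

```rust
/// Address of the 32-bit word in a given column (0..=3) of a dump row.
fn word_address(row_base: u64, column: u64) -> u64 {
    row_base + 4 * column
}

/// Joins two adjacent 32-bit words (low word first, as printed) into
/// the 64-bit value stored at that address.
fn join_words(low: u32, high: u32) -> u64 {
    (u64::from(high) << 32) | u64::from(low)
}
```

Applied to the removed row, columns 2 and 3 start at 0x7fffffffc618 and join to 0x7fffffffcff0.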

So the 16 bytes between 0x7fffffffc610-0x7fffffffc61f got cleared out. Makes sense: that's what the instructions told it to do.

However, 0x7fffffffc618 is where frame 3 (set_argv) saved rbx, and the value saved there (0x7fffffffcff0) is the address of self in frame 4. This value is now zeroed.
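The overlap can be checked with plain address arithmetic. The sketch below uses hypothetical helper names, and takes the addresses from the register and frame dumps above.

```rust
use std::ops::Range;

/// Half-open byte range [base + offset, base + offset + len) written
/// by a single store such as `movups %xmm0,0x10(%r15)`.
fn store_range(base: u64, offset: u64, len: u64) -> Range<u64> {
    let start = base + offset;
    start..start + len
}

/// True if the `len`-byte slot starting at `slot` lies entirely inside
/// the bytes covered by `range`.
fn slot_clobbered(range: &Range<u64>, slot: u64, len: u64) -> bool {
    range.start <= slot && slot + len <= range.end
}
```

With r15 = 0x7fffffffc600, the store at offset 0x10 covers 0x7fffffffc610 through 0x7fffffffc61f, which fully contains the 8-byte saved-rbx slot at 0x7fffffffc618. The store at offset 0 does not touch it.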

We confirm this from GDB:

(gdb) f 4
#4  0x0000555555a04e1d in pyembed::interpreter_config::{{impl}}::try_into (self=0x0)
    at /home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592
592                 set_argv(&mut config, argv)?;
(gdb) info registers
rax            0x5555559e8ae0      93824997034720
rbx            0x0                 0
rcx            0x7ffff7808008      140737345781768
rdx            0x0                 0

Note that rbx is 0x0 and self=0x0.

Several instructions later, we return to frame 4 and execute the assembly for the Rust expression self.exe.is_none(). Since self is NULL, we get a segfault.

Rust 1.53.0 Analysis

Compiling the same source code with Rust 1.53.0, things look similar:

105     in Python/preconfig.c
   0x0000555555ae0886 <+54>:    mov    %r14,%rdi
   0x0000555555ae0889 <+57>:    call   0x555555ad8450 <_PyWideStringList_Copy>
   0x0000555555ae088e <+62>:    test   %eax,%eax
   0x0000555555ae0890 <+64>:    js     0x555555ae090c <_PyArgv_AsWstrList+188>

106     in Python/preconfig.c
107     in Python/preconfig.c
108     in Python/preconfig.c
109     in Python/preconfig.c
=> 0x0000555555ae0892 <+66>:    xorps  %xmm0,%xmm0
   0x0000555555ae0895 <+69>:    movups %xmm0,0x10(%r15)
   0x0000555555ae089a <+74>:    movups %xmm0,(%r15)

110     in Python/preconfig.c
   0x0000555555ae089e <+78>:    mov    %r15,%rax
   0x0000555555ae08a1 <+81>:    add    $0x20,%rsp
   0x0000555555ae08a5 <+85>:    pop    %rbx
   0x0000555555ae08a6 <+86>:    pop    %r12
   0x0000555555ae08a8 <+88>:    pop    %r13
   0x0000555555ae08aa <+90>:    pop    %r14
   0x0000555555ae08ac <+92>:    pop    %r15
   0x0000555555ae08ae <+94>:    ret

(gdb) info frame 0
Stack frame at 0x7fffffffc410:
 rip = 0x555555ae0892 in _PyArgv_AsWstrList (Python/preconfig.c:109); saved rip = 0x5555558724a2
 called by frame at 0x7fffffffc470
 source language c.
 Arglist at 0x7fffffffc3b8, args: args=0x7fffffffc430, list=0x7fffffffc8e0
 Locals at 0x7fffffffc3b8, Previous frame's sp is 0x7fffffffc410
 Saved registers:
  rbx at 0x7fffffffc3e0, r12 at 0x7fffffffc3e8, r13 at 0x7fffffffc3f0, r14 at 0x7fffffffc3f8, r15 at 0x7fffffffc400, rip at 0x7fffffffc408

(gdb) info frame 1
Stack frame at 0x7fffffffc470:
 rip = 0x5555558724a2 in _PyConfig_SetPyArgv (./Python/initconfig.c:2448); saved rip = 0x555555a42a6c
 inlined into frame 2, caller of frame at 0x7fffffffc410
 source language c.
 Arglist at unknown address.
 Locals at unknown address, Previous frame's sp is 0x7fffffffc410
 Saved registers:
  rbx at 0x7fffffffc3e0, r12 at 0x7fffffffc3e8, r13 at 0x7fffffffc3f0, r14 at 0x7fffffffc3f8, r15 at 0x7fffffffc400, rip at 0x7fffffffc408

(gdb) info frame 2
Stack frame at 0x7fffffffc470:
 rip = 0x5555558724a2 in PyConfig_SetBytesArgv (./Python/initconfig.c:2462); saved rip = 0x555555a42a6c
 called by frame at 0x7fffffffc550, caller of frame at 0x7fffffffc470
 source language c.
 Arglist at 0x7fffffffc408, args: config=<optimized out>, argc=<optimized out>, argv=<optimized out>
 Locals at 0x7fffffffc408, Previous frame's sp is 0x7fffffffc470
 Saved registers:
  rbx at 0x7fffffffc450, r14 at 0x7fffffffc458, r15 at 0x7fffffffc460, rip at 0x7fffffffc468

(gdb) info frame 3
Stack frame at 0x7fffffffc550:
 rip = 0x555555a42a6c in pyembed::interpreter_config::set_argv (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:215);
    saved rip = 0x5555559fe846
 called by frame at 0x7fffffffca30, caller of frame at 0x7fffffffc470
 source language rust.
 Arglist at 0x7fffffffc470, args: config=0x0, args=...
 Locals at 0x7fffffffc470, Previous frame's sp is 0x7fffffffc550
 Saved registers:
  rbx at 0x7fffffffc518, rbp at 0x7fffffffc540, r12 at 0x7fffffffc520, r13 at 0x7fffffffc528, r14 at 0x7fffffffc530, r15 at 0x7fffffffc538,
  rip at 0x7fffffffc548

(gdb) info frame 4
Stack frame at 0x7fffffffca30:
 rip = 0x5555559fe846 in pyembed::interpreter_config::{{impl}}::try_into (/home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592);
    saved rip = 0x555555a36692
 called by frame at 0x7fffffffd9e0, caller of frame at 0x7fffffffc550
 source language rust.
 Arglist at 0x7fffffffc548, args: self=0x7fffffffcff0
 Locals at 0x7fffffffc548, Previous frame's sp is 0x7fffffffca30
 Saved registers:
  rbx at 0x7fffffffca08, r12 at 0x7fffffffca10, r14 at 0x7fffffffca18, r15 at 0x7fffffffca20, rip at 0x7fffffffca28
(gdb) info registers
rax            0x5555559e70a0      93824997028000
rbx            0x7fffffffc430      140737488340016
rcx            0x7ffff7808008      140737345781768
rdx            0x0                 0
rsi            0x0                 0
rdi            0x0                 0
rbp            0x1                 0x1
rsp            0x7fffffffc3c0      0x7fffffffc3c0
r8             0x1                 1
r9             0x0                 0
r10            0x0                 0
r11            0x7ffff7809000      140737345785856
r12            0x7fffffffc3d8      140737488339928
r13            0x1                 1
r14            0x7fffffffc8e0      140737488341216
r15            0x7fffffffc500      140737488340224
rip            0x555555ae0892      0x555555ae0892 <_PyArgv_AsWstrList+66>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) frame 4
#4  0x00005555559fe846 in pyembed::interpreter_config::{{impl}}::try_into (self=0x7fffffffcff0)
    at /home/gps/src/pyoxidizer.git/pyembed/src/interpreter_config.rs:592
592                 set_argv(&mut config, argv)?;

(gdb) info registers
rax            0x5555559e70a0      93824997028000
rbx            0x0                 0
rcx            0x7ffff7808008      140737345781768
rdx            0x0                 0
rsi            0x0                 0
rdi            0x0                 0
rbp            0x7fffffffd308      0x7fffffffd308
rsp            0x7fffffffc550      0x7fffffffc550
r8             0x1                 1
r9             0x0                 0
r10            0x0                 0
r11            0x7ffff7809000      140737345785856
r12            0x7fffffffcff0      140737488343024
r13            0x0                 0
r14            0x7fffffffcb00      140737488341760
r15            0x7fffffffcff0      140737488343024
rip            0x5555559fe846      0x5555559fe846 <pyembed::interpreter_config::{{impl}}::try_into+166>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

This time r15 is 0x7fffffffc500.

And frame 4's arglist is at 0x7fffffffc548, so r15 again sits 0x48 bytes below it, the same offset as in the 1.54.0 run. The assembly therefore changes the same relative bytes.
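As a quick check of that claim, the distance from r15 to frame 4's arglist can be computed for both builds. The function name is illustrative, and the addresses come from the gdb output above.

```rust
/// Distance in bytes from `target` up to `arglist`, or None if
/// `target` is not below `arglist`.
fn distance_below(arglist: u64, target: u64) -> Option<u64> {
    arglist.checked_sub(target)
}
```

For 1.54.0 this is 0x7fffffffc648 - 0x7fffffffc600, and for 1.53.0 it is 0x7fffffffc548 - 0x7fffffffc500. Both come out to 0x48.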

But the bytes being zeroed are already zero beforehand, so the movups is effectively a no-op:

0x7fffffffc4f0: 0x00000001      0x00000000      0x00000001      0x00000000
0x7fffffffc500: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffffffc510: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffffffc520: 0xffffcff0      0x00007fff      0x00000000      0x00000000
0x7fffffffc530: 0xffffcb00      0x00007fff      0xffffcff0      0x00007fff
0x7fffffffc540: 0xffffd308      0x00007fff      0x559fe846      0x00005555
0x7fffffffc550: 0x00000038      0x00000000      0x00000000      0x00000000

However, this version does not crash: self is not overwritten, because its address isn't held in rbx. Instead, the address of self (0x7fffffffcff0) is held in r12 and r15. (I'm unsure how the memory address for self is calculated here.)

Hypothesis

This smells like a compiler optimization bug. Rust 1.54.0 seems to be emitting assembly that aliases 2 variables/registers to the same memory address.

Steps to Reproduce

git clone https://github.com/indygreg/PyOxidizer.git
cd PyOxidizer
git checkout rust-crash
cargo run --bin pyoxidizer -- init-rust-project ~/tmp/crash
cd ~/tmp/crash

cat >> Cargo.toml <<EOF
[profile.release]
debug = true
EOF

cargo run --release --features allocator-jemalloc

In a debugger, try setting a breakpoint at preconfig.c:109. This should stop just before self gets overwritten.

The Rust function call from the crashing frame is https://github.com/indygreg/PyOxidizer/blob/0ca3236bac944b63ea8506f273b064167e25f47b/pyembed/src/interpreter_config.rs#L592. This calls into another Rust function which calls into CPython C APIs. The crash occurs at https://github.com/indygreg/PyOxidizer/blob/0ca3236bac944b63ea8506f273b064167e25f47b/pyembed/src/interpreter_config.rs#L595 when dereferencing a NULL self.

Metadata


Labels: C-bug, I-crash, T-compiler, regression-from-stable-to-stable
