Description
Currently, in library/std/src/sys/unix/args.rs, Rust hooks into the glibc .init_array extension to retrieve argc/argv even when Rust does not manage main() and is, e.g., just loaded into a process as a cdylib.
While this is great and convenient, it's unfortunately not implemented in a safe way. Various C command-line argument parsers modify argv while parsing, so the information that was passed into .init_array might no longer be correct.
Examples of such parsers are:
- glibc's own getopt parser, whose documentation notes: "The default is to permute the contents of argv while scanning it so that eventually all the non-options are at the end. This allows options to be given in any order, even with programs that were not written to expect this."
- GLib's GOptionContext parser, which removes all already-handled arguments from argv and leaves only the others (removed ones are set to NULL and moved to the end).
- Qt's QCoreApplication, which does the same thing in QCoreApplicationPrivate::processCommandLineArguments(). It shuffles arguments around in argv and removes (by setting to nullptr) arguments it has already handled.
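To make the effect of such parsers concrete, here is a minimal sketch in safe Rust that models the GOptionContext-style behaviour described above. It is a simplified model, not GLib's or Qt's actual code: `strip_handled_args` and the `is_handled` predicate are hypothetical names, and `None` stands in for a NULL pointer in the C argv array.

```rust
/// Simplified model of a C parser that consumes arguments in place.
/// `None` stands in for a NULL pointer in the C `argv` array.
/// Returns the new logical argument count after parsing.
fn strip_handled_args(argv: &mut [Option<&str>], is_handled: impl Fn(&str) -> bool) -> usize {
    let mut kept = 0;
    for i in 0..argv.len() {
        // Skip slots that an earlier pass already cleared.
        let Some(arg) = argv[i] else { continue };
        // Index 0 is the program name and is always kept.
        if i == 0 || !is_handled(arg) {
            // Compact surviving arguments towards the front.
            argv[kept] = Some(arg);
            kept += 1;
        }
    }
    // Everything past the survivors becomes "NULL".
    for slot in &mut argv[kept..] {
        *slot = None;
    }
    kept
}
```

Calling this on `["prog", "--verbose", "input.txt", "--debug"]` with a predicate that treats every `--` argument as handled leaves `["prog", "input.txt", NULL, NULL]` and a logical count of 2, while a count of 4 recorded before parsing would now be stale.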
This means that at least argc can easily be too big, and Rust would read beyond the (new) end of argv. This caused crashes in practice because strlen() is called on NULL when creating a CStr around such a "removed" argument.
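The hazard can be illustrated with a small check over a snapshot of argv. This is only a sketch: `first_null_within` is a hypothetical helper, not part of std, and the slice length plays the role of the argc value that was recorded before the parser ran. Any index it reports is a slot where an unchecked CStr::from_ptr() would end up calling strlen() on NULL.

```rust
use std::ffi::c_char;

/// Returns the index of the first NULL entry in `argv`, if any.
/// The slice length models the (possibly stale) `argc` captured at load time.
fn first_null_within(argv: &[*const c_char]) -> Option<usize> {
    argv.iter().position(|ptr| ptr.is_null())
}
```

A result of `Some(i)` means the recorded argc is larger than the number of arguments that are still valid, which matches the rdi = 0 seen in the register dump of the crash.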
Now the question is how this should be handled in Rust. I see three options here:
- Remove the .init_array extension usage and handle glibc like all other libcs.
- Copy all arguments in .init_array. This means everybody has to pay for that even if they don't use arguments, and this is theoretically still not safe if the shared library is loaded at the same time the argv array is modified and a partially written pointer value is read.
- Handle NULL pointers in argv by skipping over them. This means we would lose the exact length information of the args iterator, and this is still theoretically not safe for the same reason as the second option. See "Stop at the first NULL argument when iterating argv" (#106001).
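As a rough illustration of the third option, the following sketch collects arguments but stops at the first NULL entry, following the title of #106001. It is not the code from that PR: `collect_args_until_null` is a hypothetical name, and the function assumes a Unix target and that the caller upholds the stated safety contract.

```rust
use std::ffi::{c_char, CStr, OsString};
use std::os::unix::ffi::OsStringExt;

/// Copies up to `argc` arguments out of `argv`, stopping at the first NULL.
///
/// # Safety
/// `argv` must point to at least `argc` readable pointer slots, and every
/// non-NULL slot must point to a valid NUL-terminated string.
unsafe fn collect_args_until_null(argc: usize, argv: *const *const c_char) -> Vec<OsString> {
    let mut args = Vec::with_capacity(argc);
    for i in 0..argc {
        // SAFETY: the caller guarantees `argc` readable slots.
        let ptr = unsafe { *argv.add(i) };
        if ptr.is_null() {
            // A parser has shortened argv; treat this as the new end.
            break;
        }
        // SAFETY: non-NULL slots point to valid C strings per the contract.
        let bytes = unsafe { CStr::from_ptr(ptr) }.to_bytes().to_vec();
        args.push(OsString::from_vec(bytes));
    }
    args
}
```

Because the number of arguments returned is only known after walking the array, the length can no longer be taken from argc alone, which is the loss of exact length information mentioned above. The sketch also does nothing about the concurrent-modification concern raised for the second and third options.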
This code was added in 2019 by #66547 (CC @leo60228)
I'd be happy to provide a PR implementing any of these solutions once there is some agreement on how to move forward here. I've created #106001 as a proposed fix for this, which implements option 3.
Meta
rustc --version --verbose:
rustc 1.66.0 (69f9c33d7 2022-12-12)
binary: rustc
commit-hash: 69f9c33d71c871fc16ac445211281c6e7a340943
commit-date: 2022-12-12
host: x86_64-unknown-linux-gnu
release: 1.66.0
LLVM version: 15.0.2
Backtrace
Program received signal SIGSEGV, Segmentation fault.
__strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:79
79 VPCMP $0, (%rdi), %YMMZERO, %k0
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-12.fc37.x86_64 elfutils-libelf-0.188-3.fc37.x86_64 elfutils-libs-0.188-3.fc37.x86_64 libblkid-2.38.1-1.fc37.x86_64 libmount-2.38.1-1.fc37.x86_64 libunwind-1.6.2-5.fc37.x86_64 libzstd-1.5.2-3.fc37.x86_64 pcre2-10.40-1.fc37.1.x86_64 xz-libs-5.2.5-10.fc37.x86_64 zlib-1.2.12-5.fc37.x86_64
(gdb) backtrace full
#0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:79
#1 0x00007ffff7713281 in core::ffi::c_str::CStr::from_ptr () at library/core/src/ffi/c_str.rs:286
#2 std::sys::unix::args::imp::clone::{closure#0} () at library/std/src/sys/unix/args.rs:146
#3 core::iter::adapters::map::map_fold::{closure#0}<isize, std::ffi::os_str::OsString, (), std::sys::unix::args::imp::clone::{closure_env#0}, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<std::ffi::os_str::OsString, alloc::vec::spec_extend::{impl#1}::spec_extend::{closure_env#0}<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>, alloc::alloc::Global>>> () at library/core/src/iter/adapters/map.rs:84
#4 core::iter::traits::iterator::Iterator::fold<core::ops::range::Range<isize>, (), core::iter::adapters::map::map_fold::{closure_env#0}<isize, std::ffi::os_str::OsString, (), std::sys::unix::args::imp::clone::{closure_env#0}, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<std::ffi::os_str::OsString, alloc::vec::spec_extend::{impl#1}::spec_extend::{closure_env#0}<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>, alloc::alloc::Global>>>>
() at library/core/src/iter/traits/iterator.rs:2414
#5 core::iter::adapters::map::{impl#2}::fold<std::ffi::os_str::OsString, core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}, (), core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<std::ffi::os_str::OsString, alloc::vec::spec_extend::{impl#1}::spec_extend::{closure_env#0}<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>, alloc::alloc::Global>>> () at library/core/src/iter/adapters/map.rs:124
#6 core::iter::traits::iterator::Iterator::for_each<core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>, alloc::vec::spec_extend::{impl#1}::spec_extend::{closure_env#0}<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>, alloc::alloc::Global>> () at library/core/src/iter/traits/iterator.rs:831
#7 alloc::vec::spec_extend::{impl#1}::spec_extend<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>, alloc::alloc::Global> () at library/alloc/src/vec/spec_extend.rs:40
#8 alloc::vec::spec_from_iter_nested::{impl#1}::from_iter<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>> () at library/alloc/src/vec/spec_from_iter_nested.rs:62
#9 alloc::vec::spec_from_iter::{impl#0}::from_iter<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>> () at library/alloc/src/vec/spec_from_iter.rs:33
#10 alloc::vec::{impl#18}::from_iter<std::ffi::os_str::OsString, core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>> ()
at library/alloc/src/vec/mod.rs:2757
#11 core::iter::traits::iterator::Iterator::collect<core::iter::adapters::map::Map<core::ops::range::Range<isize>, std::sys::unix::args::imp::clone::{closure_env#0}>, alloc::vec::Vec<std::ffi::os_str::OsString, alloc::alloc::Global>> () at library/core/src/iter/traits/iterator.rs:1836
#12 std::sys::unix::args::imp::clone () at library/std/src/sys/unix/args.rs:144
#13 std::sys::unix::args::imp::args () at library/std/src/sys/unix/args.rs:129
#14 std::sys::unix::args::args () at library/std/src/sys/unix/args.rs:19
#15 std::env::args_os () at library/std/src/env.rs:792
#16 0x00007ffff7713151 in std::env::args () at library/std/src/env.rs:757
[...]
Also
(gdb) info registers
rax 0x0 0
rbx 0x8 8
rcx 0x6b6e6973656b6166 7741240753840152934
rdx 0x8 8
rsi 0x6b6e6973656b6166 7741240753840152934
rdi 0x0 0
rbp 0x0 0x0
rsp 0x7fffffffc398 0x7fffffffc398
r8 0x7ffff7cacce0 140737350651104
r9 0x40 64
r10 0x0 0
r11 0x20 32
r12 0x55555556e890 93824992340112
r13 0x4 4
r14 0x5 5
r15 0x5555558748e0 93824995510496
rip 0x7ffff7c4093c 0x7ffff7c4093c <__strlen_evex+28>