SIGSEGV and SIGBUS cause an exit signal of SIGILL

[As demonstrated on Playpen](http://is.gd/QRIaZ5):

``` rust
fn main() {
    unsafe {*(0 as *mut u32) = 0};
}
```

```
playpen: application terminated abnormally with signal 4 (Illegal instruction)
```

The reason is that, by default, UNIX blocks a signal while its handler is running. However, [this particular signal handler](https://github.com/rust-lang/rust/blob/master/src/libstd/sys/unix/stack_overflow.rs#L66) tries to re-raise the signal once it's done processing. Because the signal is still blocked at that point, the re-raised signal just stays pending and the default action never runs, so execution continues with the next line, `intrinsics::abort()`, which induces `SIGILL`. (The code resets the _signal disposition_ to avoid re-entering the same handler, but that's not enough; the signal also needs to be removed from the _mask of blocked signals_.)
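The blocking behavior can be observed without crashing anything. The following is a minimal sketch, not the code from `stack_overflow.rs`: it uses `SIGTERM` as a harmless stand-in for `SIGSEGV`, hand-written bindings instead of the `libc` crate, and hypothetical names (`on_signal`, `reraise_while_blocked`). It assumes `signal()` has BSD semantics (the signal is blocked while its handler runs), as it does on glibc, musl, and macOS.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hand-written bindings (an assumption of this sketch; real code would
// normally use the `libc` crate). `unsafe extern` needs Rust 1.82 or newer;
// on older compilers drop the `unsafe` keyword.
unsafe extern "C" {
    fn signal(signum: i32, handler: Option<extern "C" fn(i32)>) -> usize;
    fn raise(signum: i32) -> i32;
}

// SIGTERM has the same number on Linux, macOS, and the BSDs.
const SIGTERM: i32 = 15;

static ENTRIES: AtomicUsize = AtomicUsize::new(0);
static ENTRIES_AFTER_RAISE: AtomicUsize = AtomicUsize::new(0);

extern "C" fn on_signal(signum: i32) {
    // Only the first invocation re-raises.
    if ENTRIES.fetch_add(1, Ordering::SeqCst) == 0 {
        // The signal is blocked while we are in its handler, so this
        // only marks it as pending and returns immediately.
        unsafe { raise(signum) };
        ENTRIES_AFTER_RAISE.store(ENTRIES.load(Ordering::SeqCst), Ordering::SeqCst);
    }
}

/// Returns (handler entries seen right after the nested `raise`,
/// total handler entries once everything has been delivered).
fn reraise_while_blocked() -> (usize, usize) {
    ENTRIES.store(0, Ordering::SeqCst);
    ENTRIES_AFTER_RAISE.store(0, Ordering::SeqCst);
    unsafe {
        signal(SIGTERM, Some(on_signal));
        raise(SIGTERM);
        // `None` is a null pointer, i.e. SIG_DFL: restore the default.
        signal(SIGTERM, None);
    }
    (
        ENTRIES_AFTER_RAISE.load(Ordering::SeqCst),
        ENTRIES.load(Ordering::SeqCst),
    )
}
```

Under those assumptions, the first element of the result should be 1: the nested `raise` returned without running the handler again. The pending signal is only delivered after the first handler invocation returns, which is why the total ends up at 2.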

There are a couple of possible solutions to this. The straightforward one is to cause the signal not to be masked. I have a [patch for this](https://github.com/geofft/rust/commit/unmask-sigsegv) that depends on my signal-FFI-bindings refactoring in #25784; I can submit it as a PR once that lands.
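As an illustration of the unmasking approach (not the linked patch, whose contents are not reproduced here), the sketch below removes the signal from the thread's blocked mask before re-raising. It again uses `SIGTERM` as a stand-in, hand-written bindings, and hypothetical names; the 128-byte `SigSet` buffer and the `SIG_UNBLOCK` values are assumptions that hold for common Linux, macOS, and BSD targets. To keep the process alive for inspection it leaves the handler installed; the real handler would first reset the disposition to the default so that the re-raise terminates the process.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Opaque stand-in for `sigset_t`: 128 bytes matches glibc and musl and is
// larger than the type on macOS and the BSDs (assumption of this sketch).
#[repr(C, align(8))]
struct SigSet([u8; 128]);

// Hand-written bindings; real code would normally use the `libc` crate.
// `unsafe extern` needs Rust 1.82 or newer.
unsafe extern "C" {
    fn signal(signum: i32, handler: Option<extern "C" fn(i32)>) -> usize;
    fn raise(signum: i32) -> i32;
    fn sigemptyset(set: *mut SigSet) -> i32;
    fn sigaddset(set: *mut SigSet, signum: i32) -> i32;
    fn pthread_sigmask(how: i32, set: *const SigSet, old: *mut SigSet) -> i32;
}

const SIGTERM: i32 = 15;
#[cfg(target_os = "linux")]
const SIG_UNBLOCK: i32 = 1; // x86 and ARM Linux
#[cfg(not(target_os = "linux"))]
const SIG_UNBLOCK: i32 = 2; // macOS and the BSDs

static ENTRIES: AtomicUsize = AtomicUsize::new(0);
static ENTRIES_AFTER_RAISE: AtomicUsize = AtomicUsize::new(0);

extern "C" fn on_signal(signum: i32) {
    if ENTRIES.fetch_add(1, Ordering::SeqCst) == 0 {
        unsafe {
            // Remove the signal from this thread's blocked mask...
            let mut set = SigSet([0; 128]);
            sigemptyset(&mut set);
            sigaddset(&mut set, signum);
            pthread_sigmask(SIG_UNBLOCK, &set, std::ptr::null_mut());
            // ...so that this re-raise is delivered before `raise` returns.
            raise(signum);
        }
        ENTRIES_AFTER_RAISE.store(ENTRIES.load(Ordering::SeqCst), Ordering::SeqCst);
    }
}

/// Returns (handler entries seen right after the nested `raise`,
/// total handler entries).
fn reraise_after_unblocking() -> (usize, usize) {
    ENTRIES.store(0, Ordering::SeqCst);
    ENTRIES_AFTER_RAISE.store(0, Ordering::SeqCst);
    unsafe {
        signal(SIGTERM, Some(on_signal));
        raise(SIGTERM);
        signal(SIGTERM, None); // restore SIG_DFL
    }
    (
        ENTRIES_AFTER_RAISE.load(Ordering::SeqCst),
        ENTRIES.load(Ordering::SeqCst),
    )
}
```

Here the first element of the result should be 2: with the signal unblocked, the nested `raise` is acted on immediately instead of being left pending.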

Another one, which I prefer, is to just let the handler terminate instead of re-raising. There was [discussion about this previously](https://github.com/rust-lang/rust/pull/16388#discussion_r19097299), where it was noted that glibc's manual says this is undefined for "program error signals" like `SIGSEGV` and `SIGBUS`. POSIX agrees with that. However, most platforms in practice define this behavior, and allow you to return from these handlers, so that you can do things like userspace page-fault handling.

For example,
- Google's Breakpad crash-handling library returns from `SIGSEGV` [on Linux](https://code.google.com/p/google-breakpad/source/browse/trunk/src/client/linux/handler/exception_handler.cc#317) and [on Solaris](https://code.google.com/p/google-breakpad/source/browse/trunk/src/client/solaris/handler/exception_handler.cc#168). (On Darwin they use Mach exceptions.)
- The Oracle JVM [relies on being able to return from a `SIGSEGV` handler](http://www.oracle.com/technetwork/java/javase/signals-139944.html): they use it as a trick for stop-the-world pauses with low overhead. (Periodically each thread will do a single read from a special page, which is ~one instruction. If the GC wants to stop the world, it'll unmap the page to cause each thread to fault, and once it's ready to continue the world, it'll remap the page and make all threads return from their signal handlers.) So any UNIX platform where the JVM works should support this.
- [GNU libsigsegv](http://www.gnu.org/software/libsigsegv/) is a library for doing things like userspace paging, and it's predicated on the assumption that returning from `SIGSEGV` is possible. The [PORTING file](http://git.savannah.gnu.org/cgit/libsigsegv.git/tree/PORTING) has a wide list of supported platforms, including Linux, Darwin, FreeBSD, OpenBSD, NetBSD, Solaris, and MinGW.
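The "return from the handler and let the faulting instruction retry" pattern these projects rely on can be sketched roughly as follows. This is an illustration under stated assumptions, not code from any of the projects above: the constants are for 64-bit Linux on x86 or ARM, the bindings are hand-written, the names are hypothetical, and the handler does not inspect the fault address (a real one would use `sigaction` with `SA_SIGINFO` and check `si_addr`).

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hand-written bindings and constants for 64-bit Linux (assumptions of this
// sketch; real code would use the `libc` crate). Needs Rust 1.82 or newer.
unsafe extern "C" {
    fn signal(signum: i32, handler: Option<extern "C" fn(i32)>) -> usize;
    fn mmap(addr: *mut c_void, len: usize, prot: i32, flags: i32, fd: i32, off: i64) -> *mut c_void;
    fn mprotect(addr: *mut c_void, len: usize, prot: i32) -> i32;
    fn munmap(addr: *mut c_void, len: usize) -> i32;
}

const SIGSEGV: i32 = 11;
const PROT_NONE: i32 = 0;
const PROT_READ: i32 = 1;
const PROT_WRITE: i32 = 2;
const MAP_PRIVATE: i32 = 0x02;
const MAP_ANONYMOUS: i32 = 0x20;
const LEN: usize = 4096;

static PAGE_ADDR: AtomicUsize = AtomicUsize::new(0);
static FAULTS: AtomicUsize = AtomicUsize::new(0);

extern "C" fn on_fault(_signum: i32) {
    // A second fault means something other than the expected access went
    // wrong; bail out instead of looping forever.
    if FAULTS.fetch_add(1, Ordering::SeqCst) > 0 {
        std::process::abort();
    }
    let page = PAGE_ADDR.load(Ordering::SeqCst) as *mut c_void;
    // Make the page accessible. When the handler returns, the kernel
    // restarts the faulting instruction, which now succeeds.
    unsafe { mprotect(page, LEN, PROT_READ | PROT_WRITE) };
}

/// Writes to an inaccessible page, lets the SIGSEGV handler fix the mapping,
/// and returns (number of faults handled, value read back).
fn fault_once_and_resume() -> Option<(usize, u32)> {
    unsafe {
        let page = mmap(
            std::ptr::null_mut(),
            LEN,
            PROT_NONE,
            MAP_PRIVATE | MAP_ANONYMOUS,
            -1,
            0,
        );
        if page as usize == usize::MAX {
            return None; // MAP_FAILED
        }
        PAGE_ADDR.store(page as usize, Ordering::SeqCst);
        FAULTS.store(0, Ordering::SeqCst);
        signal(SIGSEGV, Some(on_fault));

        let p = page as *mut u32;
        p.write_volatile(42); // faults once; retried after the handler returns
        let value = p.read_volatile();

        // Leave SIGSEGV at its default disposition (this also replaces any
        // handler the Rust runtime had installed).
        signal(SIGSEGV, None);
        munmap(page, LEN);
        Some((FAULTS.load(Ordering::SeqCst), value))
    }
}
```

On a platform where returning from the handler is supported, the function should report exactly one handled fault and read back the value that was written.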

So I think it's merely the case that returning from a program error handler is left undefined by POSIX, but just about all actual OSes we care about support it.

Returning from the handler has the advantage that, once the disposition is back to the default, the faulting instruction simply runs again and the kernel re-delivers the exact signal to kill the program, with the right siginfo (indicating it died because of a memory error, not because someone manually sent `SIGSEGV`), so the last frame in a coredump is right, `dmesg` prints a line, etc. We can keep the re-raise for unknown platforms, but for platforms where we know that returning from the handler works, that seems both simpler and better. (Breakpad takes this approach for the same reasons.)
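To check from the outside which signal actually terminated a process (the number Playpen reports above), a parent can inspect the child's exit status. The sketch below uses only the standard library; the helper name `terminating_signal` is hypothetical, and a shell that signals itself stands in for a crashing Rust program.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::Command;

/// Runs `script` under `sh -c` and returns the number of the signal that
/// terminated the shell, or `None` if it exited normally.
fn terminating_signal(script: &str) -> std::io::Result<Option<i32>> {
    let status = Command::new("sh").arg("-c").arg(script).status()?;
    Ok(status.signal())
}
```

For example, `terminating_signal("kill -s SEGV $$")` should yield `Some(11)` on Linux and macOS, while a process that dies the way the example at the top does would show up as `Some(4)`. Note that this only reveals the signal number, not the siginfo.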

The final option is just to remove this code. What it does, as far as I can see, is print an error message and die if the segfault was on the guard page, and just die otherwise. I don't think there's a compelling reason to print a message for stack overflow, especially as caught by `SIGSEGV` (it makes more sense if it's caught by stack probes or `morestack`). In any other systems language, overflowing the stack just gets you killed with `SIGSEGV`, and installing a special handler doesn't seem to match the runtime-removal philosophy. It made sense in the `librustrt` world, but I'd argue it's not useful now. But if other people are finding the handler / the error message useful, that's fine.

Cc @Zoxc for advice, as the original author of the segfault handling code.

