Commit 9ddcc38

Refocus unsafe code chapter on unsafe itself.
1 parent 550c8d8 commit 9ddcc38

1 file changed

+121
-77
lines changed

src/doc/trpl/unsafe-code.md

@@ -1,84 +1,128 @@
1-
% Raw Pointers
2-
3-
Rust offers two additional pointer types (*raw pointers*), written as
4-
`*const T` and `*mut T`. They're an approximation of C's `const T*` and `T*`
5-
respectively; indeed, one of their most common uses is for FFI,
6-
interfacing with external C libraries.
7-
8-
Raw pointers have far fewer guarantees than other pointer types
9-
offered by the Rust language and libraries. For example, they
10-
11-
- are not guaranteed to point to valid memory and are not even
12-
guaranteed to be non-null (unlike both `Box` and `&`);
13-
- do not have any automatic clean-up, unlike `Box`, and so require
14-
manual resource management;
15-
- are plain-old-data, that is, they don't move ownership, again unlike
16-
`Box`, hence the Rust compiler cannot protect against bugs like
17-
use-after-free;
18-
- lack any form of lifetimes, unlike `&`, and so the compiler cannot
19-
reason about dangling pointers; and
20-
- have no guarantees about aliasing or mutability other than mutation
21-
not being allowed directly through a `*const T`.
22-
23-
Fortunately, they come with a redeeming feature: the weaker guarantees
24-
mean weaker restrictions. The missing restrictions make raw pointers
25-
appropriate as a building block for implementing things like smart
26-
pointers and vectors inside libraries. For example, `*` pointers are
27-
allowed to alias, allowing them to be used to write shared-ownership
28-
types like reference counted and garbage collected pointers, and even
29-
thread-safe shared memory types (`Rc` and the `Arc` types are both
30-
implemented entirely in Rust).
31-
32-
There are two things that you are required to be careful about
33-
(i.e. require an `unsafe { ... }` block) with raw pointers:
34-
35-
- dereferencing: they can have any value: so possible results include
36-
a crash, a read of uninitialised memory, a use-after-free, or
37-
reading data as normal.
38-
- pointer arithmetic via the `offset` [intrinsic](#intrinsics) (or
39-
`.offset` method): this intrinsic uses so-called "in-bounds"
40-
arithmetic, that is, it is only defined behaviour if the result is
41-
inside (or one-byte-past-the-end) of the object from which the
42-
original pointer came.
43-
44-
The latter assumption allows the compiler to optimize more
45-
effectively. As can be seen, actually *creating* a raw pointer is not
46-
unsafe, and neither is converting to an integer.
47-
48-
### References and raw pointers
49-
50-
At runtime, a raw pointer `*` and a reference pointing to the same
51-
piece of data have an identical representation. In fact, an `&T`
52-
reference will implicitly coerce to an `*const T` raw pointer in safe code
53-
and similarly for the `mut` variants (both coercions can be performed
54-
explicitly with, respectively, `value as *const T` and `value as *mut T`).
55-
56-
Going the opposite direction, from `*const` to a reference `&`, is not
57-
safe. A `&T` is always valid, and so, at a minimum, the raw pointer
58-
`*const T` has to point to a valid instance of type `T`. Furthermore,
59-
the resulting pointer must satisfy the aliasing and mutability laws of
60-
references. The compiler assumes these properties are true for any
61-
references, no matter how they are created, and so any conversion from
62-
raw pointers is asserting that they hold. The programmer *must*
63-
guarantee this.
64-
65-
The recommended method for the conversion is
1+
% Unsafe Code
662

3+
Rust’s main draw is its powerful static guarantees about behavior. But safety
4+
checks are conservative by nature: there are some programs that are actually
5+
safe, but the compiler is not able to verify this is true. To write these kinds
6+
of programs, we need to tell the compiler to relax its restrictions a bit. For
7+
this, Rust has a keyword, `unsafe`. Code using `unsafe` has fewer restrictions
8+
than normal code does.
9+
10+
Let’s go over the syntax, and then we’ll talk semantics. `unsafe` is used in
11+
two contexts. The first one is to mark a function as unsafe:
12+
13+
```rust
14+
unsafe fn danger_will_robinson() {
15+
// scary stuff
16+
}
6717
```
68-
let i: u32 = 1;
69-
// explicit cast
70-
let p_imm: *const u32 = &i as *const u32;
71-
let mut m: u32 = 2;
72-
// implicit coercion
73-
let p_mut: *mut u32 = &mut m;
7418

19+
For example, every foreign function called through [FFI][ffi] is treated as `unsafe`.
20+
The second use of `unsafe` is an unsafe block:
21+
22+
[ffi]: ffi.html
23+
24+
```rust
7525
unsafe {
76-
let ref_imm: &u32 = &*p_imm;
77-
let ref_mut: &mut u32 = &mut *p_mut;
26+
// scary stuff
7827
}
7928
```
8029

81-
The `&*x` dereferencing style is preferred to using a `transmute`.
82-
The latter is far more powerful than necessary, and the more
83-
restricted operation is harder to use incorrectly; for example, it
84-
requires that `x` is a pointer (unlike `transmute`).
30+
It’s important to be able to explicitly delineate code that may have bugs that
31+
cause big problems. If a Rust program segfaults, you can be sure it’s somewhere
32+
in the sections marked `unsafe`.
33+
34+
# What does ‘safe’ mean?
35+
36+
Safe, in the context of Rust, means “doesn’t do anything unsafe.” Easy!
37+
38+
Okay, let’s try again: what is not safe to do? Here’s a list:
39+
40+
* Data races
41+
* Dereferencing a null/dangling raw pointer
42+
* Reads of [undef][undef] (uninitialized) memory
43+
* Breaking the [pointer aliasing rules][aliasing] with raw pointers.
44+
* `&mut T` and `&T` follow LLVM’s scoped [noalias][noalias] model, except if
45+
the `&T` contains an `UnsafeCell<U>`. Unsafe code must not violate these
46+
aliasing guarantees.
47+
* Mutating an immutable value/reference without `UnsafeCell<U>`
48+
* Invoking undefined behavior via compiler intrinsics:
49+
* Indexing outside of the bounds of an object with `std::ptr::offset`
50+
(`offset` intrinsic), with
51+
the exception of one byte past the end which is permitted.
52+
* Using `std::ptr::copy_nonoverlapping_memory` (`memcpy32`/`memcpy64`
53+
intrinsics) on overlapping buffers
54+
* Invalid values in primitive types, even in private fields/locals:
55+
* Null/dangling references or boxes
56+
* A value other than `false` (0) or `true` (1) in a `bool`
57+
* A discriminant in an `enum` not included in its type definition
58+
* A value in a `char` which is a surrogate or above `char::MAX`
59+
* Non-UTF-8 byte sequences in a `str`
60+
* Unwinding into Rust from foreign code or unwinding from Rust into foreign
61+
code.
62+
63+
[noalias]: http://llvm.org/docs/LangRef.html#noalias
64+
[undef]: http://llvm.org/docs/LangRef.html#undefined-values
65+
[aliasing]: http://llvm.org/docs/LangRef.html#pointer-aliasing-rules
66+
67+
Whew! That’s a bunch of stuff. It’s also important to notice all kinds of
68+
behaviors that are certainly bad, but are expressly _not_ unsafe:
69+
70+
* Deadlocks
71+
* Reading data from private fields
72+
* Leaks due to reference count cycles
73+
* Exiting without calling destructors
74+
* Sending signals
75+
* Accessing/modifying the file system
76+
* Integer overflow
77+
78+
Rust cannot prevent all kinds of software problems. Buggy code can and will be
79+
written in Rust. These things aren’t great, but they don’t qualify as `unsafe`
80+
specifically.
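
As a rough illustration of the list above, the sketch below shows two of the "bad but not unsafe" behaviors written entirely in safe code: handling integer overflow explicitly and skipping a destructor. The function names `overflow_demo` and `leak_demo` are hypothetical and exist only for this example.

```rust
use std::mem;

/// Returns the checked and the wrapping result of adding 1 to `u8::MAX`.
/// Neither call needs `unsafe`: overflow is a logic concern, not a
/// memory-safety concern.
fn overflow_demo() -> (Option<u8>, u8) {
    let max = u8::MAX;
    (max.checked_add(1), max.wrapping_add(1))
}

/// Leaks a vector on purpose. `mem::forget` is a safe function, so the
/// destructor never runs and no `unsafe` block is required.
fn leak_demo() {
    let v = vec![1, 2, 3];
    mem::forget(v);
}

fn main() {
    let (checked, wrapped) = overflow_demo();
    println!("checked: {:?}, wrapped: {}", checked, wrapped);
    leak_demo();
}
```

Both functions compile without any `unsafe`, which is the point: the compiler accepts them even though a leak or an unexpected wrap-around may well be a bug.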
81+
82+
# Unsafe Superpowers
83+
84+
In both unsafe functions and unsafe blocks, Rust will let you do three things
85+
that you normally cannot do. Just three. Here they are:
86+
87+
1. Access or update a [static mutable variable][static].
88+
2. Dereference a raw pointer.
89+
3. Call unsafe functions. This is the most powerful ability.
90+
91+
That’s it. It’s important that `unsafe` does not, for example, ‘turn off the
92+
borrow checker’. Adding `unsafe` to some random Rust code doesn’t change its
93+
semantics; it won’t just start accepting anything.
94+
95+
But it will let you write things that _do_ break some of the rules. Let’s go
96+
over these three abilities in order.
97+
98+
## Access or update a `static mut`
99+
100+
Rust has a feature called ‘`static mut`’ which allows for mutable global state.
101+
Doing so can cause a data race, and as such is inherently not safe. For more
102+
details, see the [static][static] section of the book.
103+
104+
[static]: static.html
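
A minimal sketch of what this looks like in practice, assuming a hypothetical global named `COUNTER` and a helper named `bump` (neither comes from the book):

```rust
// Hypothetical global counter, introduced only for this illustration.
static mut COUNTER: u32 = 0;

/// Increments the global counter and returns its new value.
/// Every read or write of a `static mut` has to sit inside `unsafe`,
/// because the compiler cannot rule out a data race on it.
fn bump() -> u32 {
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}

fn main() {
    let first = bump();
    let second = bump();
    println!("{} then {}", first, second);
}
```

This sketch is only sound because it runs on a single thread; calling `bump` from several threads at once would be exactly the kind of data race the text warns about.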
105+
106+
## Dereference a raw pointer
107+
108+
Raw pointers let you do arbitrary pointer arithmetic, and can cause a number of
109+
different memory safety and security issues. In some senses, the ability to
110+
dereference an arbitrary pointer is one of the most dangerous things you can
111+
do. For more on raw pointers, see [their section of the book][rawpointers].
112+
113+
[rawpointers]: raw-pointers.html
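
As a small sketch of the difference between creating and dereferencing a raw pointer, the hypothetical helpers below (`read_through_raw` and `write_through_raw` are names chosen for this example) build the pointers in safe code and use `unsafe` only for the dereference:

```rust
/// Reads a value through a raw pointer derived from a reference.
fn read_through_raw(value: &u32) -> u32 {
    // Creating the raw pointer is safe: `&u32` coerces to `*const u32`.
    let p: *const u32 = value;
    // Dereferencing it is not; here it is sound because `p` came from a
    // valid reference that is still alive.
    unsafe { *p }
}

/// Writes a value through a raw pointer derived from a mutable reference.
fn write_through_raw(slot: &mut u32, new_value: u32) {
    let p: *mut u32 = slot;
    // Sound for the same reason: `p` points at a live, exclusively
    // borrowed `u32`.
    unsafe {
        *p = new_value;
    }
}

fn main() {
    let mut x = 1;
    write_through_raw(&mut x, 5);
    println!("{}", read_through_raw(&x));
}
```

In both helpers the pointer is known to be valid because it was just made from a reference; with a pointer of unknown origin, the same dereference could crash or read garbage.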
114+
115+
## Call unsafe functions
116+
117+
This last ability works with both aspects of `unsafe`: you can only call
118+
functions marked `unsafe` from inside an unsafe block.
119+
120+
This ability is powerful and varied. Rust exposes some [compiler
121+
intrinsics][intrinsics] as unsafe functions, and some unsafe functions bypass
122+
safety checks, trading safety for speed.
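
To make the "trading safety for speed" idea concrete, here is a hedged sketch. `get_unchecked_demo` and `first_or_zero` are hypothetical names; the only standard-library call used is the slice method `get_unchecked`, which skips the bounds check.

```rust
/// Hypothetical unsafe function for illustration.
/// The caller must guarantee that `index < slice.len()`.
unsafe fn get_unchecked_demo(slice: &[i32], index: usize) -> i32 {
    // `get_unchecked` performs no bounds check, so an out-of-range index
    // would be undefined behavior rather than a panic.
    unsafe { *slice.get_unchecked(index) }
}

/// Safe wrapper: checks the precondition itself, then calls the unsafe
/// function from inside an unsafe block.
fn first_or_zero(slice: &[i32]) -> i32 {
    if slice.is_empty() {
        0
    } else {
        // The slice is non-empty, so index 0 is in bounds.
        unsafe { get_unchecked_demo(slice, 0) }
    }
}

fn main() {
    println!("{}", first_or_zero(&[10, 20, 30]));
    println!("{}", first_or_zero(&[]));
}
```

Wrapping the unsafe call in a safe function that establishes the precondition is one common way to keep the `unsafe` surface small; it is a design choice shown here, not something the chapter prescribes.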
123+
124+
I’ll repeat: just because you _can_ do arbitrary things in unsafe blocks
125+
and functions doesn’t mean you should. The compiler will act as though you’re
126+
upholding its invariants, so be careful!
127+
128+
[intrinsics]: intrinsics.html
