|
1 |
| -% Raw Pointers |
2 |
| - |
3 |
| -Rust offers two additional pointer types (*raw pointers*), written as |
4 |
| -`*const T` and `*mut T`. They're an approximation of C's `const T*` and `T*` |
5 |
| -respectively; indeed, one of their most common uses is for FFI, |
6 |
| -interfacing with external C libraries. |
7 |
| - |
8 |
| -Raw pointers have far fewer guarantees than other pointer types |
9 |
| -offered by the Rust language and libraries. For example, they |
10 |
| - |
11 |
| -- are not guaranteed to point to valid memory and are not even |
12 |
| - guaranteed to be non-null (unlike both `Box` and `&`); |
13 |
| -- do not have any automatic clean-up, unlike `Box`, and so require |
14 |
| - manual resource management; |
15 |
| -- are plain-old-data, that is, they don't move ownership, again unlike |
16 |
| - `Box`, hence the Rust compiler cannot protect against bugs like |
17 |
| - use-after-free; |
18 |
| -- lack any form of lifetimes, unlike `&`, and so the compiler cannot |
19 |
| - reason about dangling pointers; and |
20 |
| -- have no guarantees about aliasing or mutability other than mutation |
21 |
| - not being allowed directly through a `*const T`. |
22 |
| - |
23 |
| -Fortunately, they come with a redeeming feature: the weaker guarantees |
24 |
| -mean weaker restrictions. The missing restrictions make raw pointers |
25 |
| -appropriate as a building block for implementing things like smart |
26 |
| -pointers and vectors inside libraries. For example, `*` pointers are |
27 |
| -allowed to alias, allowing them to be used to write shared-ownership |
28 |
| -types like reference counted and garbage collected pointers, and even |
29 |
| -thread-safe shared memory types (`Rc` and the `Arc` types are both |
30 |
| -implemented entirely in Rust). |
31 |
| - |
32 |
| -There are two things that you are required to be careful about |
33 |
| -(i.e. require an `unsafe { ... }` block) with raw pointers: |
34 |
| - |
35 |
| -- dereferencing: they can have any value, so possible results include |
36 |
| - a crash, a read of uninitialised memory, a use-after-free, or |
37 |
| - reading data as normal. |
38 |
| -- pointer arithmetic via the `offset` [intrinsic](#intrinsics) (or |
39 |
| - `.offset` method): this intrinsic uses so-called "in-bounds" |
40 |
| - arithmetic, that is, it is only defined behaviour if the result is |
41 |
| - inside (or one-byte-past-the-end) of the object from which the |
42 |
| - original pointer came. |
43 |
| - |
44 |
| -The latter assumption allows the compiler to optimize more |
45 |
| -effectively. As can be seen, actually *creating* a raw pointer is not |
46 |
| -unsafe, and neither is converting to an integer. |
47 |
| - |
48 |
| -### References and raw pointers |
49 |
| - |
50 |
| -At runtime, a raw pointer `*` and a reference pointing to the same |
51 |
| -piece of data have an identical representation. In fact, an `&T` |
52 |
| -reference will implicitly coerce to an `*const T` raw pointer in safe code |
53 |
| -and similarly for the `mut` variants (both coercions can be performed |
54 |
| -explicitly with, respectively, `value as *const T` and `value as *mut T`). |
55 |
| - |
56 |
| -Going the opposite direction, from `*const` to a reference `&`, is not |
57 |
| -safe. A `&T` is always valid, and so, at a minimum, the raw pointer |
58 |
| -`*const T` has to point to a valid instance of type `T`. Furthermore, |
59 |
| -the resulting pointer must satisfy the aliasing and mutability laws of |
60 |
| -references. The compiler assumes these properties are true for any |
61 |
| -references, no matter how they are created, and so any conversion from |
62 |
| -raw pointers is asserting that they hold. The programmer *must* |
63 |
| -guarantee this. |
64 |
| - |
65 |
| -The recommended method for the conversion is |
| 1 | +% Unsafe Code |
66 | 2 |
|
| 3 | +Rust’s main draw is its powerful static guarantees about behavior. But safety |
| 4 | +checks are conservative by nature: there are some programs that are actually |
| 5 | +safe, but the compiler is not able to verify this is true. To write these kinds |
| 6 | +of programs, we need to tell the compiler to relax its restrictions a bit. For |
| 7 | +this, Rust has a keyword, `unsafe`. Code using `unsafe` has fewer restrictions
| 8 | +than normal code does.
| 9 | + |
| 10 | +Let’s go over the syntax, and then we’ll talk semantics. `unsafe` is used in |
| 11 | +two contexts. The first one is to mark a function as unsafe: |
| 12 | + |
| 13 | +```rust |
| 14 | +unsafe fn danger_will_robinson() { |
| 15 | + // scary stuff |
| 16 | +} |
67 | 17 | ```
|
68 |
| -let i: u32 = 1; |
69 |
| -// explicit cast |
70 |
| -let p_imm: *const u32 = &i as *const u32; |
71 |
| -let mut m: u32 = 2; |
72 |
| -// implicit coercion |
73 |
| -let p_mut: *mut u32 = &mut m; |
74 | 18 |
|
| 19 | +Foreign functions called through [FFI][ffi] are treated as `unsafe` by default, for example.
| 20 | +The second use of `unsafe` is an unsafe block: |
| 21 | + |
| 22 | +[ffi]: ffi.html |
| 23 | + |
| 24 | +```rust |
75 | 25 | unsafe {
|
76 |
| - let ref_imm: &u32 = &*p_imm; |
77 |
| - let ref_mut: &mut u32 = &mut *p_mut; |
| 26 | + // scary stuff |
78 | 27 | }
|
79 | 28 | ```
|
80 | 29 |
|
81 |
| -The `&*x` dereferencing style is preferred to using a `transmute`. |
82 |
| -The latter is far more powerful than necessary, and the more |
83 |
| -restricted operation is harder to use incorrectly; for example, it |
84 |
| -requires that `x` is a pointer (unlike `transmute`). |
| 30 | +It’s important to be able to explicitly delineate code that may have bugs that |
| 31 | +cause big problems. If a Rust program segfaults, you can be sure the cause is
| 32 | +somewhere in the sections marked `unsafe`.
| 33 | + |
| 34 | +# What does ‘safe’ mean? |
| 35 | + |
| 36 | +Safe, in the context of Rust, means “doesn’t do anything unsafe.” Easy! |
| 37 | + |
| 38 | +Okay, let’s try again: what is not safe to do? Here’s a list: |
| 39 | + |
| 40 | +* Data races |
| 41 | +* Dereferencing a null/dangling raw pointer |
| 42 | +* Reads of [undef][undef] (uninitialized) memory |
| 43 | +* Breaking the [pointer aliasing rules][aliasing] with raw pointers
| 44 | +* `&mut T` and `&T` follow LLVM’s scoped [noalias][noalias] model, except if |
| 45 | + the `&T` contains an `UnsafeCell<U>`. Unsafe code must not violate these |
| 46 | + aliasing guarantees. |
| 47 | +* Mutating an immutable value/reference without `UnsafeCell<U>` |
| 48 | +* Invoking undefined behavior via compiler intrinsics: |
| 49 | + * Indexing outside of the bounds of an object with `std::ptr::offset` |
| 50 | + (`offset` intrinsic), with |
| 51 | + the exception of one byte past the end which is permitted. |
| 52 | + * Using `std::ptr::copy_nonoverlapping_memory` (`memcpy32`/`memcpy64` |
| 53 | + intrinsics) on overlapping buffers |
| 54 | +* Invalid values in primitive types, even in private fields/locals: |
| 55 | + * Null/dangling references or boxes |
| 56 | + * A value other than `false` (0) or `true` (1) in a `bool` |
| 57 | + * A discriminant in an `enum` not included in its type definition |
| 58 | + * A value in a `char` which is a surrogate or above `char::MAX` |
| 59 | + * Non-UTF-8 byte sequences in a `str` |
| 60 | +* Unwinding into Rust from foreign code or unwinding from Rust into foreign |
| 61 | + code. |
| 62 | + |
| 63 | +[noalias]: http://llvm.org/docs/LangRef.html#noalias |
| 64 | +[undef]: http://llvm.org/docs/LangRef.html#undefined-values |
| 65 | +[aliasing]: http://llvm.org/docs/LangRef.html#pointer-aliasing-rules |
| 66 | + |
| 67 | +Whew! That’s a bunch of stuff. It’s also important to note the kinds of
| 68 | +behavior that are certainly bad, but are expressly _not_ unsafe:
| 69 | + |
| 70 | +* Deadlocks |
| 71 | +* Reading data from private fields |
| 72 | +* Leaks due to reference count cycles |
| 73 | +* Exiting without calling destructors |
| 74 | +* Sending signals |
| 75 | +* Accessing/modifying the file system |
| 76 | +* Integer overflow |
| 77 | + |
| 78 | +Rust cannot prevent all kinds of software problems. Buggy code can and will be |
| 79 | +written in Rust. These things aren’t great, but they don’t qualify as `unsafe`
| 80 | +specifically. |
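
As a rough illustration of two items from the list above, here is a minimal sketch that skips a destructor and overflows an integer without writing `unsafe` anywhere. The `Guard` type and both function names are hypothetical and exist only for this example.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

// Hypothetical type whose destructor records that it ran.
struct Guard(Rc<Cell<bool>>);

impl Drop for Guard {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

/// Forgets a `Guard` and reports whether its destructor ran.
/// `mem::forget` is a safe function: skipping a destructor leaks
/// resources, but it does not violate memory safety.
fn destructor_ran_after_forget() -> bool {
    let flag = Rc::new(Cell::new(false));
    let guard = Guard(flag.clone());
    mem::forget(guard);
    flag.get()
}

/// Integer overflow has defined results in safe code: wrapping
/// arithmetic wraps around, and checked arithmetic reports failure.
fn overflow_results() -> (u8, Option<u8>) {
    (u8::MAX.wrapping_add(1), u8::MAX.checked_add(1))
}
```

Both functions are ordinary safe Rust. The first returns `false` because the destructor never runs, and the second returns `(0, None)`.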
| 81 | + |
| 82 | +# Unsafe Superpowers |
| 83 | + |
| 84 | +In both unsafe functions and unsafe blocks, Rust will let you do three things
| 85 | +that you normally cannot do. Just three. Here they are:
| 86 | + |
| 87 | +1. Access or update a [static mutable variable][static]. |
| 88 | +2. Dereference a raw pointer. |
| 89 | +3. Call unsafe functions. This is the most powerful ability. |
| 90 | + |
| 91 | +That’s it. It’s important to understand that `unsafe` does not, for example,
| 92 | +‘turn off the borrow checker’. Adding `unsafe` to some random Rust code doesn’t
| 93 | +change its semantics; it won’t just start accepting anything.
| 94 | + |
| 95 | +But it will let you write things that _do_ break some of the rules. Let’s go |
| 96 | +over these three abilities in order. |
| 97 | + |
| 98 | +## Access or update a `static mut` |
| 99 | + |
| 100 | +Rust has a feature called ‘`static mut`’ which allows for mutable global state.
| 101 | +Using it can cause a data race, and as such is inherently not safe. For more
| 102 | +details, see the [static][static] section of the book.
| 103 | + |
| 104 | +[static]: static.html |
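
A minimal sketch of what this looks like in practice, assuming a hypothetical global named `COUNTER` and a helper named `bump` (neither comes from the book). Every read and write of the `static mut` has to sit inside an `unsafe` block.

```rust
// Hypothetical global counter, used only for illustration.
static mut COUNTER: u32 = 0;

/// Increments the global counter and returns the new value.
/// This is only sound if no other thread touches `COUNTER` at the
/// same time; the compiler does not check that for us.
fn bump() -> u32 {
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}
```

Called twice from a single thread, `bump()` returns 1 and then 2. Called from several threads without synchronization, it would be a data race.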
| 105 | + |
| 106 | +## Dereference a raw pointer |
| 107 | + |
| 108 | +Raw pointers let you do arbitrary pointer arithmetic, and can cause a number of |
| 109 | +different memory safety and security issues. In some senses, the ability to |
| 110 | +dereference an arbitrary pointer is one of the most dangerous things you can |
| 111 | +do. For more on raw pointers, see [their section of the book][rawpointers]. |
| 112 | + |
| 113 | +[rawpointers]: raw-pointers.html |
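
As a small sketch of the split between creating and dereferencing raw pointers (the function name `read_and_write` is made up for this example): building the pointers needs no `unsafe`, while reading or writing through them does.

```rust
/// Reads through a `*const` pointer and writes through a `*mut` pointer.
fn read_and_write() -> (u32, u32) {
    let i: u32 = 1;
    let mut m: u32 = 2;

    // Creating raw pointers from references is safe.
    let p_imm: *const u32 = &i;
    let p_mut: *mut u32 = &mut m;

    // Dereferencing them is not: we are asserting that both pointers
    // are valid, which holds here because `i` and `m` are still alive.
    unsafe {
        *p_mut += 10;
        (*p_imm, *p_mut)
    }
}
```

Here the call returns `(1, 12)`. The same dereferences on a null or dangling pointer would be undefined behavior.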
| 114 | + |
| 115 | +## Call unsafe functions |
| 116 | + |
| 117 | +This last ability works with both aspects of `unsafe`: you can only call |
| 118 | +functions marked `unsafe` from inside an unsafe block. |
| 119 | + |
| 120 | +This ability is powerful and varied. Rust exposes some [compiler |
| 121 | +intrinsics][intrinsics] as unsafe functions, and some unsafe functions bypass |
| 122 | +safety checks, trading safety for speed. |
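
One way to picture the speed-for-safety trade is the sketch below. It uses the standard slice method `get_unchecked`, which skips bounds checking; `get_fast` and `last_element` are hypothetical helpers written for this example.

```rust
/// Hypothetical unsafe function: the caller must guarantee
/// that `idx < data.len()`.
unsafe fn get_fast(data: &[u32], idx: usize) -> u32 {
    // `get_unchecked` skips the bounds check that `data[idx]` performs.
    unsafe { *data.get_unchecked(idx) }
}

/// Safe wrapper: checks the precondition once, then calls the
/// unsafe function from inside an unsafe block.
fn last_element(data: &[u32]) -> Option<u32> {
    if data.is_empty() {
        return None;
    }
    // Sound because `data.len() - 1` is in bounds for a non-empty slice.
    Some(unsafe { get_fast(data, data.len() - 1) })
}
```

The wrapper keeps the unsafe call behind a safe interface, so `last_element(&[1, 2, 3])` gives `Some(3)` and an empty slice gives `None`.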
| 123 | + |
| 124 | +I’ll repeat: just because you _can_ do arbitrary things in unsafe blocks
| 125 | +and functions doesn’t mean you should. The compiler will act as though you’re
| 126 | +upholding its invariants, so be careful!
| 127 | + |
| 128 | +[intrinsics]: intrinsics.html |