Commit d420f62

glossary: sort alphabetically
1 parent 148e24e commit d420f62

1 file changed

+111
-111
lines changed

reference/src/glossary.md

+111-111
Original file line number | Diff line number | Diff line change
@@ -55,7 +55,77 @@ somewhat differently from this definition. However, that's considered a low
5555
level detail of a particular Rust implementation. When programming Rust, the
5656
Abstract Rust Machine is intended to operate according to the definition here.
5757

58-
### (Pointer) Provenance
58+
### Interior mutability
59+
60+
*Interior Mutation* means mutating memory where there also exists a live shared reference pointing to the same memory; or mutating memory through a pointer derived from a shared reference.
61+
"live" here means a value that will be "used again" later.
62+
"derived from" means that the pointer was obtained by casting a shared reference and potentially adding an offset.
63+
This is not yet precisely defined, which will be fixed as part of developing a precise aliasing model.
64+
65+
Finding live shared references propagates recursively through references, but not through raw pointers.
66+
So, for example, if data immediately pointed to by a `&T` or `& &mut T` is mutated, that's interior mutability.
67+
If data immediately pointed to by a `*const T` or `&*const T` is mutated, that's *not* interior mutability.
68+
69+
*Interior mutability* refers to the ability to perform interior mutation without causing UB.
70+
All interior mutation in Rust has to happen inside an [`UnsafeCell`](https://doc.rust-lang.org/core/cell/struct.UnsafeCell.html), so all data structures that have interior mutability must (directly or indirectly) use `UnsafeCell` for this purpose.
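
As a minimal sketch of the rule above (the `Counter` type and the function names are hypothetical, introduced only for illustration), the following shows mutation through a shared reference that is permitted because the data sits inside an `UnsafeCell`, alongside the safe `Cell` wrapper that builds on it:

```rust
use std::cell::{Cell, UnsafeCell};

/// Hypothetical counter type, used only for illustration.
pub struct Counter {
    hits: UnsafeCell<u32>,
}

impl Counter {
    pub fn new() -> Self {
        Counter { hits: UnsafeCell::new(0) }
    }

    /// Mutates through `&self`. This is interior mutation, and it is only
    /// permitted because the mutated data lives inside an `UnsafeCell`.
    pub fn bump(&self) -> u32 {
        // SAFETY: `UnsafeCell` makes `Counter` `!Sync`, so no other thread can
        // call this concurrently, and no reference into `hits` ever escapes,
        // so no other reference to the inner value is live here.
        unsafe {
            let p = self.hits.get();
            *p += 1;
            *p
        }
    }
}

/// `Cell` is a safe abstraction over `UnsafeCell`: mutation via `&Cell<T>`.
pub fn cell_demo() -> u32 {
    let c = Cell::new(1);
    let shared = &c;
    shared.set(shared.get() + 41);
    c.get()
}

fn main() {
    let counter = Counter::new();
    let a = &counter;
    let b = &counter; // two live shared references to the same memory
    a.bump();
    println!("{}", b.bump());
    println!("{}", cell_demo());
}
```

Performing the same write through a `&u32` that is not wrapped in an `UnsafeCell` (for example by casting the reference to `*mut u32`) would be Undefined Behavior.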
71+
72+
### Layout
73+
[layout]: #layout
74+
75+
The *layout* of a type defines its size and alignment as well as the offsets of its subobjects (e.g. fields of structs/unions/enums/... or elements of arrays).
76+
Moreover, the layout of a type records its *function call ABI* (or just *ABI* for short): how the type is passed *by value* across a function boundary.
77+
78+
Note: Originally, *layout* and *representation* were treated as synonyms, and Rust language features like the `#[repr]` attribute reflect this.
79+
In this document, *layout* and *representation* are not synonyms.
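
To make the size, alignment, and offset parts of a layout concrete, here is a small sketch. The `Header` type and the helper functions are hypothetical; `#[repr(C)]` is used only so that the field order is predictable, and the numbers mentioned in the note below assume a target where `u32` has alignment 4.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical type for illustration; `#[repr(C)]` fixes the field order.
#[repr(C)]
pub struct Header {
    pub tag: u8,
    pub len: u32,
}

/// Byte offset of the `len` field within `Header`, computed from addresses.
pub fn len_offset() -> usize {
    let h = Header { tag: 0, len: 0 };
    let base = std::ptr::addr_of!(h) as usize;
    let field = std::ptr::addr_of!(h.len) as usize;
    field - base
}

pub fn header_size() -> usize {
    size_of::<Header>()
}

pub fn header_align() -> usize {
    align_of::<Header>()
}

fn main() {
    println!(
        "size = {}, align = {}, offset of len = {}",
        header_size(),
        header_align(),
        len_offset()
    );
}
```

On such a target this reports size 8, alignment 4, and offset 4 for `len`. The function call ABI part of the layout is not observable this way.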
80+
81+
### Niche
82+
83+
The *niche* of a type determines invalid bit-patterns that will be used by layout optimizations.
84+
85+
For example, `&mut T` has at least one niche, the "all zeros" bit-pattern. This
86+
niche is used by layout optimizations like ["`enum` discriminant
87+
elision"](layout/enums.html#discriminant-elision-on-option-like-enums) to
88+
guarantee that `Option<&mut T>` has the same size as `&mut T`.
89+
90+
While all niches are invalid bit-patterns, not all invalid bit-patterns are
91+
niches. For example, "all bits uninitialized" is an invalid bit-pattern for
92+
`&mut T`, but this bit-pattern cannot be used by layout optimizations, and is not a
93+
niche.
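
The size guarantee mentioned above can be observed directly. This is a small sketch with hypothetical helper names; it contrasts `&mut T`, which has a niche, with `u32`, for which every bit-pattern is valid and which therefore has none:

```rust
use std::mem::size_of;

/// True if wrapping `&mut T` in `Option` costs no extra space,
/// i.e. `None` is stored in the niche of `&mut T`.
pub fn option_is_free_for_mut_ref<T>() -> bool {
    size_of::<Option<&mut T>>() == size_of::<&mut T>()
}

/// `u32` has no invalid bit-patterns, so `Option<u32>` needs extra space
/// for the discriminant and this returns false.
pub fn option_is_free_for_u32() -> bool {
    size_of::<Option<u32>>() == size_of::<u32>()
}

fn main() {
    println!("{}", option_is_free_for_mut_ref::<u8>());
    println!("{}", option_is_free_for_u32());
}
```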
94+
95+
### Padding
96+
[padding]: #padding
97+
98+
*Padding* (of a type `T`) refers to the space that the compiler leaves between fields of a struct or enum variant to satisfy alignment requirements, and before/after variants of a union or enum to make all variants equally sized.
99+
100+
Padding can be thought of as the type containing secret fields of type `[Pad; N]` for some hypothetical type `Pad` (of size 1) with the following properties:
101+
* `Pad` is valid for any byte, i.e., it has the same validity invariant as `MaybeUninit<u8>`.
102+
* Copying `Pad` ignores the source byte, and writes *any* value to the target byte. Or, equivalently (in terms of Abstract Machine behavior), copying `Pad` marks the target byte as uninitialized.
103+
104+
Note that padding is a property of the *type* and not the memory: reading from the padding of an `&Foo` (by casting to a byte reference) may produce initialized values if the `&Foo` is pointing to memory that was initialized (for example, if it was originally a byte buffer initialized to `0`), but the moment you perform a typed copy out of that reference you will have uninitialized padding bytes in the copy.
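
As a rough illustration of where padding comes from, the sketch below counts padding bytes as the difference between a type's size and the sum of its field sizes. The `Padded` and `Packed` types and the helper functions are hypothetical, and the counts in the note below assume a target where `u32` has alignment 4.

```rust
use std::mem::size_of;

/// Hypothetical struct: `b` must be 4-aligned, so bytes follow `a` as padding.
#[repr(C)]
pub struct Padded {
    pub a: u8,
    pub b: u32,
}

/// Same fields, but `packed` removes the alignment requirement and the padding.
#[repr(C, packed)]
pub struct Packed {
    pub a: u8,
    pub b: u32,
}

/// Number of bytes of `Padded` that belong to no field.
pub fn padding_in_padded() -> usize {
    size_of::<Padded>() - (size_of::<u8>() + size_of::<u32>())
}

/// Number of bytes of `Packed` that belong to no field.
pub fn padding_in_packed() -> usize {
    size_of::<Packed>() - (size_of::<u8>() + size_of::<u32>())
}

fn main() {
    println!("padded: {} padding byte(s)", padding_in_padded());
    println!("packed: {} padding byte(s)", padding_in_packed());
}
```

On such a target `Padded` has three padding bytes and `Packed` has none. The contents of those three bytes must not be relied upon after a typed copy of a `Padded` value.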
105+
106+
107+
We can also define padding in terms of the [representation relation]:
108+
A byte at index `i` is a padding byte for type `T` if,
109+
for all values `v` and lists of bytes `b` such that `v` and `b` are related at `T` (let's write this `Vrel_T(v, b)`),
110+
changing `b` at index `i` to any other byte yields a `b'` such that `v` and `b'` are related (`Vrel_T(v, b')`).
111+
In other words, the byte at index `i` is entirely ignored by `Vrel_T` (the value relation for `T`), and two lists of bytes that only differ in padding bytes relate to the same value(s), if any.
112+
113+
This definition works fine for product types (structs, tuples, arrays, ...).
114+
The desired notion of "padding byte" for enums and unions is still unclear.
115+
116+
### Place
117+
118+
A *place* (called "lvalue" in C and "glvalue" in C++) is the result of computing a [*place expression*][place-value-expr].
119+
A place is basically a pointer (pointing to some location in memory, potentially carrying [provenance](#pointer-provenance)), but might contain more information such as size or alignment (the details will have to be determined as the Rust Abstract Machine gets specified more precisely).
120+
A place has a type, indicating the type of [values](#value) that it stores.
121+
122+
The key operations on a place are:
123+
* Storing a [value](#value) of the same type in it (when it is used on the left-hand side of an assignment).
124+
* Loading a [value](#value) of the same type from it (through the place-to-value coercion).
125+
* Converting between a place (of type `T`) and a pointer value (of type `&T`, `&mut T`, `*const T` or `*mut T`) using the `&` and `*` operators.
126+
This is also the only way a place can be "stored": by converting it to a value first.
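
The three key operations can be seen in ordinary code. The following sketch (the function name `place_demo` is hypothetical) annotates each statement with the operation it performs on a place:

```rust
/// Walks through stores, loads, and place/pointer conversions.
pub fn place_demo() -> (i32, i32, i32) {
    let mut x = 1; // `x` is a place expression of type `i32`
    let first = x; // load: place-to-value coercion reads the value 1
    x = 2; // store: `x` on the left-hand side of an assignment
    let y = x; // load again, now reading 2

    let p = &mut x; // `&mut` converts the place `x` into a pointer value
    *p = y + 1; // `*` converts the pointer value back into a place, then stores 3

    let raw: *const i32 = &x; // a place can only be "stored" as a pointer value
    // SAFETY: `raw` was just derived from a reference to `x`, which is
    // initialized and still in scope.
    let z = unsafe { *raw }; // load through the place `*raw`
    (first, y, z)
}

fn main() {
    println!("{:?}", place_demo());
}
```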
127+
128+
### Pointer Provenance
59129

60130
The *provenance* of a pointer is used to distinguish pointers that point to the same memory address (i.e., pointers that, when cast to `usize`, will compare equal).
61131
Provenance is extra state that only exists in the Rust Abstract Machine; it is needed to specify program behavior but not present any more when the program runs on real hardware.
@@ -95,19 +165,43 @@ For some more information, see [this document proposing a more precise definitio
95165
Another example of pointer provenance is the "tag" from [Stacked Borrows][stacked-borrows].
96166
For some more information, see [this blog post](https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html).
97167

98-
### Interior mutability
168+
### Representation (relation)
169+
[representation relation]: #representation-relation
99170

100-
*Interior Mutation* means mutating memory where there also exists a live shared reference pointing to the same memory; or mutating memory through a pointer derived from a shared reference.
101-
"live" here means a value that will be "used again" later.
102-
"derived from" means that the pointer was obtained by casting a shared reference and potentially adding an offset.
103-
This is not yet precisely defined, which will be fixed as part of developing a precise aliasing model.
171+
A *representation* of a [value](#value) is a list of bytes that is used to store or "represent" that value in memory.
104172

105-
Finding live shared references propagates recursively through references, but not through raw pointers.
106-
So, for example, if data immediately pointed to by a `&T` or `& &mut T` is mutated, that's interior mutability.
107-
If data immediately pointed to by a `*const T` or `&*const T` is mutated, that's *not* interior mutability.
173+
We also sometimes speak of the *representation of a type*; this should more correctly be called the *representation relation* as it relates values of this type to lists of bytes that represent this value.
174+
The term "relation" here is used in the mathematical sense: the representation relation is a predicate that, given a value and a list of bytes, says whether this value is represented by that list of bytes (`val -> list byte -> Prop`).
108175

109-
*Interior mutability* refers to the ability to perform interior mutation without causing UB.
110-
All interior mutation in Rust has to happen inside an [`UnsafeCell`](https://doc.rust-lang.org/core/cell/struct.UnsafeCell.html), so all data structures that have interior mutability must (directly or indirectly) use `UnsafeCell` for this purpose.
176+
The relation should be functional for a fixed list of bytes (i.e., every list of bytes has at most one associated value).
177+
It is partial in both directions: not all values have a representation (e.g. the mathematical integer `300` has no representation at type `u8`), and not all lists of bytes correspond to a value of a specific type (e.g. lists of the wrong size correspond to no value, and the list consisting of the single byte `0x10` corresponds to no value of type `bool`).
178+
For a fixed value, there can be many representations (e.g., when considering type `#[repr(C)] Pair(u8, u16)`, the second byte is a [padding byte][padding] so changing it does not affect the value represented by a list of bytes).
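
These properties can be modeled with ordinary functions. The sketch below is a simplified model, not the Abstract Machine definition: the function names are hypothetical, bytes are plain `u8` (so uninitialized bytes are not modeled), and a little-endian encoding is assumed for `u16`.

```rust
/// Bytes -> value direction for `bool`: functional and partial.
pub fn decode_bool(bytes: &[u8]) -> Option<bool> {
    match bytes {
        [0] => Some(false),
        [1] => Some(true),
        _ => None, // wrong length, or a byte such as 0x10
    }
}

/// Value -> bytes direction for `u8`, starting from a mathematical integer
/// (modeled as `i64`): partial, since e.g. 300 has no representation.
pub fn encode_u8(v: i64) -> Option<[u8; 1]> {
    if (0..=255).contains(&v) {
        Some([v as u8])
    } else {
        None
    }
}

/// Bytes -> value for `#[repr(C)] Pair(u8, u16)`, assuming little-endian.
/// The byte at index 1 is padding and is ignored.
pub fn decode_pair(bytes: &[u8]) -> Option<(u8, u16)> {
    match bytes {
        [a, _pad, lo, hi] => Some((*a, u16::from_le_bytes([*lo, *hi]))),
        _ => None,
    }
}

fn main() {
    println!("{:?}", decode_bool(&[0x10]));
    println!("{:?}", encode_u8(300));
    println!("{:?}", decode_pair(&[7, 0xFF, 0x34, 0x12]));
}
```

In this model, two byte lists for `Pair` that differ only at index 1 decode to the same value, which is the "many representations for a fixed value" case described above.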
179+
180+
See the [value domain][value-domain] for an example of how values and representation relations can be made more precise.
181+
182+
### Soundness (of code / of a library)
183+
[soundness]: #soundness-of-code--of-a-library
184+
185+
*Soundness* is a type system concept (actually originating from the study of logics) and means that the type system is "correct" in the sense that well-typed programs actually have the desired properties.
186+
For Rust, this means well-typed programs cannot cause [Undefined Behavior][ub].
187+
This promise only extends to safe code however; for `unsafe` code, it is up to the programmer to uphold this contract.
188+
189+
Accordingly, we say that a library (or an individual function) is *sound* if it is impossible for safe code to cause Undefined Behavior using its public API.
190+
Conversely, the library/function is *unsound* if safe code *can* cause Undefined Behavior.
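
A small sketch of the distinction, using two hypothetical functions: both contain `unsafe` code, but only the first lets safe callers trigger Undefined Behavior.

```rust
/// UNSOUND: this is a safe function, yet a safe caller can pass an
/// out-of-bounds index and cause Undefined Behavior.
pub fn get_unsound(data: &[u8], i: usize) -> u8 {
    unsafe { *data.get_unchecked(i) }
}

/// Sound: the bounds check ensures that no input from safe code can
/// reach the unchecked access with an invalid index.
pub fn get_sound(data: &[u8], i: usize) -> Option<u8> {
    if i < data.len() {
        // SAFETY: `i < data.len()` was checked just above.
        Some(unsafe { *data.get_unchecked(i) })
    } else {
        None
    }
}

fn main() {
    let data = [1u8, 2, 3];
    // Only in-bounds calls are made here; `get_unsound(&data, 3)` would be UB.
    println!("{}", get_unsound(&data, 2));
    println!("{:?}", get_sound(&data, 3));
}
```

The unsound version could be repaired either by adding the check or by marking it `unsafe fn` and documenting the in-bounds requirement, which moves the obligation to the caller.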
191+
192+
### Undefined Behavior
193+
[ub]: #undefined-behavior
194+
195+
*Undefined Behavior* is a concept of the contract between the Rust programmer and the compiler:
196+
The programmer promises that the code exhibits no undefined behavior.
197+
In return, the compiler promises to compile the code in a way that the final program does on the real hardware what the source program does according to the Rust Abstract Machine.
198+
If it turns out the program *does* have undefined behavior, the contract is void, and the program produced by the compiler is essentially garbage (in particular, it is not bound by any specification; the program does not even have to be well-formed executable code).
199+
200+
In Rust, the [Nomicon](https://doc.rust-lang.org/nomicon/what-unsafe-does.html) and the [Reference](https://doc.rust-lang.org/reference/behavior-considered-undefined.html) both have a list of behavior that the language considers undefined.
201+
Rust promises that safe code cannot cause Undefined Behavior---the compiler and authors of unsafe code take the burden of this contract on themselves.
202+
For unsafe code, however, the burden is still on the programmer.
203+
204+
Also see: [Soundness][soundness].
111205

112206
### Validity and safety invariant
113207

@@ -146,95 +240,6 @@ Moreover, such unsafe code must not return a non-UTF-8 string to the "outside" o
146240
To summarize: *Data must always be valid, but it only must be safe in safe code.*
147241
For some more information, see [this blog post](https://www.ralfj.de/blog/2018/08/22/two-kinds-of-invariants.html).
148242

149-
### Undefined Behavior
150-
[ub]: #undefined-behavior
151-
152-
*Undefined Behavior* is a concept of the contract between the Rust programmer and the compiler:
153-
The programmer promises that the code exhibits no undefined behavior.
154-
In return, the compiler promises to compile the code in a way that the final program does on the real hardware what the source program does according to the Rust Abstract Machine.
155-
If it turns out the program *does* have undefined behavior, the contract is void, and the program produced by the compiler is essentially garbage (in particular, it is not bound by any specification; the program does not even have to be well-formed executable code).
156-
157-
In Rust, the [Nomicon](https://doc.rust-lang.org/nomicon/what-unsafe-does.html) and the [Reference](https://doc.rust-lang.org/reference/behavior-considered-undefined.html) both have a list of behavior that the language considers undefined.
158-
Rust promises that safe code cannot cause Undefined Behavior---the compiler and authors of unsafe code take the burden of this contract on themselves.
159-
For unsafe code, however, the burden is still on the programmer.
160-
161-
Also see: [Soundness][soundness].
162-
163-
### Soundness (of code / of a library)
164-
[soundness]: #soundness-of-code--of-a-library
165-
166-
*Soundness* is a type system concept (actually originating from the study of logics) and means that the type system is "correct" in the sense that well-typed programs actually have the desired properties.
167-
For Rust, this means well-typed programs cannot cause [Undefined Behavior][ub].
168-
This promise only extends to safe code however; for `unsafe` code, it is up to the programmer to uphold this contract.
169-
170-
Accordingly, we say that a library (or an individual function) is *sound* if it is impossible for safe code to cause Undefined Behavior using its public API.
171-
Conversely, the library/function is *unsound* if safe code *can* cause Undefined Behavior.
172-
173-
### Layout
174-
[layout]: #layout
175-
176-
The *layout* of a type defines its size and alignment as well as the offsets of its subobjects (e.g. fields of structs/unions/enums/... or elements of arrays).
177-
Moreover, the layout of a type records its *function call ABI* (or just *ABI* for short): how the type is passed *by value* across a function boundary.
178-
179-
Note: Originally, *layout* and *representation* were treated as synonyms, and Rust language features like the `#[repr]` attribute reflect this.
180-
In this document, *layout* and *representation* are not synonyms.
181-
182-
### Niche
183-
184-
The *niche* of a type determines invalid bit-patterns that will be used by layout optimizations.
185-
186-
For example, `&mut T` has at least one niche, the "all zeros" bit-pattern. This
187-
niche is used by layout optimizations like ["`enum` discriminant
188-
elision"](layout/enums.html#discriminant-elision-on-option-like-enums) to
189-
guarantee that `Option<&mut T>` has the same size as `&mut T`.
190-
191-
While all niches are invalid bit-patterns, not all invalid bit-patterns are
192-
niches. For example, "all bits uninitialized" is an invalid bit-pattern for
193-
`&mut T`, but this bit-pattern cannot be used by layout optimizations, and is not a
194-
niche.
195-
196-
### Zero-sized type / ZST
197-
198-
Types with zero size are called zero-sized types, which is abbreviated as "ZST".
199-
This document also uses the "1-ZST" abbreviation, which stands for "one-aligned
200-
zero-sized type", to refer to zero-sized types with an alignment requirement of 1.
201-
202-
For example, `()` is a "1-ZST" but `[u16; 0]` is not because it has an alignment
203-
requirement of 2.
204-
205-
### Padding
206-
[padding]: #padding
207-
208-
*Padding* (of a type `T`) refers to the space that the compiler leaves between fields of a struct or enum variant to satisfy alignment requirements, and before/after variants of a union or enum to make all variants equally sized.
209-
210-
Padding can be thought of as the type containing secret fields of type `[Pad; N]` for some hypothetical type `Pad` (of size 1) with the following properties:
211-
* `Pad` is valid for any byte, i.e., it has the same validity invariant as `MaybeUninit<u8>`.
212-
* Copying `Pad` ignores the source byte, and writes *any* value to the target byte. Or, equivalently (in terms of Abstract Machine behavior), copying `Pad` marks the target byte as uninitialized.
213-
214-
Note that padding is a property of the *type* and not the memory: reading from the padding of an `&Foo` (by casting to a byte reference) may produce initialized values if the `&Foo` is pointing to memory that was initialized (for example, if it was originally a byte buffer initialized to `0`), but the moment you perform a typed copy out of that reference you will have uninitialized padding bytes in the copy.
215-
216-
217-
We can also define padding in terms of the [representation relation]:
218-
A byte at index `i` is a padding byte for type `T` if,
219-
for all values `v` and lists of bytes `b` such that `v` and `b` are related at `T` (let's write this `Vrel_T(v, b)`),
220-
changing `b` at index `i` to any other byte yields a `b'` such that `v` and `b'` are related (`Vrel_T(v, b')`).
221-
In other words, the byte at index `i` is entirely ignored by `Vrel_T` (the value relation for `T`), and two lists of bytes that only differ in padding bytes relate to the same value(s), if any.
222-
223-
This definition works fine for product types (structs, tuples, arrays, ...).
224-
The desired notion of "padding byte" for enums and unions is still unclear.
225-
226-
### Place
227-
228-
A *place* (called "lvalue" in C and "glvalue" in C++) is the result of computing a [*place expression*][place-value-expr].
229-
A place is basically a pointer (pointing to some location in memory, potentially carrying [provenance](#pointer-provenance)), but might contain more information such as size or alignment (the details will have to be determined as the Rust Abstract Machine gets specified more precisely).
230-
A place has a type, indicating the type of [values](#value) that it stores.
231-
232-
The key operations on a place are:
233-
* Storing a [value](#value) of the same type in it (when it is used on the left-hand side of an assignment).
234-
* Loading a [value](#value) of the same type from it (through the place-to-value coercion).
235-
* Converting between a place (of type `T`) and a pointer value (of type `&T`, `&mut T`, `*const T` or `*mut T`) using the `&` and `*` operators.
236-
This is also the only way a place can be "stored": by converting it to a value first.
237-
238243
### Value
239244

240245
A *value* (called "value of the expression" or "rvalue" in C and "prvalue" in C++) is what gets stored in a [place](#place), and also the result of computing a [*value expression*][place-value-expr].
@@ -245,19 +250,14 @@ Values can be (according to their type) turned into a list of bytes, which is ca
245250
Values are ephemeral; they arise during the computation of an instruction but are only ever persisted in memory through their representation.
246251
(This is comparable to how run-time data in a program is ephemeral and is only ever persisted in serialized form.)
247252

248-
### Representation (relation)
249-
[representation relation]: #representation-relation
250-
251-
A *representation* of a [value](#value) is a list of bytes that is used to store or "represent" that value in memory.
252-
253-
We also sometimes speak of the *representation of a type*; this should more correctly be called the *representation relation* as it relates values of this type to lists of bytes that represent this value.
254-
The term "relation" here is used in the mathematical sense: the representation relation is a predicate that, given a value and a list of bytes, says whether this value is represented by that list of bytes (`val -> list byte -> Prop`).
253+
### Zero-sized type / ZST
255254

256-
The relation should be functional for a fixed list of bytes (i.e., every list of bytes has at most one associated value).
257-
It is partial in both directions: not all values have a representation (e.g. the mathematical integer `300` has no representation at type `u8`), and not all lists of bytes correspond to a value of a specific type (e.g. lists of the wrong size correspond to no value, and the list consisting of the single byte `0x10` corresponds to no value of type `bool`).
258-
For a fixed value, there can be many representations (e.g., when considering type `#[repr(C)] Pair(u8, u16)`, the second byte is a [padding byte][padding] so changing it does not affect the value represented by a list of bytes).
255+
Types with zero size are called zero-sized types, which is abbreviated as "ZST".
256+
This document also uses the "1-ZST" abbreviation, which stands for "one-aligned
257+
zero-sized type", to refer to zero-sized types with an alignment requirement of 1.
259258

260-
See the [value domain][value-domain] for an example of how values and representation relations can be made more precise.
259+
For example, `()` is a "1-ZST" but `[u16; 0]` is not because it has an alignment
260+
requirement of 2.
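
The two definitions can be checked with `size_of` and `align_of`. The helper names below are hypothetical:

```rust
use std::mem::{align_of, size_of};

/// True if `T` is a zero-sized type.
pub fn is_zst<T>() -> bool {
    size_of::<T>() == 0
}

/// True if `T` is a "1-ZST": zero-sized with an alignment requirement of 1.
pub fn is_one_zst<T>() -> bool {
    is_zst::<T>() && align_of::<T>() == 1
}

fn main() {
    println!("(): zst = {}, 1-zst = {}", is_zst::<()>(), is_one_zst::<()>());
    println!(
        "[u16; 0]: zst = {}, 1-zst = {}",
        is_zst::<[u16; 0]>(),
        is_one_zst::<[u16; 0]>()
    );
}
```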
261261

262262
[stacked-borrows]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md
263263
[value-domain]: https://github.com/rust-lang/unsafe-code-guidelines/tree/master/wip/value-domain.md
