Add documentation on trait objects. #22106
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,286 @@ | ||
% Static and Dynamic Dispatch | ||
|
||
When code involves polymorphism, there needs to be a mechanism to determine | ||
which specific version is actually run. This is called 'dispatch.' There are | ||
two major forms of dispatch: static dispatch and dynamic dispatch. While Rust | ||
favors static dispatch, it also supports dynamic dispatch through a mechanism | ||
called 'trait objects.' | ||
|
||
## Background | ||
|
||
For the rest of this chapter, we'll need a trait and some implementations. | ||
Let's make a simple one, `Foo`. It has one method that is expected to return a | ||
`String`. | ||
|
||
```rust | ||
trait Foo { | ||
    fn method(&self) -> String; | ||
} | ||
``` | ||
|
||
We'll also implement this trait for `u8` and `String`: | ||
|
||
```rust | ||
# trait Foo { fn method(&self) -> String; } | ||
impl Foo for u8 { | ||
    fn method(&self) -> String { format!("u8: {}", *self) } | ||
} | ||
|
||
impl Foo for String { | ||
    fn method(&self) -> String { format!("string: {}", *self) } | ||
} | ||
``` | ||
|
||
|
||
## Static dispatch | ||
|
||
We can use this trait to perform static dispatch with trait bounds: | ||
|
||
```rust | ||
# trait Foo { fn method(&self) -> String; } | ||
# impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } } | ||
# impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } } | ||
fn do_something<T: Foo>(x: T) { | ||
    x.method(); | ||
} | ||
|
||
fn main() { | ||
    let x = 5u8; | ||
    let y = "Hello".to_string(); | ||
|
||
    do_something(x); | ||
    do_something(y); | ||
} | ||
``` | ||
|
||
Rust uses 'monomorphization' to perform static dispatch here. This means that | ||
Rust will create a special version of `do_something()` for both `u8` and | ||
`String`, and then replace the call sites with calls to these specialized | ||
functions. In other words, Rust generates something like this: | ||
|
||
```rust | ||
# trait Foo { fn method(&self) -> String; } | ||
# impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } } | ||
# impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } } | ||
fn do_something_u8(x: u8) { | ||
    x.method(); | ||
} | ||
|
||
fn do_something_string(x: String) { | ||
    x.method(); | ||
} | ||
|
||
fn main() { | ||
    let x = 5u8; | ||
    let y = "Hello".to_string(); | ||
|
||
    do_something_u8(x); | ||
    do_something_string(y); | ||
} | ||
``` | ||
|
||
This has an upside: every method call is statically dispatched, which allows | ||
inlining and hence usually higher performance. It also has a downside: code | ||
bloat, since the binary contains a separate copy of the same function for each | ||
type it is used with. | ||
|
||
Furthermore, compilers aren’t perfect and may “optimise” code to become slower. | ||
For example, functions inlined too eagerly will bloat the instruction cache | ||
(cache rules everything around us). This is part of the reason that `#[inline]` | ||
and `#[inline(always)]` should be used carefully, and one reason why using a | ||
dynamic dispatch is sometimes more efficient. | ||
|
||
However, the common case is that it is more efficient to use static dispatch, | ||
and one can always have a thin statically-dispatched wrapper function that does | ||
a dynamic dispatch, but not vice versa, meaning static calls are more flexible. | ||
The standard library tries to be statically dispatched where possible for this | ||
reason. | ||
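
The "thin wrapper" idea can be sketched roughly as follows. This is only an illustration, not something taken from the standard library: the names `describe` and `describe_dyn` are hypothetical, and the trait object is spelled `&dyn Foo`, which current compilers require in place of the `&Foo` spelling used in this chapter.

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Dynamically dispatched (hypothetical name): a single copy is compiled.
// `&dyn Foo` is the current spelling of the chapter's `&Foo`.
fn describe_dyn(x: &dyn Foo) -> String {
    x.method()
}

// Statically dispatched wrapper (hypothetical name): monomorphized per `T`,
// but each copy only coerces `&T` to a trait object and forwards the call.
fn describe<T: Foo>(x: T) -> String {
    describe_dyn(&x)
}

fn main() {
    println!("{}", describe(5u8));
    println!("{}", describe("Hello".to_string()));
}
```

Going the other way is the hard direction: once a function only holds a trait object, the concrete type behind it is no longer known at compile time, so there is nothing to monomorphize over.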
|
||
## Dynamic dispatch | ||
|
||
Rust provides dynamic dispatch through a feature called 'trait objects.' Trait | ||
objects, like `&Foo` or `Box<Foo>`, are normal values that store a value of | ||
*any* type that implements the given trait, where the precise type can only be | ||
known at runtime. The methods of the trait can be called on a trait object via | ||
a special record of function pointers (created and managed by the compiler). | ||
|
||
A function that takes a trait object is not specialised to each of the types | ||
that implements `Foo`: only one copy is generated, often (but not always) | ||
resulting in less code bloat. However, this comes at the cost of requiring | ||
slower virtual function calls, and effectively inhibiting any chance of | ||
inlining and related optimisations from occurring. | ||
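
As a small sketch of such a function, here is the dynamic counterpart of the earlier `do_something`, assuming the same `Foo` trait and implementations. The `&dyn Foo` spelling is what current compilers expect where this chapter writes `&Foo`, and returning the `String` is a choice made here so the result can be printed.

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Only one copy of this function is generated. Which `method` runs is
// decided at runtime from the trait object that is passed in.
// (`&dyn Foo` is the current spelling of `&Foo`.)
fn do_something(x: &dyn Foo) -> String {
    x.method()
}

fn main() {
    let x = 5u8;
    let y = "Hello".to_string();

    println!("{}", do_something(&x));
    println!("{}", do_something(&y));
}
```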
|
||
Trait objects are both simple and complicated: their core representation and | ||
layout are quite straightforward, but there are some tricky error messages and | ||
surprising behaviours to discover. | ||
|
||
### Obtaining a trait object | ||
|
||
There are two similar ways to get a trait object value: casts and coercions. If | ||
`T` is a type that implements a trait `Foo` (e.g. `u8` for the `Foo` above), | ||
then the two ways to get a `Foo` trait object out of a pointer to `T` look | ||
like: | ||
|
||
```{rust,ignore} | ||
let ref_to_t: &T = ...; | ||
|
||
// `as` keyword for casting | ||
let cast = ref_to_t as &Foo; | ||
|
||
// using a `&T` in a place that has a known type of `&Foo` will implicitly coerce: | ||
let coerce: &Foo = ref_to_t; | ||
|
||
fn also_coerce(_unused: &Foo) {} | ||
also_coerce(ref_to_t); | ||
``` | ||
|
||
These trait object coercions and casts also work for pointers like `&mut T` to | ||
`&mut Foo` and `Box<T>` to `Box<Foo>`, but that's all at the moment. Coercions | ||
and casts are identical. | ||
|
||
This operation can be seen as "erasing" the compiler's knowledge about the | ||
specific type of the pointer, and hence trait objects are sometimes referred to | ||
as "type erasure". | ||
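
A runnable version of these casts and coercions might look like the sketch below, taking `T` to be `u8`. The helper `demo` is a hypothetical name introduced only to collect the results, and `dyn Foo` is the spelling current compilers require for the trait object types written as `Foo` above.

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

fn also_coerce(x: &dyn Foo) -> String {
    x.method()
}

// Hypothetical helper: builds trait objects in each of the ways described
// above and calls `method` on every one of them.
fn demo() -> Vec<String> {
    let value: u8 = 7;
    let ref_to_t: &u8 = &value;

    // `as` keyword for casting
    let cast = ref_to_t as &dyn Foo;

    // implicit coercion at a place with a known type
    let coerce: &dyn Foo = ref_to_t;

    // `Box<T>` to `Box<dyn Foo>`
    let boxed: Box<dyn Foo> = Box::new(value);

    // `&mut T` to `&mut dyn Foo`
    let mut other: u8 = 9;
    let mutable: &mut dyn Foo = &mut other;

    vec![
        cast.method(),
        coerce.method(),
        also_coerce(ref_to_t),
        boxed.method(),
        mutable.method(),
    ]
}

fn main() {
    for line in demo() {
        println!("{}", line);
    }
}
```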
|
||
### Representation | ||
|
||
Let's start simple, with the runtime representation of a trait object. The | ||
`std::raw` module contains structs with layouts that are the same as the | ||
complicated built-in types, [including trait objects][stdraw]: | ||
|
||
```rust | ||
# mod foo { | ||
pub struct TraitObject { | ||
    pub data: *mut (), | ||
    pub vtable: *mut (), | ||
} | ||
# } | ||
``` | ||
|
||
[stdraw]: ../std/raw/struct.TraitObject.html | ||
|
||
That is, a trait object like `&Foo` consists of a "data" pointer and a "vtable" | ||
pointer. | ||
|
||
The data pointer addresses the data (of some unknown type `T`) that the trait | ||
object is storing, and the vtable pointer points to the vtable ("virtual method | ||
table") corresponding to the implementation of `Foo` for `T`. | ||
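
One way to observe the two-pointer layout is to compare sizes. The sketch below assumes a current compiler (hence `dyn Foo` rather than `Foo`) and checks only the overall size of the pointer, not the order of its fields.

```rust
use std::mem::size_of;

trait Foo {
    fn method(&self) -> String;
}

fn main() {
    // A reference to an ordinary sized type is a single pointer.
    assert_eq!(size_of::<&u8>(), size_of::<usize>());

    // A trait object carries a data pointer and a vtable pointer,
    // so it is twice as wide.
    assert_eq!(size_of::<&dyn Foo>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<dyn Foo>>(), 2 * size_of::<usize>());
}
```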
|
||
|
||
A vtable is essentially a struct of function pointers, pointing to the concrete | ||
piece of machine code for each method in the implementation. A method call like | ||
`trait_object.method()` will retrieve the correct pointer out of the vtable and | ||
then do a dynamic call of it. For example: | ||
|
||
```{rust,ignore} | ||
struct FooVtable { | ||
    destructor: fn(*mut ()), | ||
    size: usize, | ||
    align: usize, | ||
    method: fn(*const ()) -> String, | ||
} | ||
|
||
// u8: | ||
|
||
fn call_method_on_u8(x: *const ()) -> String { | ||
    // the compiler guarantees that this function is only called | ||
    // with `x` pointing to a u8 | ||
    let byte: &u8 = unsafe { &*(x as *const u8) }; | ||
|
||
    byte.method() | ||
} | ||
|
||
static Foo_for_u8_vtable: FooVtable = FooVtable { | ||
    destructor: /* compiler magic */, | ||
    size: 1, | ||
    align: 1, | ||
|
||
    // cast to a function pointer | ||
    method: call_method_on_u8 as fn(*const ()) -> String, | ||
}; | ||
|
||
|
||
// String: | ||
|
||
fn call_method_on_String(x: *const ()) -> String { | ||
    // the compiler guarantees that this function is only called | ||
    // with `x` pointing to a String | ||
    let string: &String = unsafe { &*(x as *const String) }; | ||
|
||
    string.method() | ||
} | ||
|
||
static Foo_for_String_vtable: FooVtable = FooVtable { | ||
    destructor: /* compiler magic */, | ||
    // values for a 64-bit computer, halve them for 32-bit ones | ||
    size: 24, | ||
    align: 8, | ||
|
||
    method: call_method_on_String as fn(*const ()) -> String, | ||
}; | ||
``` | ||
|
||
The `destructor` field in each vtable points to a function that will clean up | ||
any resources of the vtable's type: for `u8` it is trivial, but for `String` it | ||
will free the memory. This is necessary for owning trait objects like | ||
`Box<Foo>`, which need to clean up both the `Box` allocation and the internal | ||
type when they go out of scope. The `size` and `align` fields store the size of | ||
the erased type and its alignment requirements; these are essentially unused at | ||
the moment since the information is embedded in the destructor, but will be | ||
used in future, as trait objects are progressively made more flexible. | ||
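
In current Rust, the size and alignment of the erased type can be queried through a trait object with `std::mem::size_of_val` and `std::mem::align_of_val`. The sketch below assumes a current compiler (so `dyn Foo` rather than `Foo`); `layout_of` is a hypothetical helper, and the comparison against `size_of::<T>()` avoids hard-coding platform-specific numbers.

```rust
use std::mem::{align_of, align_of_val, size_of, size_of_val};

trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Hypothetical helper: the concrete type is erased here, so the size and
// alignment cannot come from the static type and must be found at runtime.
fn layout_of(x: &dyn Foo) -> (usize, usize) {
    (size_of_val(x), align_of_val(x))
}

fn main() {
    let s = "foo".to_string();

    assert_eq!(layout_of(&1u8), (size_of::<u8>(), align_of::<u8>()));
    assert_eq!(layout_of(&s), (size_of::<String>(), align_of::<String>()));
}
```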
|
||
Suppose we've got some values that implement `Foo`. The explicit form of | ||
construction and use of `Foo` trait objects might look a bit like this | ||
(ignoring the type mismatches: they're all just pointers anyway): | ||
|
||
```{rust,ignore} | ||
let a: String = "foo".to_string(); | ||
let x: u8 = 1; | ||
|
||
// let b: &Foo = &a; | ||
let b = TraitObject { | ||
    // store the data | ||
    data: &a, | ||
    // store the methods | ||
    vtable: &Foo_for_String_vtable | ||
}; | ||
|
||
// let y: &Foo = &x; | ||
let y = TraitObject { | ||
    // store the data | ||
    data: &x, | ||
    // store the methods | ||
    vtable: &Foo_for_u8_vtable | ||
}; | ||
|
||
// b.method(); | ||
(b.vtable.method)(b.data); | ||
|
||
// y.method(); | ||
(y.vtable.method)(y.data); | ||
``` | ||
|
||
If `b` or `y` were owning trait objects (`Box<Foo>`), there would be a | ||
`(b.vtable.destructor)(b.data)` (respectively `y`) call when they went out of | ||
scope. | ||
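
The destructor call on an owning trait object can be observed with a type that records when it is dropped. This is a minimal sketch: `Noisy` and `drop_through_trait_object` are hypothetical names introduced for the illustration, and `Box<dyn Foo>` is the current spelling of `Box<Foo>`.

```rust
use std::cell::Cell;
use std::rc::Rc;

trait Foo {
    fn method(&self) -> String;
}

// Hypothetical type that sets a shared flag when its destructor runs.
struct Noisy {
    dropped: Rc<Cell<bool>>,
}

impl Foo for Noisy {
    fn method(&self) -> String { "noisy".to_string() }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns the flag's value before and after the trait object goes out of scope.
fn drop_through_trait_object() -> (bool, bool) {
    let flag = Rc::new(Cell::new(false));
    let before;
    {
        // The concrete type is erased, yet the right destructor still runs.
        let b: Box<dyn Foo> = Box::new(Noisy { dropped: flag.clone() });
        b.method();
        before = flag.get();
    } // `b` goes out of scope here
    (before, flag.get())
}

fn main() {
    assert_eq!(drop_through_trait_object(), (false, true));
}
```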
|
||
### Why pointers? | ||
|
||
The use of language like "fat pointer" implies that a trait object is | ||
always a pointer of some form, but why? | ||
|
||
Rust does not put things behind a pointer by default, unlike many managed | ||
languages, so types can have different sizes. Knowing the size of the value at | ||
compile time is important for things like passing it as an argument to a | ||
function, moving it about on the stack and allocating (and deallocating) space | ||
on the heap to store it. | ||
|
||
For `Foo`, we would need to have a value that could be at least either a | ||
`String` (24 bytes) or a `u8` (1 byte), as well as any other type for which | ||
dependent crates may implement `Foo` (any number of bytes at all). There's no | ||
way to guarantee that this last point can work if the values are stored without | ||
a pointer, because those other types can be arbitrarily large. | ||
|
||
Putting the value behind a pointer means the size of the value is not relevant | ||
when we are tossing a trait object around, only the size of the pointer itself. |
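
The point about pointer size can be sketched with a vector that holds values of different types behind identical pointers. The helper `stored_sizes` is a hypothetical name, and `Box<dyn Foo>` is the spelling current compilers require for `Box<Foo>`.

```rust
use std::mem::{size_of, size_of_val};

trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Hypothetical helper: the size of each value *behind* the pointer.
fn stored_sizes(items: &[Box<dyn Foo>]) -> Vec<usize> {
    items.iter().map(|item| size_of_val(&**item)).collect()
}

fn main() {
    // Every element occupies the same space in the vector (one fat pointer),
    // even though the values they point to have different sizes.
    let mut items: Vec<Box<dyn Foo>> = Vec::new();
    items.push(Box::new(1u8));
    items.push(Box::new("foo".to_string()));

    assert_eq!(stored_sizes(&items), vec![size_of::<u8>(), size_of::<String>()]);

    for item in &items {
        println!("{} (pointer is {} bytes)", item.method(), size_of_val(item));
    }
}
```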
s/does a dynamic/does a dynamic call/?