Description
Proposal
Summary and problem statement
Correct C++/Rust interop involves extensive use of ManuallyDrop
to control Rust destruction semantics and match with C++, creating unpleasant API surfaces.
Motivation, use-cases, and solution sketches
I'm working on "high-fidelity" C++/Rust interop. In particular, if C++ can work with values, I would like Rust code to be able to do the same, so as to allow seamless migration and interop between the two languages without unexpected design or performance changes. This is possible, even simple, using a library similar to moveit
.
However, the semantics around object destruction complicate this, forcing pervasive use of ManuallyDrop
.
What's so bad about ManuallyDrop
?
It makes the structs difficult to work with. For example, here is normal Rust or C++ code:
// Rust
let mut foo = Foo { x };
foo.x = y;
// C++
auto foo = Foo { .x = x };
foo.x = y;
But with ManuallyDrop
, it becomes:
let mut foo = Foo {x: ManuallyDrop::new(x)};
*foo.x = y;
This ends up applying to every non-Copy
field in every C++ struct.
We would not accept that kind of user experience for native Rust types, and so, I believe, we should not accept it for FFI C++ types. Ideally, the user experience should be the same, even if there is some increased complexity behind the scenes when automatically generating C++ bindings (e.g. automatically generated Drop
implementations.) The more difficult it is to use from Rust, the harder it is to port existing C++ code to Rust, or justify writing new Rust code in an existing C++ codebase.
Destruction Order
C++, unlike Rust, has a fixed initialization and destruction order: fields are initialized in declaration order, and destroyed in the reverse order. This causes the lifetimes of each field to be properly nested: for a struct S {T1 x; T2 y;};
, the lifetime of y
is within the lifetime of x
. This allows self-referential structs to have fields that contain references to other fields above them, without any risk of undefined behavior.
In Rust, initialization order is not fixed (fields are initialized in the order they are specified in the expression that creates the struct), and so the destruction order can be anything. Rust chooses to destroy in declaration order, which means that for self-referential structs, you must actually place self-referential fields above the fields they refer to, and initialize in the reverse order.
For example, consider the following C++ struct:
struct S {
T1 x;
T2 y; // important that this is destroyed before x, for whatever reason (e.g. holds a reference to x).
};
The following equivalently laid out Rust struct would destroy in the incorrect order:
// WRONG
#[repr(C)]
struct S {
x: T1,
y: T2,
}
Correct destruction order, today
Today, the following Rust binding would work, because while S
is nontrivial, it does not have a user-defined destructor. (That's a common case: for example, 99.8% of structs in my (large) codebase do not have a user-defined destructor, due to the way the Google C++ Style Guide pushes type design.)
#[repr(C)]
struct S {
x: ManuallyDrop<T1>,
y: ManuallyDrop<T2>,
}
impl Drop for S {
fn drop(&mut self) {
unsafe {
y.drop();
x.drop();
}
}
}
For as long as this type is present in both C++ and Rust, one of the two languages will need to override field destruction order using the language's equivalent of Drop
, where every non-trivially-destructible (C++) or non-Copy (Rust) field is wrapped in that language's equivalent of ManuallyDrop
. This presents an ongoing API friction for users.
Note: there is no other way to work around this, today. Any C++/Rust binding which works by calling destructors of the C++ subobjects individually must manually drop in order to drop in the correct order.
Invoking foreign destructors
In both C++ and Rust, destruction occurs in two parts:
- The user-defined destructor (
Drop
in Rust) - The sub-object destruction ("drop glue" in Rust)
The only customization hook is #1: you can define Drop
in Rust or the destructor in C++, and this is run in addition to standard destruction logic of #2.
But when destroying an object, you don't get to pick and choose: in both C++ and Rust, you always destroy the entire object, including all subobjects/fields, rather than being allowed to only run part of the destruction logic and leave the rest for later. (This is true both for standard C++, as well as for specific C++ ABIs like the Itanium ABI.)
Therein lies the problem:
struct Foo {
std::string string;
~Foo() {std::cout << "Hello, world."; }
};
If we tried to create an equivalent Rust struct which invokes the destructor via FFI, we'd be stuck:
#[repr(C)]
struct Foo {
string: CxxString,
}
impl Drop for Foo {
fn drop(&mut self) {
// what do we do here?
}
}
We can't fill in Drop::drop
correctly: there is no way to call the destructor that doesn't also result in destroying the string field, which would be a double-destroy, because that field will also be destroyed in Rust by the drop glue.
Correct foreign destruction, today
The only way around this is to wrap every non-Copy
field in a ManuallyDrop
, so that it is not destroyed by the implicit drop glue:
#[repr(C)]
struct Foo {
string: ManuallyDrop<CxxString>,
}
impl Drop for Foo {
fn drop(&mut self) {
// call C++ std::destroy_at(self) via FFI
ffi_destroy_Foo(self);
}
}
Note: there is no other way to work around this, today. Any C++/Rust binding which allows calling the C++ destructor must manually (not) drop, to avoid double-destruction.
Alternatives
Suppress the drop glue via an attribute
We could add an attribute, such as #[manually_drop]
(or #[manually_drop_fields]
), which suppresses the drop glue from dropping the whole struct: the fields will not be automatically dropped, but must be dropped by hand using e.g. std::ptr::drop_in_place
. The field is still dropped when directly assigned to, unlike ManuallyDrop
-- from a user's point of view, it looks like any other field, except that it isn't dropped automatically in the drop glue. (Since there's no sensible way to handle it, we should either forbid partial destructuring, or forbid destructuring entirely.)
This is the most general proposal, which solves all the issues above.
In this case, ManuallyDrop
would just be another type, defined as:
#[manually_drop]
struct ManuallyDrop<T>{value: T} // no Drop impl
To have C++-like field destruction order, one can mark the struct #[manually_drop]
, and manually drop each field in a Drop
impl:
#[manually_drop]
#[repr(C)]
struct Foo {
x: T1,
y: T2,
}
impl Drop for Foo {
fn drop(&mut self) {
unsafe {
std::mem::drop(&mut self.y);
std::mem::drop(&mut self.x);
}
}
}
And one can invoke a foreign destructor without double-dropping, by suppressing the drop glue with #[manually_drop]
:
#[manually_drop]
#[repr(C)]
struct Foo {
string: CxxString,
}
impl Drop for Foo {
fn drop(&mut self) {
// call C++ std::destroy_at(self) via FFI
ffi_destroy_Foo(self);
}
}
(This could also improve union ergonomics: we could permit #[manually drop] union U { x: T }
in place of union U { x: ManuallyDrop<T> }
. These features would also presumably interact in the obvious way: Unions would allow, in addition to Copy
types, any #[manually_drop]
type which has no Drop
impl. (Including, of course, ManuallyDrop
itself.))
Reverse field drop order via an attribute
For codebases following the Google C++ Style Guide, the normal case for a struct (a class with public fields) is that it does not have a user-defined destructor. Therefore, it would suffice for the majority of types whose fields are accessible from Rust to simply adopt C++ destruction order, via an attribute that specifies C++ destruction order:
#[drop_order(from_end)]
#[repr(C)]
struct S {
x: T1,
y: T2,
}
// no Drop impl needed
(I do not propose making this the default, now or ever -- it would only be useful for repr(C) structs, and only those which are representing C++ types.)
The structure of the attribute allows for expansion: presumably from_end
and from_start
are present by default. Other drop orders could be:
dont_care
oroptimal
or similar: choose the drop order based on what would be most efficient (e.g. drop in the order of the fields as they are laid out in memory).do_not_drop
ornone
: the same thing as#[manually_drop]
above.explicit
(contentious): one could e.g. write#[drop_order(explicit)] struct S { #[drop_order(1)] a: T1, #[drop_order(0)] b: T2}
.
Links and related work
moveit
- pre-1.0 issue: "should struct fields and array elements be dropped in reverse declaration order (a la C++)"
- RFC which rejected said issue: 1857-stabilize-drop-order
Initial people involved
Changelog
2022-01-06 (major)
- Changed problem statement to just be about the large amount of
ManuallyDrop
in C++ interop generally. - Changed from
DropGlue
trait to#[manually_drop]
attribute -- much simpler! Also made this the preferred alternativel - Added section on calling the C++ destructor.
- Added section on pinned unions.
- Added another helpful alternative: mark
ManuallyDrop
as structurally pinned. (This solves the pinned unions issue only.)
2022-01-10 (minor)
- Added field assignment examples to "What's so bad about ManuallyDrop?"
#[manually_drop]
: assignment still drops the field, should forbid (partial) destructuring.
2022-01-11
- Fixed incorrect
ManuallyDrop
assignment example - Added links to earlier discussions of drop order.
2022-01-20
- Removed discussion of
ManuallyDrop
structural pin projection, unions, etc., as they are solvable without language changes.
Summary of discussion in Zulip
<=2022-01-06
- Aaron Hill: "I think it would be useful to make the attribute name be less ambiguous for someone who doesn't know the 'standard' drop order. Maybe something like
#[drop_fields_reversed]
or#[drop_from_end]
?" - nagisa: Proposal insufficiently justified, doesn't explain why
ManuallyDrop
, which is designed for this, isn't sufficient. - Devin Jeanpierre (it me):
ManuallyDrop
docs say not to use it for this, but to use implicit drop order. Also presents an API surface we wouldn't consider acceptable in non-FFI code. - Nick Cameron: "I'd rather see a more general version, something like a
manually_drop
attribute which means that all
fields are implicitly manually dropped and then it is the responsibility of the author to drop them in the destructor." - Jacob Lifshay: "this would also allow converting
ManuallyDrop
to a pure library type, no lang item needed" - Devin Jeanpierre: That would also solve issues where
ManuallyDrop
doesn't guarantee it's structurally pinned -- users could write their ownManuallyDrop
type which is, and it would be usable inunion
. - Daniel Henry-Mantilla: Ideally would be able to specify e.g.
fn drop (bar: Pin<&move Bar>, baz: Pin<&move Baz>)
, to avoid all the unsafe code that comes with#[manually_drop]
<= 2022-01-10
- Jake Goulding: not clear on why a proc macro isn't sufficient. Also, does this forbid destructuring?
- tm: Do fields still get implicitly dropped on assignment? What about with incomplete destructuring?
- bjorn3: Not very useful without a
Drop
impl, which suppresses destructuring, so the destructuring question isn't critical. - Connor Herman: forbid assignment, defer question to later.
- Devin Jeanpierre: making e.g. assignment ergonomic is the whole point, let's allow it and make it implicitly drop.
#[manually_drop]
only suppresses drop glue, not assignment etc. (Also, should probably suppress destructuring, because that would be unsafe.)
- bjorn3: Not very useful without a
<= 2022-01-11
- Example is wrong; don't need to manually drop on assignment with ManuallyDrop, can use DerefMut instead.
- scottmcm: at least two drop orders: forward backward. Could be better to use
drop_order(backward)
(ordrop_order(from_end)
) rather thandrop_from_end
. Needs a third order to justify -- perhaps "do_not_drop_fields
", which is what#[manually_drop]
is? (Note: "dont_care
" is suggested later.) - Jake Goulding: also potentially manually list out order -- e.g.
#[drop_order(explicit)] struct S { #[drop_order(1)] a: T1, #[drop_order(0)] b: T2}
<= 2022-01-17
- Nick Cameron: Prefer that if you want that level of customization, you do it entirely by hand, not with attributes. Also lets you set drop order dynamically.
- Some side discussion on C++ trivial ABI vs non-trivial-ABI types. See also https://reviews.llvm.org/D114732.
- Jake Goulding: "My problem is that implementing Drop also means that I no longer can destructure. That’s the angle I’d love to see addressed." (TODO: address in this proposal?)
- cuviper: In addition to forward/backward could have
drop_order(dont_care)
which drops in the most efficient order. - Devin Jeanpierre: from_end/from_start rather than forward/backward?
<= 2022-01-20
- Offline discussion (ACL'd link) with Alice Ryhl.
ManuallyDrop
cannot be (safely) structurally pinned due to the drop guarantee -- you'd need to guarantee the pinned interior is not forgotten.- Regardless of choice of structural pinning, if the
ManuallyDrop
is private, then you can pin-project it manually and uphold drop guarantee and other pin invariants. So e.g. to create a binding for a union, you would instead expose a newtype wrapper around a union, with accessor methods that do not directly exposePin<ManuallyDrop<T>>
for the fields. - Conclusion overall: this is not an unsolvable problem / doesn't need rust changes. (And so I've removed discussion of this from the proposal.)