Description
Experiments were done on https://godbolt.org/ with the nightly toolchain and the flags -C opt-level=3 --edition=2021. Also reproducible with rustc 1.66. Results seem to be the same on both x86_64 and aarch64.
- rustc 1.66:
rustc 1.66.0 (69f9c33d7 2022-12-12)
binary: rustc
commit-hash: 69f9c33d71c871fc16ac445211281c6e7a340943
commit-date: 2022-12-12
host: x86_64-unknown-linux-gnu
release: 1.66.0
LLVM version: 15.0.2
Compiler returned: 0
- rustc nightly:
rustc 1.68.0-nightly (0b90256ad 2023-01-13)
binary: rustc
commit-hash: 0b90256ada21c6a81b4c18f2c7a23151ab5fc232
commit-date: 2023-01-13
host: x86_64-unknown-linux-gnu
release: 1.68.0-nightly
LLVM version: 15.0.6
Compiler returned: 0
The expected generated code is something like:
example::foo:
        mov     eax, 94
        ret
In this code, rustc/LLVM fails to see through the 2 levels of match required to compute the return value of 94 at compile time when MyEnum has a MyEnum::B variant containing a Box:
    pub enum MyEnum {
        A(u32),
        // Uncommenting this variant will wreck the generated code
        // B(Box<MyEnum>)
    }

    pub fn foo() -> u32 {
        let x = MyEnum::A(87);
        let y = {
            let y1 = MyEnum::A(4);
            let y2 = MyEnum::A(3);
            match (y1, y2) {
                (MyEnum::A(x), MyEnum::A(y)) => MyEnum::A(x + y),
                _ => panic!()
            }
        };
        let y = MyEnum::A(7);
        match (x, y) {
            (MyEnum::A(x), MyEnum::A(y)) => x + y,
            _ => panic!(),
        }
    }
This does not break if MyEnum::B(Box<u32>) is used, but it breaks again for Box<MyEnum>, or for Box<S> with S being a newtype struct:
    pub struct S(u32);

    impl Drop for S {
        #[inline(never)]
        fn drop(&mut self) {
            println!("hello");
        }
    }
So it looks like the optimization breaks whenever there is a Box<T> variant where T has a non-inlinable drop (either because it has side effects or because it is recursive).
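For convenience, the Box<S> case described above can be put together as a single snippet to paste into Compiler Explorer. This is only a sketch: the variant declaration B(Box<S>) and the function name foo_boxed_s are assumptions made for illustration, and the shadowing let y = MyEnum::A(7); is left out so that the outer match consumes the value produced by the inner one.

```rust
pub struct S(u32);

impl Drop for S {
    // Keep the destructor out of line so that it stays opaque to the optimizer.
    #[inline(never)]
    fn drop(&mut self) {
        println!("hello");
    }
}

pub enum MyEnum {
    A(u32),
    // Assumed shape of the variant: a Box of a type with a non-inlined Drop impl.
    B(Box<S>),
}

// Hypothetical name; the body follows the same two-level match as `foo`.
pub fn foo_boxed_s() -> u32 {
    let x = MyEnum::A(87);
    let y = {
        let y1 = MyEnum::A(4);
        let y2 = MyEnum::A(3);
        match (y1, y2) {
            (MyEnum::A(x), MyEnum::A(y)) => MyEnum::A(x + y),
            _ => panic!(),
        }
    };
    match (x, y) {
        (MyEnum::A(x), MyEnum::A(y)) => x + y,
        _ => panic!(),
    }
}
```

The function always returns 94 at run time; the question raised in this report is only whether the generated code reduces to that constant.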
Even worse, using an alias in foo such as type MyEnum2 = MyEnum; makes the optimization break for MyEnum::B(Box<u32>) as well, which is a regression compared to the non-aliased use, i.e.:
    pub enum MyEnum {
        A(u32),
        // A Box of a !Drop type optimized fine in the previous example, but breaks
        // here when `type MyEnum2 = MyEnum;` is used in `foo()`
        B(Box<u32>)
    }

    pub fn foo() -> u32 {
        // Rewrite the function to use `MyEnum` instead of the alias and the generated
        // code is good. Use the alias as in this example and the optimization breaks.
        type MyEnum2 = MyEnum;
        let x = MyEnum2::A(87);
        let y = {
            let y1 = MyEnum2::A(4);
            let y2 = MyEnum2::A(3);
            match (y1, y2) {
                (MyEnum2::A(x), MyEnum2::A(y)) => MyEnum2::A(x + y),
                _ => panic!()
            }
        };
        let y = MyEnum2::A(7);
        match (x, y) {
            (MyEnum2::A(x), MyEnum2::A(y)) => x + y,
            _ => panic!(),
        }
    }
Note that in all these examples, rustc/LLVM is able to "see through" the leaf level of match, i.e. it will optimize y1 + y2 into MyEnum::A(7), but then it won't do the same for the outer layer x + y.
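To compare the two levels side by side, a sketch like the following can be compiled with the same flags. The function names leaf_only and two_levels are hypothetical and not part of the original reproduction; the enum uses the Box<MyEnum> variant from the first example. Based on the observations above, leaf_only would be expected to fold to a constant while two_levels would not, but that has to be checked in the assembly output.

```rust
pub enum MyEnum {
    A(u32),
    // Recursive variant, as in the first example of this report.
    B(Box<MyEnum>),
}

// Hypothetical helper: only the leaf-level match.
pub fn leaf_only() -> u32 {
    let y1 = MyEnum::A(4);
    let y2 = MyEnum::A(3);
    match (y1, y2) {
        (MyEnum::A(x), MyEnum::A(y)) => x + y,
        _ => panic!(),
    }
}

// Hypothetical helper: the leaf-level match feeding an outer match.
pub fn two_levels() -> u32 {
    let x = MyEnum::A(87);
    let y = {
        let y1 = MyEnum::A(4);
        let y2 = MyEnum::A(3);
        match (y1, y2) {
            (MyEnum::A(a), MyEnum::A(b)) => MyEnum::A(a + b),
            _ => panic!(),
        }
    };
    match (x, y) {
        (MyEnum::A(a), MyEnum::A(b)) => a + b,
        _ => panic!(),
    }
}
```

Both functions compute fixed values at run time (7 and 94 respectively), so any difference between them in the generated code comes from the optimizer alone.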