Compiler performance when compiling built-in derives is worse than desired #80118

This is a follow-up to #80050, which was closed because the implementation was not 100% correct.

Currently, adding `#[derive]` annotations for built-in traits such as `PartialOrd` or `Debug` causes slower compile times than if the user implements these traits by hand. We should aim to make these derives as close to "free" as possible in terms of the time it takes to compile the code.

## The Current State 

We currently track the performance of derives in the [rustc-perf "derives" benchmark](https://github.com/rust-lang/rustc-perf/blob/e095f5021bf01cf3800f50b3a9f14a9683eb3e4e/collector/benchmarks/derive/src/lib.rs). 

`PartialOrd` is by far the most expensive of the traits in std to derive. The derive benchmark with `#[derive(PartialEq, PartialOrd)]` takes 9.1s on my machine, while just `#[derive(PartialEq)]` takes 1.9s.
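For reference, a hand-written counterpart to `#[derive(PartialEq, PartialOrd)]` on a single-field struct might look roughly like the sketch below. The struct name `Sample` is hypothetical, and this is not the code the derive macro expands to; it is only one plausible manual implementation to compare against.

```rust
use std::cmp::Ordering;

// Hypothetical single-field struct, used only for illustration.
pub struct Sample {
    pub field: i32,
}

// Manual equality: two values are equal when their fields are equal.
impl PartialEq for Sample {
    fn eq(&self, other: &Self) -> bool {
        self.field == other.field
    }
}

// Manual ordering: delegate to the field's own `partial_cmp`.
// The comparison operators (<, <=, >, >=) use the trait's default methods.
impl PartialOrd for Sample {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        self.field.partial_cmp(&other.field)
    }
}
```

Only `partial_cmp` is written out here; the remaining comparison methods fall back to the defaults provided by the `PartialOrd` trait.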

The following is a short experiment: we create 10,000 structs with one field each and benchmark what happens when the structs have various derives on them. The code is generated using the following Ruby script:
```ruby
File.open('src/lib.rs', 'w') do |file|
  # Emit 10,000 single-field structs: MyType0 through MyType9999.
  10_000.times do |n|
    file.write("pub struct MyType#{n} { pub field: i32 }\n")
  end
end
```
* Base (i.e., no derives): 0.61s
* `#[derive(Debug)]`: 13.33s
* `#[derive(PartialEq)]`: 9.78s
* `#[derive(PartialEq, PartialOrd)]`: **47.46s**
* `#[derive(Clone)]`: 6.32s
* `#[derive(Clone, Copy)]`: 5.32s
* `#[derive(PartialEq, Eq)]`: 14.13s
* `#[derive(Default)]`: 4.33s
* `#[derive(Hash)]`: 6.52s
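The script above only emits the base case without derives. As a minimal sketch of how the derive variants could be generated, here is a Rust equivalent that takes the list of derives as a parameter. The function name `generate_source` and its signature are assumptions for illustration, not part of the original experiment.

```rust
use std::fmt::Write;

/// Builds the source text for `count` single-field structs named
/// `MyType0`, `MyType1`, and so on. Each struct is preceded by a
/// `#[derive(...)]` attribute listing `derives`; if `derives` is empty,
/// no attribute is emitted (the "base" case).
pub fn generate_source(count: usize, derives: &[&str]) -> String {
    let mut out = String::new();
    for n in 0..count {
        if !derives.is_empty() {
            // Writing to a String cannot fail, so unwrap is safe here.
            writeln!(out, "#[derive({})]", derives.join(", ")).unwrap();
        }
        writeln!(out, "pub struct MyType{} {{ pub field: i32 }}", n).unwrap();
    }
    out
}
```

Under these assumptions, a call such as `std::fs::write("src/lib.rs", generate_source(10_000, &["PartialEq", "PartialOrd"]))` would produce the input for the `PartialEq, PartialOrd` measurement.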

Note that none of the overhead appears to come from the code expansion itself. When compiling the traits with the code copy/pasted from the output of `cargo expand`, the performance is comparable to deriving the traits.

However, there is definitely some room for improvement beyond "making compilation faster in general will make derives faster". I manually implemented `Debug` and ran the test again, and it compiled 10% faster.
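The manual `Debug` implementation used for that measurement is not shown in this issue. As an assumption, a hand-written implementation for one of the generated structs could look like the following sketch, which uses the standard `Formatter::debug_struct` builder:

```rust
use std::fmt;

pub struct MyType0 {
    pub field: i32,
}

// Hand-written Debug impl; an illustrative guess at what a manual
// implementation might look like, not the exact code that was benchmarked.
impl fmt::Debug for MyType0 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("MyType0")
            .field("field", &self.field)
            .finish()
    }
}
```

Formatting a value with `{:?}` then yields the same style of output as the derived implementation, for example `MyType0 { field: 7 }`.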

It seems that a plurality of the time (~13%) for most of these derives is spent in type checking, though a deeper investigation into each class of derive is warranted. You can see a `summarize` comparison of `derive` vs. expanded vs. manually implemented code [here](https://gist.github.com/rylev/14430d2e1ea384b1f10efae6b5c385f2).

## Discussion

This topic is already being discussed [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/247081-t-compiler.2Fperformance/topic/Slow.20Builtin.20Derives). 

Some discussion has suggested looking into MIR shims as a possible solution, but since MIR still needs to undergo type checking, and type checking is where the largest share of the time is spent, this probably won't help.
