Description
We have found several times that rustc's performance is basically a giant benchmark of our hashmaps. @orlp notes:

> Resizing the hash table during building basically means all the work you did up until that point was to calculate a single boolean value: your capacity wasn't enough, and nothing else of value was done.
If we can avoid resizing our hashmaps, that will avoid quite a lot of needless hashing. Given that so much of our benchmark time is spent hashing, that could yield a significant overall performance improvement. @orlp wrote this helpful benchmark giving an idea of the order of magnitude of improvement we're talking about:
```rust
use std::collections::HashSet;
use std::hash::BuildHasher;

// Note: `FixedState` is a seedable hasher state (as in the foldhash crate) and
// `CardinalitySketch` is a HyperLogLog-style estimator (as in
// rustc_data_structures); the exact import paths depend on the crates used.

fn main() {
    let start = std::time::Instant::now();
    for _ in 0..100 {
        let mut hs = HashSet::new();
        for x in 0..1_000_000 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("no capacity: {:?}", start.elapsed());

    let start = std::time::Instant::now();
    for _ in 0..100 {
        let mut hs = HashSet::with_capacity(1_000_000);
        for x in 0..1_000_000 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("perfect capacity: {:?}", start.elapsed());

    let start = std::time::Instant::now();
    for _ in 0..100 {
        let mut hs = HashSet::with_capacity(1_000_000);
        for x in 0..hs.capacity() + 1 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("worst-case underestimation: {:?}", start.elapsed());

    let start = std::time::Instant::now();
    for _ in 0..100 {
        let hasher = FixedState::with_seed(42);
        let mut sketch = CardinalitySketch::new();
        for x in 0..1_000_000 {
            sketch.insert(hasher.hash_one(x));
        }
        let mut hs = HashSet::with_capacity(sketch.estimate() * 5 / 4);
        for x in 0..1_000_000 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("cardinality estimate: {:?}", start.elapsed());
}
```
```
no capacity: 1.582721584s
perfect capacity: 337.567208ms
worst-case underestimation: 2.038280209s
cardinality estimate: 412.6665ms
```
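The "worst-case underestimation" row is slow because exceeding the pre-allocated capacity by even one element forces a full reallocation and rehash of everything inserted so far. A minimal sketch of that behavior, relying only on std's documented guarantee that a set can hold `capacity()` elements without reallocating:

```rust
use std::collections::HashSet;

fn main() {
    let mut hs: HashSet<u32> = HashSet::with_capacity(1_000);
    let cap = hs.capacity(); // may be >= 1_000 due to load-factor rounding

    // Filling up to `capacity()` never reallocates.
    for x in 0..cap as u32 {
        hs.insert(x);
    }
    assert_eq!(hs.capacity(), cap);

    // One more insert exceeds the bound: every element is rehashed into a
    // bigger table, so all the hashing done so far is effectively repeated.
    hs.insert(cap as u32);
    assert!(hs.capacity() > cap);
    println!("grew from {} to {}", cap, hs.capacity());
}
```

This is why a slight overestimate (like the `* 5 / 4` fudge factor above) is much cheaper than a slight underestimate.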
There are two main parts to this issue.
- Determine whether making changes here will be an improvement (I feel fairly confident about this, but it's nice to have data). This could be done by either:
  - Looking at a performance sample to see which functions are hot, and how much time resizing functions take up (I think the relevant function is `hashbrown::raw::RawTableInner::resize_inner`). This will also tell us which maps are most in need of pre-allocating (which will be helpful in part 2).
  - Using a very liberal estimate of how much capacity we need (say, 10x the number of items in the crate for each hashmap) and doing a perf run on that to see if we get any improvements.
- Determining an upper bound for each hashmap so we don't allocate more memory than we need. I expect this to be the hard part. (And then, of course, actually switching to using `with_capacity`.)
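As a rough sketch of what the eventual switch could look like (the map, the item count, and the helper name here are hypothetical illustrations, not rustc's actual code): pre-size from a known upper bound, and use `reserve` when a later pass adds a known number of extra entries.

```rust
use std::collections::HashMap;

// Hypothetical helper: `expected_items` stands in for an upper bound the
// compiler already knows before the map is built (e.g. the number of items
// in the crate), instead of letting the map grow and rehash repeatedly.
fn build_table(expected_items: usize) -> HashMap<u32, &'static str> {
    let mut table = HashMap::with_capacity(expected_items);
    for id in 0..expected_items as u32 {
        table.insert(id, "item");
    }
    table
}

fn main() {
    let mut table = build_table(10_000);
    // `with_capacity(n)` guarantees room for at least `n` elements.
    assert!(table.capacity() >= 10_000);

    // If a later pass will add a known number of entries, `reserve` extends
    // the bound up front (room for at least 1_000 *additional* elements).
    table.reserve(1_000);
    assert!(table.capacity() >= 11_000);
    println!("capacity: {}", table.capacity());
}
```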
@rustbot label T-compiler I-slow E-medium E-mentor