Description
This is a tracking issue dedicated to discussing ideas and techniques for improving compile-time with the NLL feature. For the time being, it's meant to serve as a repository of measurements, benchmarks, etc., to help coordinate efforts.
### Benchmarks and timing runs
We can view performance results on perf.rust-lang.org now. Just look for the "NLL" runs. They can be compared against "clean" runs -- the delta is the 'extra work' we are doing when NLL is enabled (note that NLL still runs the old region analysis and borrow check, so it does strictly more work).
### Ideas for improvement or measurement
- Introduce a dirty list (Make region inference use a dirty list #47766)
- More detailed profiling and analysis of the results above
- Measure: what percentage of calls to `dfs` wind up actually adding new info?
- Idea: modify the `dfs` code not to invoke `successors()`, which can allocate, but instead to add some kind of non-allocating form (perhaps with a callback?)
- Experiment with a sparse representation of values (use sparse bitsets instead of dense ones for NLL results #48170)
- ... others? Discuss below, that's what this issue is for =)
### Quick pointers into the source
The main source of time for NLL is probably going to be the code that propagates constraints:

> `rust/src/librustc_mir/borrow_check/nll/region_infer/mod.rs`, lines 453 to 457 in fe7e1a4

and in particular the calls to `dfs`:

> `rust/src/librustc_mir/borrow_check/nll/region_infer/mod.rs`, lines 486 to 495 in fe7e1a4

The `dfs` code must walk the graph to find which points should be added and where:

> `rust/src/librustc_mir/borrow_check/nll/region_infer/dfs.rs`, lines 37 to 40 in fe7e1a4
cc @rust-lang/wg-compiler-nll