Description
TL;DR
Incremental compilation currently will always create 1-2 codegen units per source-level module, regardless of the -Ccodegen-units setting passed to the compiler. This is fine in the majority of cases, but there is no way to control this behavior in cases where it produces too much overhead (see below for examples).
I propose to
- make the compiler honor the -Ccodegen-units setting, even when compiling with -Cincremental, and
- make the compiler default to a higher number of codegen units in case incr. comp. is enabled (256 instead of 16).
The -Ccodegen-units flag would retain exactly the same semantics it has in non-incremental mode, i.e. setting an upper bound for the number of codegen units.
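The proposed semantics can be modeled with a small sketch. This is an illustrative model only, not rustc's actual partitioning code: the names `cgu_limit` and `max_cgus` are hypothetical, and the model assumes every source-level module yields two codegen units before merging (the worst case of the "1-2 per module" figure above).

```rust
/// Default upper bounds named in the proposal.
const DEFAULT_CGUS_NON_INCREMENTAL: usize = 16;
const DEFAULT_CGUS_INCREMENTAL: usize = 256;

/// Upper bound on the number of codegen units under the proposed behavior.
/// `explicit` models an explicit `-Ccodegen-units=N` (hypothetical helper).
fn cgu_limit(explicit: Option<usize>, incremental: bool) -> usize {
    match explicit {
        // An explicit setting is honored in both modes.
        Some(n) => n,
        // Otherwise the default depends on whether incr. comp. is enabled.
        None if incremental => DEFAULT_CGUS_INCREMENTAL,
        None => DEFAULT_CGUS_NON_INCREMENTAL,
    }
}

/// Worst-case number of codegen units after applying the limit,
/// assuming (for this model only) two units per source-level module.
fn max_cgus(modules: usize, explicit: Option<usize>, incremental: bool) -> usize {
    let per_module_upper_bound = modules.saturating_mul(2);
    per_module_upper_bound.min(cgu_limit(explicit, incremental))
}
```

In this model the flag only ever caps the count: a crate whose per-module count already fits under the limit is left alone.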
Why do I consider this a (possibly) major change? Because there is one case where the compiler changes behavior: if someone has explicitly set the number of codegen units. After this change, that setting will start to have an effect, leading to potentially higher compile times. Only one crate in the perf.rlo benchmark suite has such an explicit setting (clap-rs). I expect the fallout to be minor and harmless.
Also note that this opens up a whole new use case for incremental compilation: by setting -Ccodegen-units=1 (or -Ccodegen-units=16, as is the default right now), the compiler can make use of the incremental cache for all of the middle end while producing a binary that exhibits the same runtime performance as a non-incrementally built one.
Links and Details
I ran experiments for this in rust-lang/rust#67834 and the results look good:
- up to 30% compile time reduction for script-servo-debug
- up to 15% compile time reduction for style-servo-debug
- up to 8% compile time reduction for style-servo-opt
However, there are also cases that regress:
- patched incremental: debugging println in dependency in script-servo-opt regresses by 8% due to the lower cache granularity.
- clap-rs regresses by up to 34.7% because it has an explicit (and previously ignored) codegen-units setting in its Cargo.toml. That is easily fixable by clap-rs itself.
These regressions are acceptable, I think, especially because the user can easily regain the previous behavior by setting -Ccodegen-units=9999 (or some other number that is greater than the number of source-level modules times 2). Most crates, however, are well below the default setting of 256 codegen units and won't see any kind of changed behavior.
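As a rough check of the rule of thumb above, the following sketch tests whether a given limit leaves per-module granularity untouched. The function name is hypothetical, and the model assumes at most two codegen units per source-level module, as stated earlier in this proposal; it is not how the compiler itself decides anything.

```rust
/// Returns true if `limit` is greater than the worst-case per-module
/// codegen unit count (two per module), so no units would need merging.
/// Hypothetical helper for illustration only.
fn keeps_per_module_granularity(modules: usize, limit: usize) -> bool {
    limit > modules.saturating_mul(2)
}
```

Under this model, a crate with 100 modules is unaffected by the default of 256, whereas a crate with 200 modules may see units merged unless the limit is raised, for example to 9999.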
Mentors or Reviewers
The implementation should be straightforward, so a reviewer would mostly need to sign off on making the -Ccodegen-units flag suddenly take effect in incremental mode. @nikomatsakis & @pnkfelix as compiler team leads would be good candidates for that.