Description
The tuple-stress benchmark appears to be ridiculously slow with NLL. Profiling suggests that the majority of costs come from the liveness constraint generation code:
rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs
Lines 36 to 42 in 860d169
Specifically, roughly half of the samples (50%) occur in the push_type_live_constraint function:
rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs
Lines 158 to 163 in 860d169
This function primarily consists of a walk over all the free regions within a type:
rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs
Lines 170 to 172 in 860d169
However, the types in question don't really involve regions (they are things like (u32, f64, u32), etc.). It turns out that we have a "flags" mechanism that tracks the content of types, designed for just such a purpose. This should allow us to quickly skip types that contain no regions. The flags are defined here, using the bitflags! macro:
Lines 418 to 419 in 860d169
The flag we are interested in is HAS_FREE_REGIONS:
Lines 432 to 434 in 860d169
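To make the idea concrete, here is a minimal sketch of how such a flag set can work. It assumes flags are stored as bits and that a compound type's flags are the union of its components' flags. Everything except the name HAS_FREE_REGIONS is hypothetical (the `TypeFlags` struct, `HAS_OTHER_PROPERTY`, `tuple_flags`), and it uses plain constants rather than the bitflags! macro so that it stands alone:

```rust
/// Hypothetical stand-in for the compiler's flag set: one bit per property.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TypeFlags(u32);

impl TypeFlags {
    pub const EMPTY: TypeFlags = TypeFlags(0);
    pub const HAS_FREE_REGIONS: TypeFlags = TypeFlags(1 << 0);
    /// Hypothetical second flag, only here to show that bits are independent.
    pub const HAS_OTHER_PROPERTY: TypeFlags = TypeFlags(1 << 1);

    /// Combine two flag sets (bitwise OR).
    pub fn union(self, other: TypeFlags) -> TypeFlags {
        TypeFlags(self.0 | other.0)
    }

    /// True if the two flag sets share at least one bit.
    pub fn intersects(self, other: TypeFlags) -> bool {
        self.0 & other.0 != 0
    }
}

/// Assumption of this sketch: a tuple's flags are the union of the flags
/// of its element types, computed once when the tuple type is built.
pub fn tuple_flags(components: &[TypeFlags]) -> TypeFlags {
    components
        .iter()
        .fold(TypeFlags::EMPTY, |acc, f| acc.union(*f))
}
```

Under this model, a tuple such as (u32, f64, u32) ends up with no HAS_FREE_REGIONS bit, so a single `intersects` check is enough to know there is nothing to find inside it.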
We should be able to optimize for_each_free_region to consult this flag and quickly skip past types that do not contain any regions. for_each_free_region is defined here:
Lines 256 to 260 in 860d169
It uses a "type visitor" to do its work:
Lines 289 to 290 in 860d169
We want to add a callback for the case of visiting types, which will check this flag. Something like the following ought to do it:
fn visit_ty(&mut self, ty: Ty<'tcx>) -> bool {
    // Only descend into types that may contain free regions.
    if ty.flags.intersects(HAS_FREE_REGIONS) {
        self.super_ty(ty)
    } else {
        false // keep visiting
    }
}
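To see why this check should help, here is a self-contained toy model of the same pattern. It is only a sketch: the `Ty`, `Kind`, and `RegionVisitor` types below are hypothetical simplifications (a cached boolean stands in for the flags, and region names are plain strings), not the compiler's real data structures. The visitor counts how many nodes it actually walks, so the effect of the early exit is visible:

```rust
/// Toy type structure (hypothetical, not the compiler's representation).
pub enum Kind {
    Scalar,
    /// A reference with a named region and a pointee type.
    Ref(&'static str, Box<Ty>),
    Tuple(Vec<Ty>),
}

pub struct Ty {
    kind: Kind,
    /// Cached at construction time, standing in for HAS_FREE_REGIONS.
    has_free_regions: bool,
}

impl Ty {
    pub fn scalar() -> Ty {
        Ty { kind: Kind::Scalar, has_free_regions: false }
    }

    pub fn reference(region: &'static str, pointee: Ty) -> Ty {
        Ty { kind: Kind::Ref(region, Box::new(pointee)), has_free_regions: true }
    }

    pub fn tuple(elems: Vec<Ty>) -> Ty {
        let has_free_regions = elems.iter().any(|t| t.has_free_regions);
        Ty { kind: Kind::Tuple(elems), has_free_regions }
    }
}

pub struct RegionVisitor {
    pub regions: Vec<&'static str>,
    pub nodes_walked: usize,
    /// When false, the visitor walks everything (the unoptimized behavior).
    pub use_flag: bool,
}

impl RegionVisitor {
    pub fn new(use_flag: bool) -> Self {
        RegionVisitor { regions: Vec::new(), nodes_walked: 0, use_flag }
    }

    /// Returns `false` to mean "keep visiting", as in the snippet above.
    pub fn visit_ty(&mut self, ty: &Ty) -> bool {
        if self.use_flag && !ty.has_free_regions {
            return false; // nothing to find in here, skip the whole subtree
        }
        self.super_ty(ty)
    }

    /// Structural walk over the components of a type.
    fn super_ty(&mut self, ty: &Ty) -> bool {
        self.nodes_walked += 1;
        match &ty.kind {
            Kind::Scalar => false,
            Kind::Ref(region, pointee) => {
                self.regions.push(*region);
                self.visit_ty(pointee)
            }
            Kind::Tuple(elems) => elems.iter().any(|t| self.visit_ty(t)),
        }
    }
}
```

In this model, visiting a tuple of three scalars walks four nodes without the flag check and zero nodes with it, while a tuple that does contain a reference still reports its region.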