Description
One of the compiler performance issues I found in a codebase involved a big session of type inference + LUBbing in code creating a `Map[A, Seq[Class[_ <: B]]]`, in which there was enough of a lattice in the actual subtypes of `B` to make LUB computation expensive.
I distilled this into a benchmark corpus. That takes about 2s to compile; the original example was 10x that.
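The corpus itself isn't inlined here, but code of roughly this shape (a hypothetical sketch, not the actual benchmark) triggers the same path: with no expected type, the compiler must infer the element type of each `Seq` by LUBbing the distinct `Class[Ci]` types, walking the subtype lattice along the way.

```scala
trait B
trait Left extends B
trait Right extends B
class C1 extends Left with Right
class C2 extends Left
class C3 extends Right

object Repro {
  // Each classOf[Ci] has static type Class[Ci]. Because Class is
  // invariant, the inferred element type is an existential whose
  // bounds come from LUBbing C1..C3 over the lattice above.
  val m = Map(
    "a" -> Seq(classOf[C1], classOf[C2]),
    "b" -> Seq(classOf[C2], classOf[C3])
  )
}
```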
Here’s how the profiler sees things:
https://scala-ci.typesafe.com/view/scala-bench/job/compiler-benchmark/2101/artifact/target/profile-jfr/flame-graph-cpu.svg
https://scala-ci.typesafe.com/view/scala-bench/job/compiler-benchmark/2101/artifact/target/profile-jfr/flame-graph-cpu-reverse.svg
I didn’t go as far as characterizing the big-O complexity with respect to different N in the example. My working assumption is that the primary N is the length of the `Seq.apply` argument list, and I imagine it's O(N^2).
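If someone wanted to characterize the scaling, a generator along these lines (a hypothetical sketch; the class and file names are made up) could emit inputs for a range of N to time scalac against:

```scala
import java.nio.file.{Files, Paths}

object GenBench extends App {
  // Emit one source file per N. Each file declares N classes over a
  // small lattice of traits, then builds an N-element Seq.apply whose
  // elements all have distinct Class[Ci] types, forcing a LUB.
  for (n <- List(50, 100, 200, 400)) {
    val mixins  = (0 until 8).map(i => s"trait M$i extends B").mkString("\n")
    val classes = (1 to n).map(i => s"class C$i extends M${i % 8} with M${(i + 3) % 8}").mkString("\n")
    val elems   = (1 to n).map(i => s"classOf[C$i]").mkString(", ")
    val src =
      s"""trait B
         |$mixins
         |$classes
         |object Test { val xs = Seq($elems) }
         |""".stripMargin
    Files.write(Paths.get(s"bench-$n.scala"), src.getBytes("UTF-8"))
  }
}
```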
The constant factor multiplied by that complexity is existential subtyping, which creates temporary types by substituting type vars in place of the existential quantifiers. These temporary types put some pressure on the uniques cache of types.
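I believe the runtime reflection universe shares the `Types` implementation (and its uniques cache) with the compiler, so the same LUB machinery can be poked at directly; a minimal sketch, assuming only the public `scala-reflect` API:

```scala
import scala.reflect.runtime.universe._

trait B
class C1 extends B
class C2 extends B

object LubPressure extends App {
  // Comparing/combining the existentials Class[_ <: C1] and
  // Class[_ <: C2] substitutes fresh type vars for the quantified
  // parameter, creating short-lived types along the way.
  val tps = List(typeOf[Class[_ <: C1]], typeOf[Class[_ <: C2]])
  println(lub(tps)) // something like Class[_ <: B]
}
```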
I once tried the Dotty approach of excluding types we know to be temporary from the uniques cache: scala/scala@2.13.x...retronym:faster/strong-selective-unique
We still have to allocate to create the type, but by avoiding keeping the weak reference from the uniques map to the short-lived type, we make it eligible for GC almost immediately. I never saw enough benefit from this in big benchmarks to make me follow through with it, but maybe with a focussed benchmark like this one we could measure the improvement and justify the change.
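To be concrete about what "excluding" means mechanically, here's a hypothetical sketch of the idea (not the code on the linked branch): an interner that can be bypassed for types known to be short-lived, so the uniques map never holds even a weak reference to them.

```scala
import java.lang.ref.WeakReference
import java.util.WeakHashMap

// Hypothetical sketch of a selectively-bypassable uniques cache.
// Types created inside `temporarily` are never interned, so nothing
// in the cache delays their collection.
final class SelectiveUniques[T <: AnyRef] {
  private val cache = new WeakHashMap[T, WeakReference[T]]()
  private var skipDepth = 0

  // Run `body` with interning disabled for types created inside it.
  def temporarily[A](body: => A): A = {
    skipDepth += 1
    try body finally skipDepth -= 1
  }

  def unique(tp: T): T =
    if (skipDepth > 0) tp // known-temporary: skip interning entirely
    else {
      val ref = cache.get(tp)
      val existing = if (ref == null) null.asInstanceOf[T] else ref.get
      if (existing != null) existing
      else { cache.put(tp, new WeakReference(tp)); tp }
    }
}
```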