Description
According to perf.rust-lang.org, a "Clean" build of `keccak-check` has a `max-rss` of 637 MB. Here's a Massif profile of the heap memory usage.
The spike is due to a single allocation of 500,363,244 bytes here:
rust/src/librustc/middle/liveness.rs
Line 601 in 28bcffe
Each vector element is a `Users`, which is a three-field struct taking up 12 bytes. `num_live_nodes` is 16,371 and `num_vars` is 2,547, and 12 * 16,371 * 2,547 = 500,363,244.
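To make the arithmetic above easy to check, here is a minimal sketch. The `Users` struct below is a hypothetical stand-in with assumed field names (two `u32`s and a `bool`, as described in this issue), not a copy of the compiler's definition, and `dense_table_bytes` is a helper invented for illustration.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the compiler's `Users` type: two 32-bit
// fields plus a flag. The field names are assumed for illustration.
#[allow(dead_code)]
struct Users {
    reader: u32,
    writer: u32,
    used: bool,
}

// Bytes needed for a dense table holding one `Users` per
// (live node, variable) pair.
fn dense_table_bytes(num_live_nodes: usize, num_vars: usize) -> usize {
    size_of::<Users>() * num_live_nodes * num_vars
}

fn main() {
    // 4 + 4 + 1 = 9 bytes of fields, padded to 12 because of the
    // 4-byte alignment of `u32`.
    println!("size_of::<Users>() = {}", size_of::<Users>());
    println!("dense table = {} bytes", dense_table_bytes(16_371, 2_547));
}
```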
I have one idea to improve this: `Users` is a triple containing two `u32`s and a `bool`, which means that it is 96 bits even though it only contains 65 bits of data. If we split it up so we have 3 vectors instead of a vector of triples, we'd end up with 4 * 16,371 * 2,547 + 4 * 16,371 * 2,547 + 1 * 16,371 * 2,547 = 375,272,433 bytes, which is a reduction of 125,090,811 bytes. This would get `max-rss` down from 637 MB to 512 MB, a reduction of 20%.
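A rough sketch of what the three-vector layout could look like follows. `UsersTable`, its field names, and `soa_bytes` are hypothetical names chosen for illustration; the real change would have to fit into the existing liveness code.

```rust
use std::mem::size_of;

// Hypothetical struct-of-arrays layout: one vector per field instead of
// a single vector of 12-byte structs. No per-element padding is needed.
struct UsersTable {
    readers: Vec<u32>,
    writers: Vec<u32>,
    used: Vec<bool>,
}

impl UsersTable {
    fn new(num_live_nodes: usize, num_vars: usize) -> Self {
        let n = num_live_nodes * num_vars;
        UsersTable {
            readers: vec![0; n],
            writers: vec![0; n],
            used: vec![false; n],
        }
    }

    // Element bytes held by the three vectors (ignores the `Vec` headers
    // and any allocator overhead).
    fn heap_bytes(&self) -> usize {
        self.readers.len() * size_of::<u32>()
            + self.writers.len() * size_of::<u32>()
            + self.used.len() * size_of::<bool>()
    }
}

// Same calculation without allocating anything: 4 + 4 + 1 bytes per entry.
fn soa_bytes(num_live_nodes: usize, num_vars: usize) -> usize {
    (size_of::<u32>() * 2 + size_of::<bool>()) * num_live_nodes * num_vars
}

fn main() {
    // A small table, to avoid allocating hundreds of megabytes here.
    let table = UsersTable::new(10, 7);
    println!("small table = {} bytes", table.heap_bytes());
    println!("keccak-sized table = {} bytes", soa_bytes(16_371, 2_547));
}
```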
Alternatively, if we packed the `bool`s into a bitset we could get it down to 338,787,613 bytes, which is a reduction of 161,575,631 bytes. This would get `max-rss` down from 637 MB to 476 MB, a reduction of 25%. But it might slow things down... it depends on whether the improved locality is outweighed by the extra instructions needed for bit manipulation.
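For the bitset variant, a minimal hand-rolled sketch is below. `BitFlags` and `packed_bytes` are hypothetical names, and the compiler may well have an existing bitset type that would be a better fit. Note that `packed_bytes` rounds the bitset up to a whole byte, so it comes out one byte above the figure quoted above.

```rust
// Hypothetical bit-packed replacement for a `Vec<bool>`:
// 64 flags per `u64` word.
struct BitFlags {
    words: Vec<u64>,
}

impl BitFlags {
    fn new(len: usize) -> Self {
        // Round up so a partial final word still gets storage.
        BitFlags { words: vec![0; (len + 63) / 64] }
    }

    fn set(&mut self, i: usize, value: bool) {
        let mask = 1u64 << (i % 64);
        if value {
            self.words[i / 64] |= mask;
        } else {
            self.words[i / 64] &= !mask;
        }
    }

    fn get(&self, i: usize) -> bool {
        (self.words[i / 64] >> (i % 64)) & 1 == 1
    }
}

// Two `u32` vectors plus one bit per entry, with the bit count rounded
// up to whole bytes.
fn packed_bytes(num_live_nodes: usize, num_vars: usize) -> usize {
    let n = num_live_nodes * num_vars;
    2 * 4 * n + (n + 7) / 8
}

fn main() {
    let mut flags = BitFlags::new(100);
    flags.set(65, true);
    println!("flag 65 = {}, flag 64 = {}", flags.get(65), flags.get(64));
    println!("packed table = {} bytes", packed_bytes(16_371, 2_547));
}
```

The extra shift and mask on every `get` and `set` are the bit-manipulation cost mentioned above.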
@nikomatsakis: do you have any ideas for improving this on the algorithmic side? Is this dense `num_live_nodes * num_vars` representation avoidable?