Skip to content

incr.comp.: Explore delayed read-edge deduplication or getting rid of it entirely. #45873

Closed
@michaelwoerister

Description

@michaelwoerister

At the moment the compiler will always eagerly deduplicate read-edges in CurrentDepGraph::read_index. In order to be able to do this the compiler has to allocate a HashSet for each task. My suspicion is that there is not much duplication to begin with and that there's potential for optimization here:

  • If there is not much duplication to begin with we could skip deduplication entirely for non-anonymous nodes. There should be no semantic difference and a slight performance hit only for actually duplicated nodes.
  • If deduplication seems like a good idea, doing so during serialization instead of immediately has a few potential benefits:
    • Dep-graph serialization happens while LLVM is running in parallel, so more often than not any work done there might not add to the actual compile-time.
    • When compilation exits early with an error, we don't save the dep-graph at all and the work for deduplication has been wasted. If compilation already reached dep-graph serialization, the chances are high that it will succeed.
    • When we already have the whole dep-graph we can implement deduplication more efficiently because we know when we don't even have to allocate a hash set at all or, if we have to, we know which capacity will suffice.

So the first step would be to collect some data on how much read-edge duplication there even is. This is most easily done by modifying the compiler to count duplicates (in librustc/dep_graph/graph.rs) and print the number/percentage in -Zincremental-info. This modified compiler can then be used to compile a number of crates to get an idea what's going on.

If duplication is low (e.g. less than 2% of reads got filtered out), then we could just remove de-duplication and test the effect with a try-build. Otherwise, we can move deduplication to DepGraph::serialize() and measure the performance impact of that.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-incr-compArea: Incremental compilationC-enhancementCategory: An issue proposing an enhancement or a PR with one.E-mentorCall for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion.WG-incr-compWorking group: Incremental compilation

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions