experiment with "inverting" liveness to improve perf of html5ever

html5ever profiles are dominated by code that "applies" the result of computing liveness. In thinking over how this code works, it seems like it might be better to "invert". Currently what happens is this:

- The `util/liveness` code computes, for each point in the CFG, the set of live variables at each point
  - but we want to distinguish two kinds of liveness: live because it will be used, and live because it will be dropped
  - so actually we run the analysis twice 
  - these results are stored as a bitset of variables per program point
- Then the NLL typeck code goes over each program point. For each point, it looks at each live variable at that point, and identifies the regions within that variable's type. Those regions are set to be live at that point.
  - these results are stored as a bitset of program points per region

You can see here the mismatch: in the first case, we store a bitset of variables per point, and in the later, a bitset of points per region. This results in a lot of "conversion" instructions.

I was thinking that it might be more efficient to do the liveness in an inverted sense. Instead of re-using the `util/liveness` code, we would do the following:

- Do one pass over the CFG to create an index mapping each variable to the points where it is referenced (this ought to be roughly an O(n) walk).
- For each variable V whose type T contains regions:
  - We would walk **backward** from each use to find the defs that dominate them.
  - Each block that we visit is a place where V is live, so we can accumulate those blocks into a sparse bit set. 
    - Depending on how we organize the index of uses above, we can make this particularly efficient I imagine, since we should be able to cheaply answer the question "is V defined anywhere in this basic block".
    - We can also handle the drop-live vs use-live distinction as part of this walk in a relatively straightforward fashion I imagine. When we are walking backwards from a drop, we mark points as drop-live and stop at any use *or* definition (since, from that point back, the value is either use-live or not live at all). 
- Once we are done we should have a bit set of points where V is live and a bit set of drop-live points.
  - We then iterate over region R in the type of T and merge that live-set into its value.
  - Similarly for the drop-live points, but only over the relevant regions.

This isn't fundamentally different but it has a few potential advantages:

- We enumerate the regions in the type of each variable ONCE -- and when we do it, we have the full set of live points in hand
- We don't have to do two liveness computations or deal with the somewhat messy job of stitching them back together

One complication I remember now is that for drop-live we also do need the results of initialization at each point. That could be a bit of a pain to reconcile, have to think about it.

All in all, not an easy change, but maybe a good one.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

experiment with "inverting" liveness to improve perf of html5ever #52460

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

experiment with "inverting" liveness to improve perf of html5ever #52460

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions