Support dataflow problems on arbitrary lattices #76044

ecstatic-morse · 2020-08-28T23:31:29Z

This PR implements the last of the proposed extensions I mentioned in the design meeting for the original dataflow refactor. It extends the current dataflow framework to work with arbitrary lattices, not just BitSets. This is a prerequisite for dataflow-enabled MIR const-propagation. Personally, I am skeptical of the usefulness of doing const-propagation pre-monomorphization, since many useful constants only become known after monomorphization (e.g. size_of::<T>()) and users have a natural tendency to hand-optimize the rest. It's probably worth experimenting with, however, and others have shown interest cc @rust-lang/wg-mir-opt.

The Idx associated type is moved from AnalysisDomain to GenKillAnalysis and replaced with an associated Domain type that must implement JoinSemiLattice. Like before, each Analysis defines the "bottom value" for its domain, but can no longer override the dataflow join operator. Analyses that want to use set intersection must now use the lattice::Dual newtype. GenKillAnalysis impls have an additional requirement that Self::Domain: BorrowMut<BitSet<Self::Idx>>, which effectively means that they must use BitSet<Self::Idx> or lattice::Dual<BitSet<Self::Idx>> as their domain.
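To make the `Domain: JoinSemiLattice` requirement and the role of `lattice::Dual` concrete, here is a minimal standalone sketch. It is not the code from this PR: the `Bits` type is a hypothetical 64-element stand-in for `BitSet<Idx>`, and the trait definitions are simplified assumptions modeled on the description above.

```rust
/// Simplified stand-in for the framework's join-semilattice trait (an assumption).
trait JoinSemiLattice: Eq {
    /// Replaces `self` with the least upper bound of `self` and `other`.
    /// Returns `true` if `self` changed.
    fn join(&mut self, other: &Self) -> bool;
}

/// The order-reversed counterpart: greatest lower bound.
trait MeetSemiLattice: Eq {
    /// Replaces `self` with the greatest lower bound; returns `true` on change.
    fn meet(&mut self, other: &Self) -> bool;
}

/// Hypothetical toy bit set with 64 elements, standing in for `BitSet<Idx>`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Bits(u64);

impl JoinSemiLattice for Bits {
    // Join on a powerset ordered by inclusion is set union.
    fn join(&mut self, other: &Self) -> bool {
        let old = self.0;
        self.0 |= other.0;
        self.0 != old
    }
}

impl MeetSemiLattice for Bits {
    // Meet on the same ordering is set intersection.
    fn meet(&mut self, other: &Self) -> bool {
        let old = self.0;
        self.0 &= other.0;
        self.0 != old
    }
}

/// Newtype that flips the ordering: joining a `Dual<T>` meets the inner `T`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Dual<T>(T);

impl<T: MeetSemiLattice> JoinSemiLattice for Dual<T> {
    fn join(&mut self, other: &Self) -> bool {
        self.0.meet(&other.0)
    }
}

fn main() {
    // Union-style analysis: the join accumulates bits.
    let mut may = Bits(0b0011);
    may.join(&Bits(0b0110));
    println!("{:?}", may);

    // Intersection-style analysis: same data, wrapped in `Dual`.
    let mut must = Dual(Bits(0b0011));
    must.join(&Dual(Bits(0b0110)));
    println!("{:?}", must);
}
```

With this shape, an analysis picks union or intersection by choosing its domain type rather than by overriding the join operator.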

Most of these changes were mechanical. However, because a Domain is no longer always a powerset of some index type, we can no longer use an IndexVec<BasicBlock, GenKillSet<A::Idx>> to store cached block transfer functions. Instead, we use a boxed dyn Fn trait object. I discuss a few alternatives to the current approach in a commit message.
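A rough sketch of the type-erasure idea, outside of rustc: the engine only sees a boxed closure over its domain type, while the gen/kill-specific cache is captured inside that closure. The names `GenKill`, `Transfer`, and `cached_transfer` are hypothetical, block indices are plain `usize`, and a `u64` stands in for the bit set.

```rust
/// Hypothetical gen/kill summary for one basic block over a toy `u64` bit set.
#[derive(Clone, Copy)]
struct GenKill {
    gen_bits: u64,
    kill_bits: u64,
}

impl GenKill {
    /// Applies the block's cumulative effect: clear killed bits, then set generated ones.
    fn apply(&self, state: &mut u64) {
        *state = (*state & !self.kill_bits) | self.gen_bits;
    }
}

/// Type-erased per-block transfer function. The engine is generic over `D`
/// and does not need to know that `D` is a bit set.
type Transfer<D> = Box<dyn Fn(usize, &mut D)>;

/// Moves the precomputed per-block summaries into a boxed closure.
fn cached_transfer(per_block: Vec<GenKill>) -> Transfer<u64> {
    Box::new(move |block, state| per_block[block].apply(state))
}

/// Engine-side helper: applies the cached transfer function for `block`.
fn run_block<D>(transfer: &Transfer<D>, block: usize, state: &mut D) {
    transfer(block, state)
}

fn main() {
    let transfer = cached_transfer(vec![
        GenKill { gen_bits: 0b001, kill_bits: 0b010 },
        GenKill { gen_bits: 0b100, kill_bits: 0b001 },
    ]);

    let mut state = 0b010u64;
    run_block(&transfer, 0, &mut state);
    run_block(&transfer, 1, &mut state);
    println!("{:#05b}", state);
}
```

The cost of this design is a dynamic call per block, which is the trade-off discussed in the commit message on alternatives.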

The majority of new lines of code are to preserve existing Graphviz diagrams for those unlucky enough to have to debug dataflow analyses. I find these diagrams incredibly useful when things are going wrong and considered regressing them unacceptable, especially the pretty-printing of MovePathIndex values, which are used in many dataflow analyses. This required a parallel fmt trait used only for printing dataflow domains, as well as a refactoring of the graphviz module now that we cannot expect the domain to be a BitSet. Some features did have to be removed, such as the gen/kill display mode (which I didn't use but existed to mirror the output of the old dataflow framework) and line wrapping. Since I had to rewrite much of it anyway, I took the opportunity to switch to a Visitor for printing dataflow state diffs instead of using cursors, which are error-prone for code that must be generic over both forward and backward analyses. As a side effect of this change, we no longer have quadratic behavior when writing graphviz diagrams for backward dataflow analyses.

r? @pnkfelix

ecstatic-morse · 2020-08-28T23:43:22Z

src/librustc_mir/dataflow/framework/mod.rs

-    /// It is almost certainly wrong to override this, since it automatically applies
-    /// * `inout_set & in_set` if `BOTTOM_VALUE == true`
-    /// * `inout_set | in_set` if `BOTTOM_VALUE == false`
-    ///
-    /// This means that if a bit is not `BOTTOM_VALUE`, it is propagated into all target blocks.
-    /// For clarity, the above statement again from a different perspective:
-    /// A bit in the block's entry set is `!BOTTOM_VALUE` if *any* predecessor block's bit value is
-    /// `!BOTTOM_VALUE`.
-    ///
-    /// There are situations where you want the opposite behaviour: propagate only if *all*
-    /// predecessor blocks's value is `!BOTTOM_VALUE`.
-    /// E.g. if you want to know whether a bit is *definitely* set at a specific location. This
-    /// means that all code paths leading to the location must have set the bit, instead of any
-    /// code path leading there.
-    ///
-    /// If you want this kind of "definitely set" analysis, you need to
-    /// 1. Invert `BOTTOM_VALUE`
-    /// 2. Reset the `entry_set` in `start_block_effect` to `!BOTTOM_VALUE`
-    /// 3. Override `join` to do the opposite from what it's doing now.


This comment has not been preserved, since I didn't write it and it is very much tied to the existing nomenclature. Perhaps there's a way to distill it into a form that is compatible with the new interface @oli-obk?

Nah, this is fine, the new interface allows us to do this properly instead of dancing with booleans.
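For readers who miss the removed comment, the distinction it described can be sketched without any framework types. This is an illustrative toy, not code from the PR: `maybe_set` and `definitely_set` are hypothetical helpers, and a `u64` stands in for a bit set of tracked facts.

```rust
/// Entry state of a merge block under a "maybe set" analysis:
/// bottom is the empty set and the join is union, so a bit is set
/// if *any* predecessor sets it.
fn maybe_set(pred_exits: &[u64]) -> u64 {
    pred_exits.iter().fold(0, |acc, s| acc | s)
}

/// Entry state under a "definitely set" analysis:
/// bottom is the full set and the join is intersection, so a bit is set
/// only if *all* predecessors set it.
fn definitely_set(pred_exits: &[u64]) -> u64 {
    pred_exits.iter().fold(u64::MAX, |acc, s| acc & s)
}

fn main() {
    // Exit states of two predecessor blocks.
    let preds = [0b011, 0b110];
    println!("maybe:      {:#05b}", maybe_set(&preds));
    println!("definitely: {:#05b}", definitely_set(&preds));
}
```

Under the new interface, the second variant corresponds to choosing a `lattice::Dual` domain instead of inverting a boolean and overriding the join.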

ecstatic-morse · 2020-08-30T00:40:36Z

@bors try
@rust-timer queue

rust-timer · 2020-08-30T00:40:37Z

Awaiting bors try build completion

bors · 2020-08-30T00:40:52Z

⌛ Trying commit e47c22b with merge 710b62932438d91c6d28287e41dc5657797baf8e...

bors · 2020-08-30T01:22:18Z

☀️ Try build successful - checks-actions, checks-azure
Build commit: 710b62932438d91c6d28287e41dc5657797baf8e (710b62932438d91c6d28287e41dc5657797baf8e)

rust-timer · 2020-08-30T01:22:20Z

Queued 710b62932438d91c6d28287e41dc5657797baf8e with parent 5c27700, future comparison URL.

rust-timer · 2020-08-30T04:33:00Z

Finished benchmarking try commit (710b62932438d91c6d28287e41dc5657797baf8e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

bjorn3 · 2020-08-30T07:18:12Z

Mixed <1% improvements and regressions. Likely noise.

A few small cleanups to `BitSet` and friends:

- Overload `clone_from` for `BitSet`.
- Improve `Debug` representation of `HybridBitSet`.
- Make `HybridBitSet::domain_size` public.
- Don't require `T: Idx` at the type level. The `Idx` bound is still on most `BitSet` methods, but like `HashMap`, it doesn't need to be satisfied for the type to exist.

I've tried a few ways of implementing this, but each fell short. Adding an auxiliary `_Idx` associated type to `Analysis` that defaults to `!` but is overridden in the blanket impl of `Analysis` for `A: GenKillAnalysis` to `A::Idx` seems promising, but the trait solver is unable to prove equivalence between `A::Idx` and `A::_Idx` within the overridden version of `into_engine`. Without full-featured specialization, removing `into_engine` or splitting it into a different trait would have a significant ergonomic penalty. Alternatively, we could erase the index type and store a `GenKillSet<u32>` as well as a function pointer for transmuting between `&mut A::Domain` and `&mut BitSet<u32>` in the hopes that LLVM can devirtualize a simple function pointer better than the boxed closure. However, this is brittle, requires `unsafe` code, and doesn't work for index types that aren't the same size as a `u32` (e.g. `usize`) since `GenKillSet` stores a `HybridBitSet`, which may be a `Vec<I>`. Perhaps safe transmute could help here?

oli-obk · 2020-09-07T14:25:26Z

I'm fine with the trait object indirection for now. We can do more experimentation in future PRs (from this point on other ppl can do work, idk who but you would have been able to pull this PR off).

I went through commit by commit, all the changes seem very reasonable to me. I think we may be able to simplify some lints in clippy now, too.

@bors r+

bors · 2020-09-07T14:25:28Z

📌 Commit b015109 has been approved by oli-obk

bors · 2020-09-07T21:29:48Z

⌛ Testing commit b015109 with merge 0e2c128...

bors · 2020-09-07T23:42:46Z

☀️ Test successful - checks-actions, checks-azure
Approved by: oli-obk
Pushing 0e2c128 to master...

Daniel-B-Smith · 2020-09-08T20:04:40Z

Just a heads up, this PR broke Clippy: https://github.com/rust-lang/rust-clippy/blob/master/clippy_lints/src/redundant_clone.rs#L17 I'm happy to do the fix over there, but I would need some pointers as to what the replacement API should be.

ecstatic-morse · 2020-09-08T20:19:42Z

Since the switch to subtrees, all PRs to rustc must be able to build clippy and pass its test suite. This one is no exception, and contains the necessary fix already. You need to follow the instructions for pulling in upstream changes to the rust-clippy repo.

mati865 · 2020-09-08T20:24:24Z

@Daniel-B-Smith like @ecstatic-morse said, it's already fixed for Clippy in this repository.
Soon one of the Clippy team members will pull the commit from here into the Clippy repo.


rust-highfive assigned pnkfelix Aug 28, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 28, 2020

ecstatic-morse added 6 commits August 30, 2020 11:13

Add regex dependency to librustc_mir

a88dc37

Allow access to the underlying Results from a ResultsCursor

9e45e90

Extend dataflow framework to support arbitrary lattices

3233fb1

Update dataflow analyses to use new interface

b19b8ea

ecstatic-morse force-pushed the dataflow-lattice branch from e47c22b to c03eba2 Compare August 30, 2020 18:16

ecstatic-morse added 2 commits August 30, 2020 13:27

Expand documentation for the lattice module

e178a87

Add documentation to the Analysis traits

b015109

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 7, 2020

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 7, 2020

bors merged commit 0e2c128 into rust-lang:master Sep 7, 2020

rustbot added this to the 1.48.0 milestone Sep 7, 2020

ecstatic-morse deleted the dataflow-lattice branch October 6, 2020 01:42
