Skip to content

Support dataflow problems on arbitrary lattices #76044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Sep 7, 2020

Conversation

ecstatic-morse
Copy link
Contributor

@ecstatic-morse ecstatic-morse commented Aug 28, 2020

This PR implements last of the proposed extensions I mentioned in the design meeting for the original dataflow refactor. It extends the current dataflow framework to work with arbitrary lattices, not just BitSets. This is a prerequisite for dataflow-enabled MIR const-propagation. Personally, I am skeptical of the usefulness of doing const-propagation pre-monomorphization, since many useful constants only become known after monomorphization (e.g. size_of::<T>()) and users have a natural tendency to hand-optimize the rest. It's probably worth exprimenting with, however, and others have shown interest cc @rust-lang/wg-mir-opt.

The Idx associated type is moved from AnalysisDomain to GenKillAnalysis and replaced with an associated Domain type that must implement JoinSemiLattice. Like before, each Analysis defines the "bottom value" for its domain, but can no longer override the dataflow join operator. Analyses that want to use set intersection must now use the lattice::Dual newtype. GenKillAnalysis impls have an additional requirement that Self::Domain: BorrowMut<BitSet<Self::Idx>>, which effectively means that they must use BitSet<Self::Idx> or lattice::Dual<BitSet<Self::Idx>> as their domain.

Most of these changes were mechanical. However, because a Domain is no longer always a powerset of some index type, we can no longer use an IndexVec<BasicBlock, GenKillSet<A::Idx>>> to store cached block transfer functions. Instead, we use a boxed dyn Fn trait object. I discuss a few alternatives to the current approach in a commit message.

The majority of new lines of code are to preserve existing Graphviz diagrams for those unlucky enough to have to debug dataflow analyses. I find these diagrams incredibly useful when things are going wrong and considered regressing them unacceptable, especially the pretty-printing of MovePathIndexs, which are used in many dataflow analyses. This required a parallel fmt trait used only for printing dataflow domains, as well as a refactoring of the graphviz module now that we cannot expect the domain to be a BitSet. Some features did have to be removed, such as the gen/kill display mode (which I didn't use but existed to mirror the output of the old dataflow framework) and line wrapping. Since I had to rewrite much of it anyway, I took the opportunity to switch to a Visitor for printing dataflow state diffs instead of using cursors, which are error prone for code that must be generic over both forward and backward analyses. As a side-effect of this change, we no longer have quadratic behavior when writing graphviz diagrams for backward dataflow analyses.

r? @pnkfelix

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 28, 2020
Comment on lines 71 to 89
/// It is almost certainly wrong to override this, since it automatically applies
/// * `inout_set & in_set` if `BOTTOM_VALUE == true`
/// * `inout_set | in_set` if `BOTTOM_VALUE == false`
///
/// This means that if a bit is not `BOTTOM_VALUE`, it is propagated into all target blocks.
/// For clarity, the above statement again from a different perspective:
/// A bit in the block's entry set is `!BOTTOM_VALUE` if *any* predecessor block's bit value is
/// `!BOTTOM_VALUE`.
///
/// There are situations where you want the opposite behaviour: propagate only if *all*
/// predecessor blocks's value is `!BOTTOM_VALUE`.
/// E.g. if you want to know whether a bit is *definitely* set at a specific location. This
/// means that all code paths leading to the location must have set the bit, instead of any
/// code path leading there.
///
/// If you want this kind of "definitely set" analysis, you need to
/// 1. Invert `BOTTOM_VALUE`
/// 2. Reset the `entry_set` in `start_block_effect` to `!BOTTOM_VALUE`
/// 3. Override `join` to do the opposite from what it's doing now.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment has not been preserved, since I didn't write it and it is very much tied to the existing nomenclature. Perhaps there's a way to distill it into a form that is compatible with the new interface @oli-obk?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, this is fine, the new interface allows us to do this properly instead of dancing with booleans.

@ecstatic-morse
Copy link
Contributor Author

@bors try
@rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Collaborator

bors commented Aug 30, 2020

⌛ Trying commit e47c22b with merge 710b62932438d91c6d28287e41dc5657797baf8e...

@bors
Copy link
Collaborator

bors commented Aug 30, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: 710b62932438d91c6d28287e41dc5657797baf8e (710b62932438d91c6d28287e41dc5657797baf8e)

@rust-timer
Copy link
Collaborator

Queued 710b62932438d91c6d28287e41dc5657797baf8e with parent 5c27700, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (710b62932438d91c6d28287e41dc5657797baf8e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

@bjorn3
Copy link
Member

bjorn3 commented Aug 30, 2020

Mixed <1% improvements and regressions. Likely noise.

A few small cleanups to `BitSet` and friends:

- Overload `clone_from` for `BitSet`.
- Improve `Debug` represenation of `HybridBitSet`.
- Make `HybridBitSet::domain_size` public.
- Don't require `T: Idx` at the type level. The `Idx` bound is still on
  most `BitSet` methods, but like `HashMap`, it doesn't need to be
  satisfied for the type to exist.
I've tried a few ways of implementing this, but each fell short.

Adding an auxiliary `_Idx` associated type to `Analysis` that defaults
to `!` but is overridden in the blanket impl of `Analysis` for `A:
GenKillAnalysis` to `A::Idx` seems promising, but the trait solver is
unable to prove equivalence between `A::Idx` and `A::_Idx` within the
overridden version of `into_engine`. Without full-featured
specialization, removing `into_engine` or splitting it into a different
trait would have a significant ergonomic penalty.

Alternatively, we could erase the index type and store a
`GenKillSet<u32>` as well as a function pointer for transmuting between
`&mut A::Domain` and `&mut BitSet<u32>` in the hopes that LLVM can
devirtualize a simple function pointer better than the boxed closure.
However, this is brittle, requires `unsafe` code, and doesn't work for
index types that aren't the same size as a `u32` (e.g. `usize`) since
`GenKillSet` stores a `HybridBitSet`, which may be a `Vec<I>`. Perhaps
safe transmute could help here?
@oli-obk
Copy link
Contributor

oli-obk commented Sep 7, 2020

I'm fine with the trait object indirection for now. We can do more experimentation in future PRs (from this point on other ppl can do work, idk who but you would have been able to pull this PR off).

I went through commit by commit, all the changes seem very reasonable to me. I think we may be able to simplify some lints in clippy now, too.

@bors r+

@bors
Copy link
Collaborator

bors commented Sep 7, 2020

📌 Commit b015109 has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 7, 2020
@bors
Copy link
Collaborator

bors commented Sep 7, 2020

⌛ Testing commit b015109 with merge 0e2c128...

@bors
Copy link
Collaborator

bors commented Sep 7, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: oli-obk
Pushing 0e2c128 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 7, 2020
@bors bors merged commit 0e2c128 into rust-lang:master Sep 7, 2020
@rustbot rustbot added this to the 1.48.0 milestone Sep 7, 2020
@Daniel-B-Smith
Copy link
Contributor

Just a heads up, this PR broke Clippy: https://github.com/rust-lang/rust-clippy/blob/master/clippy_lints/src/redundant_clone.rs#L17 I'm happy to do the fix over there, but I would need some pointers as to what the replacement API should be.

@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented Sep 8, 2020

Since the switch to subtrees, all PRs to rustc must be able to build clippy and pass its test suite. This one is no exception, and contains the necessary fix already. You need to follow the instructions for pulling in upstream changes to the rustc-clippy repo.

@mati865
Copy link
Contributor

mati865 commented Sep 8, 2020

@Daniel-B-Smith like @ecstatic-morse said it's already fixed for Clippy in this repository.
Soon one of Clippy team members will pull commit from here to Clippy repo.

flip1995 pushed a commit to flip1995/rust that referenced this pull request Sep 10, 2020
…i-obk

Support dataflow problems on arbitrary lattices

This PR implements last of the proposed extensions I mentioned in the design meeting for the original dataflow refactor. It extends the current dataflow framework to work with arbitrary lattices, not just `BitSet`s. This is a prerequisite for dataflow-enabled MIR const-propagation. Personally, I am skeptical of the usefulness of doing const-propagation pre-monomorphization, since many useful constants only become known after monomorphization (e.g. `size_of::<T>()`) and users have a natural tendency to hand-optimize the rest. It's probably worth exprimenting with, however, and others have shown interest cc `@rust-lang/wg-mir-opt.`

The `Idx` associated type is moved from `AnalysisDomain` to `GenKillAnalysis` and replaced with an associated `Domain` type that must implement `JoinSemiLattice`. Like before, each `Analysis` defines the "bottom value" for its domain, but can no longer override the dataflow join operator. Analyses that want to use set intersection must now use the `lattice::Dual` newtype. `GenKillAnalysis` impls have an additional requirement that `Self::Domain: BorrowMut<BitSet<Self::Idx>>`, which effectively means that they must use `BitSet<Self::Idx>` or `lattice::Dual<BitSet<Self::Idx>>` as their domain.

Most of these changes were mechanical. However, because a `Domain` is no longer always a powerset of some index type, we can no longer use an `IndexVec<BasicBlock, GenKillSet<A::Idx>>>` to store cached block transfer functions. Instead, we use a boxed `dyn Fn` trait object. I discuss a few alternatives to the current approach in a commit message.

The majority of new lines of code are to preserve existing Graphviz diagrams for those unlucky enough to have to debug dataflow analyses. I find these diagrams incredibly useful when things are going wrong and considered regressing them unacceptable, especially the pretty-printing of `MovePathIndex`s, which are used in many dataflow analyses. This required a parallel `fmt` trait used only for printing dataflow domains, as well as a refactoring of the `graphviz` module now that we cannot expect the domain to be a `BitSet`. Some features did have to be removed, such as the gen/kill display mode (which I didn't use but existed to mirror the output of the old dataflow framework) and line wrapping. Since I had to rewrite much of it anyway, I took the opportunity to switch to a `Visitor` for printing dataflow state diffs instead of using cursors, which are error prone for code that must be generic over both forward and backward analyses. As a side-effect of this change, we no longer have quadratic behavior when writing graphviz diagrams for backward dataflow analyses.

r? `@pnkfelix`
@ecstatic-morse ecstatic-morse deleted the dataflow-lattice branch October 6, 2020 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants