-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Support dataflow problems on arbitrary lattices #76044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/// It is almost certainly wrong to override this, since it automatically applies | ||
/// * `inout_set & in_set` if `BOTTOM_VALUE == true` | ||
/// * `inout_set | in_set` if `BOTTOM_VALUE == false` | ||
/// | ||
/// This means that if a bit is not `BOTTOM_VALUE`, it is propagated into all target blocks. | ||
/// For clarity, the above statement again from a different perspective: | ||
/// A bit in the block's entry set is `!BOTTOM_VALUE` if *any* predecessor block's bit value is | ||
/// `!BOTTOM_VALUE`. | ||
/// | ||
/// There are situations where you want the opposite behaviour: propagate only if *all* | ||
/// predecessor blocks's value is `!BOTTOM_VALUE`. | ||
/// E.g. if you want to know whether a bit is *definitely* set at a specific location. This | ||
/// means that all code paths leading to the location must have set the bit, instead of any | ||
/// code path leading there. | ||
/// | ||
/// If you want this kind of "definitely set" analysis, you need to | ||
/// 1. Invert `BOTTOM_VALUE` | ||
/// 2. Reset the `entry_set` in `start_block_effect` to `!BOTTOM_VALUE` | ||
/// 3. Override `join` to do the opposite from what it's doing now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment has not been preserved, since I didn't write it and it is very much tied to the existing nomenclature. Perhaps there's a way to distill it into a form that is compatible with the new interface @oli-obk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, this is fine, the new interface allows us to do this properly instead of dancing with booleans.
@bors try |
Awaiting bors try build completion |
⌛ Trying commit e47c22b with merge 710b62932438d91c6d28287e41dc5657797baf8e... |
☀️ Try build successful - checks-actions, checks-azure |
Queued 710b62932438d91c6d28287e41dc5657797baf8e with parent 5c27700, future comparison URL. |
Finished benchmarking try commit (710b62932438d91c6d28287e41dc5657797baf8e): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
Mixed <1% improvements and regressions. Likely noise. |
A few small cleanups to `BitSet` and friends: - Overload `clone_from` for `BitSet`. - Improve `Debug` represenation of `HybridBitSet`. - Make `HybridBitSet::domain_size` public. - Don't require `T: Idx` at the type level. The `Idx` bound is still on most `BitSet` methods, but like `HashMap`, it doesn't need to be satisfied for the type to exist.
I've tried a few ways of implementing this, but each fell short. Adding an auxiliary `_Idx` associated type to `Analysis` that defaults to `!` but is overridden in the blanket impl of `Analysis` for `A: GenKillAnalysis` to `A::Idx` seems promising, but the trait solver is unable to prove equivalence between `A::Idx` and `A::_Idx` within the overridden version of `into_engine`. Without full-featured specialization, removing `into_engine` or splitting it into a different trait would have a significant ergonomic penalty. Alternatively, we could erase the index type and store a `GenKillSet<u32>` as well as a function pointer for transmuting between `&mut A::Domain` and `&mut BitSet<u32>` in the hopes that LLVM can devirtualize a simple function pointer better than the boxed closure. However, this is brittle, requires `unsafe` code, and doesn't work for index types that aren't the same size as a `u32` (e.g. `usize`) since `GenKillSet` stores a `HybridBitSet`, which may be a `Vec<I>`. Perhaps safe transmute could help here?
e47c22b
to
c03eba2
Compare
I'm fine with the trait object indirection for now. We can do more experimentation in future PRs (from this point on other ppl can do work, idk who but you would have been able to pull this PR off). I went through commit by commit, all the changes seem very reasonable to me. I think we may be able to simplify some lints in clippy now, too. @bors r+ |
📌 Commit b015109 has been approved by |
☀️ Test successful - checks-actions, checks-azure |
Just a heads up, this PR broke Clippy: https://github.com/rust-lang/rust-clippy/blob/master/clippy_lints/src/redundant_clone.rs#L17 I'm happy to do the fix over there, but I would need some pointers as to what the replacement API should be. |
Since the switch to subtrees, all PRs to |
@Daniel-B-Smith like @ecstatic-morse said it's already fixed for Clippy in this repository. |
…i-obk Support dataflow problems on arbitrary lattices This PR implements last of the proposed extensions I mentioned in the design meeting for the original dataflow refactor. It extends the current dataflow framework to work with arbitrary lattices, not just `BitSet`s. This is a prerequisite for dataflow-enabled MIR const-propagation. Personally, I am skeptical of the usefulness of doing const-propagation pre-monomorphization, since many useful constants only become known after monomorphization (e.g. `size_of::<T>()`) and users have a natural tendency to hand-optimize the rest. It's probably worth exprimenting with, however, and others have shown interest cc `@rust-lang/wg-mir-opt.` The `Idx` associated type is moved from `AnalysisDomain` to `GenKillAnalysis` and replaced with an associated `Domain` type that must implement `JoinSemiLattice`. Like before, each `Analysis` defines the "bottom value" for its domain, but can no longer override the dataflow join operator. Analyses that want to use set intersection must now use the `lattice::Dual` newtype. `GenKillAnalysis` impls have an additional requirement that `Self::Domain: BorrowMut<BitSet<Self::Idx>>`, which effectively means that they must use `BitSet<Self::Idx>` or `lattice::Dual<BitSet<Self::Idx>>` as their domain. Most of these changes were mechanical. However, because a `Domain` is no longer always a powerset of some index type, we can no longer use an `IndexVec<BasicBlock, GenKillSet<A::Idx>>>` to store cached block transfer functions. Instead, we use a boxed `dyn Fn` trait object. I discuss a few alternatives to the current approach in a commit message. The majority of new lines of code are to preserve existing Graphviz diagrams for those unlucky enough to have to debug dataflow analyses. I find these diagrams incredibly useful when things are going wrong and considered regressing them unacceptable, especially the pretty-printing of `MovePathIndex`s, which are used in many dataflow analyses. This required a parallel `fmt` trait used only for printing dataflow domains, as well as a refactoring of the `graphviz` module now that we cannot expect the domain to be a `BitSet`. Some features did have to be removed, such as the gen/kill display mode (which I didn't use but existed to mirror the output of the old dataflow framework) and line wrapping. Since I had to rewrite much of it anyway, I took the opportunity to switch to a `Visitor` for printing dataflow state diffs instead of using cursors, which are error prone for code that must be generic over both forward and backward analyses. As a side-effect of this change, we no longer have quadratic behavior when writing graphviz diagrams for backward dataflow analyses. r? `@pnkfelix`
This PR implements last of the proposed extensions I mentioned in the design meeting for the original dataflow refactor. It extends the current dataflow framework to work with arbitrary lattices, not just
BitSet
s. This is a prerequisite for dataflow-enabled MIR const-propagation. Personally, I am skeptical of the usefulness of doing const-propagation pre-monomorphization, since many useful constants only become known after monomorphization (e.g.size_of::<T>()
) and users have a natural tendency to hand-optimize the rest. It's probably worth exprimenting with, however, and others have shown interest cc @rust-lang/wg-mir-opt.The
Idx
associated type is moved fromAnalysisDomain
toGenKillAnalysis
and replaced with an associatedDomain
type that must implementJoinSemiLattice
. Like before, eachAnalysis
defines the "bottom value" for its domain, but can no longer override the dataflow join operator. Analyses that want to use set intersection must now use thelattice::Dual
newtype.GenKillAnalysis
impls have an additional requirement thatSelf::Domain: BorrowMut<BitSet<Self::Idx>>
, which effectively means that they must useBitSet<Self::Idx>
orlattice::Dual<BitSet<Self::Idx>>
as their domain.Most of these changes were mechanical. However, because a
Domain
is no longer always a powerset of some index type, we can no longer use anIndexVec<BasicBlock, GenKillSet<A::Idx>>>
to store cached block transfer functions. Instead, we use a boxeddyn Fn
trait object. I discuss a few alternatives to the current approach in a commit message.The majority of new lines of code are to preserve existing Graphviz diagrams for those unlucky enough to have to debug dataflow analyses. I find these diagrams incredibly useful when things are going wrong and considered regressing them unacceptable, especially the pretty-printing of
MovePathIndex
s, which are used in many dataflow analyses. This required a parallelfmt
trait used only for printing dataflow domains, as well as a refactoring of thegraphviz
module now that we cannot expect the domain to be aBitSet
. Some features did have to be removed, such as the gen/kill display mode (which I didn't use but existed to mirror the output of the old dataflow framework) and line wrapping. Since I had to rewrite much of it anyway, I took the opportunity to switch to aVisitor
for printing dataflow state diffs instead of using cursors, which are error prone for code that must be generic over both forward and backward analyses. As a side-effect of this change, we no longer have quadratic behavior when writing graphviz diagrams for backward dataflow analyses.r? @pnkfelix