-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Ruby: more type-tracking steps #10650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3c799e7
to
d1d2fae
Compare
cc09b73
to
31378ec
Compare
fixup to LoadStore
fixup ContentFilter fixup basicWith(out)contentstep
Ruby: fixup type-tracking hash flow test Fixup! type-tracking hash flow test result
fixup docs fixup docs fixup TypeTrackingStep
4a20227
to
9485940
Compare
Rebased to resolve conflict with #10664. Started another evaluation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review comments. We should be able to revert the Hash changes once #10686 is in.
Thanks for the review! I'll work on getting type-tracking to rely on the |
New evaluation looks... unexpectedly good. We now gain 21,000 call edges at a very modest cost. Previously we only gained 2k call edges. I'll take some time to investigate exactly how the recent changes caused this change. |
You may want to remove this from the description now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this LGTM. Some comments.
@@ -381,7 +381,7 @@ private module Cached { | |||
n instanceof SynthReturnNode | |||
or | |||
// Needed for stores in type tracking | |||
TypeTrackerSpecific::postUpdateStoreStep(_, n, _) | |||
TypeTrackerSpecific::storeStepIntoSourceNode(_, n, _) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I belive we should remove
not n instanceof SynthHashSplatParameterNode
above as well.
@@ -1,6 +1,7 @@ | |||
/** Provides flow summaries for the `Hash` class. */ | |||
|
|||
private import codeql.ruby.AST | |||
private import codeql.ruby.CFG as Cfg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this import.
* of the standard library, and their implementations may not depend on API graphs. | ||
* For query-specific steps, consider including the custom steps in the type-tracking predicate itself. | ||
*/ | ||
class TypeTrackingStep extends TUnit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced that we should add this, especially as it is not used presently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine, let's remove it for now
nodeFrom = evaluateSummaryComponentLocal(call, input) and | ||
nodeTo = evaluateSummaryComponentLocal(call, output) | ||
nodeFrom = evaluateSummaryComponentStackLocal(callable, call, input) and | ||
nodeTo = evaluateSummaryComponentStackLocal(callable, call, output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a store step induced from a flow summary
input = Argument[0] and
output = Argument[1].Element[?]
and a call to the relevant method
foo(x, y)
doesn't this yield a store-step x -store-> y
, while it should have been x -store-> [post] y
?
The same for read-store steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type tracking isn't supposed to be flow sensitive. The intent is for the store to ultimately target the local sources of y
and flow from there to its uses.
There's some pre-existing code here about post-updates, which I don't really agree with, but for this PR I'm trying not to change too many things at once.
} | ||
|
||
pragma[nomagic] | ||
private predicate hasWithoutContentSummary( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work with compound WithoutContent
s (see #10691)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If at most one of those WithoutContent
has an associated ContentFilter
then yes, because the rest are ignored and treated as ordinary flow (in evaluateSummaryComponentStackLocal
).
) | ||
} | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy-paste error (mentions WithoutContent
.
} | ||
|
||
pragma[nomagic] | ||
private predicate hasWithContentSummary( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as for hasWithoutContentSummary
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, doesn't handle compound WithContent
.
filter = getFilterFromWithoutContentStep(content) and | ||
not isNonLocal(input.head()) and | ||
not isNonLocal(output.head()) and | ||
input != output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This restriction is super important, as otherwise use-use flow will ignore it (that's what prohibitsUseUseFlow
is there for in data flow). Perhaps highlight that with a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type-tracking doesn't depend on prohibitsUseUseFlow
. It's simply here to clean up some steps that would have no effect for type-tracking.
Expands on the work in #10375 to add more type-tracking steps.
In particular, we are now able to generate type-tracking steps in more cases:
WithContent
andWithoutContent
are now supported with precision that's more appropriate for type tracking.There are a couple of performance tweaks along the way, resulting in mostly neutral performance overall.
I found the easiest way to leverage the existing test suite was to add inline test that rely on type-tracking, and simply let the test output document the different between DataFlow::Configuration flow and type-tracking flow.
Evaluation shows mostly neutral performance and around 21,000 new call edges.
I triaged some of the changes in an earlier evaluation, and the only spurious call edge I could find was the call to delete! which I believe is ultimately because a
yield
call resolved to a callback coming from a different caller. (In JS, plain callbacks are not part of the static call graph and are handled specially in type-tracking - we could do the same in Ruby but it doesn't seem urgent.)There’s still some work to be done, but given the performance-related commits in here, I’d prefer to merge sooner rather than later, so we don’t end up redoing or undoing each other’s performance work.
Edit: updated text to reflect final state of PR