Ruby: more type-tracking steps #10650

asgerf · 2022-09-30T20:39:38Z

Expands on the work in #10375 to add more type-tracking steps.

In particular, we are now able to generate type-tracking steps in more cases:

The input and/or output can now be mapped to a parameter or return of a callback/block.
WithContent and WithoutContent are now supported with precision that's more appropriate for type tracking.
When both input and output end with a content, we can now map it to a load-store step.
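As a rough illustration of the three cases above, here is a hypothetical Ruby snippet (the `Greeter` class and variable names are made up for this example and do not come from the PR's tests). The comments describe how the flow summaries for `map` and `reverse` are assumed to be shaped; the actual models may differ in detail.

```ruby
# Hypothetical class for illustration only.
class Greeter
  def greet
    "hello"
  end
end

# 1. Input mapped to a block parameter: the elements of the receiver reach `g`,
#    so the call `g.greet` is a candidate for resolving to Greeter#greet.
greetings = [Greeter.new].map { |g| g.greet }

# 2. Output mapped from a block's return value: whatever the block returns
#    ends up as an element of the array returned by `map`.
wrapped = [1].map { |_| Greeter.new }
from_block = wrapped.first.greet

# 3. Content on both sides (load-store): `reverse` reads elements from its
#    receiver and stores them as elements of its result.
from_reverse = [Greeter.new].reverse.first.greet
```

In each case the final `greet` call is only resolvable if the tracked `Greeter` instance survives the trip through the library method, which is what the additional steps are meant to allow.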

There are a couple of performance tweaks along the way, resulting in mostly neutral performance overall.

I found the easiest way to leverage the existing test suite was to add inline tests that rely on type-tracking, and simply let the test output document the difference between DataFlow::Configuration flow and type-tracking flow.

Evaluation shows mostly neutral performance and around 21,000 new call edges.

I triaged some of the changes in an earlier evaluation, and the only spurious call edge I could find was the call to delete! which I believe is ultimately because a yield call resolved to a callback coming from a different caller. (In JS, plain callbacks are not part of the static call graph and are handled specially in type-tracking - we could do the same in Ruby but it doesn't seem urgent.)

There’s still some work to be done, but given the performance-related commits in here, I’d prefer to merge sooner rather than later, so we don’t end up redoing or undoing each other’s performance work.

Edit: updated text to reflect final state of PR

asgerf · 2022-10-04T09:21:51Z

Rebased to resolve conflict with #10664. Started another evaluation.

asgerf · 2022-10-04T11:56:15Z

New evaluation

hvitved

Initial review comments. We should be able to revert the Hash changes once #10686 is in.

ruby/ql/lib/codeql/ruby/dataflow/internal/DataFlowDispatch.qll

ruby/ql/lib/codeql/ruby/typetracking/TypeTracker.qll

ruby/ql/lib/codeql/ruby/frameworks/core/Hash.qll

ruby/ql/lib/codeql/ruby/typetracking/TypeTracker.qll

asgerf · 2022-10-05T07:45:16Z

Thanks for the review! I'll work on getting type-tracking to rely on the Argument[hash-splat]-based summary instead, to replace the hand-written steps for Hash.[].

asgerf · 2022-10-05T11:23:46Z

New evaluation looks... unexpectedly good. We now gain 21,000 call edges at a very modest cost. Previously we only gained 2k call edges. I'll take some time to investigate exactly how the recent changes caused this change.

hvitved · 2022-10-05T11:28:24Z

> I had to specialise handling of the Hash.[] summary as it would otherwise generate hundreds of useless edges for each element in a hash literal (the number depends on the number of constants in the program). For example, on one project a literal { foo: bar } would map to 648 load-store steps from the foo: bar pair into the hash literal itself. The specialised version simply adds a store step directly from bar to the hash literal. I’ve added hooks so the specialisation happens in the model itself, not as a special case in the type-tracking library.

You may want to remove this from the description now.
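For reference, the Ruby pattern under discussion looks roughly like the sketch below. The `Handler` class is hypothetical, and the comments assume (based on the quoted text) that the analysis models a hash literal through the Hash.[] summary.

```ruby
# Hypothetical class for illustration only.
class Handler
  def run
    :ran
  end
end

bar = Handler.new

# The literal stores `bar` under the key :foo. With the specialised handling
# described above, this would be a single store step from `bar` into the literal.
h = { foo: bar }

# At run time the explicit Hash.[] form builds an equal hash.
same = Hash[foo: bar]

# A later read of :foo should be able to resolve `run` to Handler#run.
result = h[:foo].run
```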

hvitved

Overall this LGTM. Some comments.

hvitved · 2022-10-05T11:21:41Z

ruby/ql/lib/codeql/ruby/dataflow/internal/DataFlowPrivate.qll

@@ -381,7 +381,7 @@ private module Cached {
    n instanceof SynthReturnNode
    or
    // Needed for stores in type tracking
-    TypeTrackerSpecific::postUpdateStoreStep(_, n, _)
+    TypeTrackerSpecific::storeStepIntoSourceNode(_, n, _)


I believe we should remove

not n instanceof SynthHashSplatParameterNode

above as well.

hvitved · 2022-10-05T11:22:13Z

ruby/ql/lib/codeql/ruby/frameworks/core/Hash.qll

@@ -1,6 +1,7 @@
 /** Provides flow summaries for the `Hash` class. */

 private import codeql.ruby.AST
+private import codeql.ruby.CFG as Cfg


Remove this import.

hvitved · 2022-10-05T11:32:47Z

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll

+ * of the standard library, and their implementations may not depend on API graphs.
+ * For query-specific steps, consider including the custom steps in the type-tracking predicate itself.
+ */
+class TypeTrackingStep extends TUnit {


I am not convinced that we should add this, especially as it is not used presently.

Fine, let's remove it for now

hvitved · 2022-10-05T11:43:19Z

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll

-    nodeFrom = evaluateSummaryComponentLocal(call, input) and
-    nodeTo = evaluateSummaryComponentLocal(call, output)
+    nodeFrom = evaluateSummaryComponentStackLocal(callable, call, input) and
+    nodeTo = evaluateSummaryComponentStackLocal(callable, call, output)


For a store step induced from a flow summary

input = Argument[0] and output = Argument[1].Element[?]

and a call to the relevant method

foo(x, y)

doesn't this yield a store-step x -store-> y, while it should have been x -store-> [post] y?

The same goes for load-store steps.

Type tracking isn't supposed to be flow sensitive. The intent is for the store to ultimately target the local sources of y and flow from there to its uses.

There's some pre-existing code here about post-updates, which I don't really agree with, but for this PR I'm trying not to change too many things at once.
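A small Ruby sketch of the situation in this thread may help. The method `foo` and the class `Item` are hypothetical; `foo` is written so that it matches the summary shape quoted above (Argument[0] is stored as an element of Argument[1]). The comments describe the flow-insensitive reading from the reply as I understand it, not a guaranteed property of the implementation.

```ruby
# Hypothetical method: stores its first argument as an element of its second.
def foo(x, y)
  y << x
end

# Hypothetical class for illustration only.
class Item
  def name
    "item"
  end
end

y = []       # the local source of `y`
early = y    # another use of the same local source, before the call
x = Item.new

foo(x, y)

# Under a flow-insensitive view, the store is attached to the local source `[]`,
# so the element is visible from every use of that source, not only from uses
# that come after the call in program order.
late_name = y.first.name
early_name = early.first.name
```

A step that targeted only the post-update node of `y` would describe the state after the call, whereas targeting the local source lets the stored element reach `early` as well.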

hvitved · 2022-10-05T11:45:57Z

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll

+}
+
+pragma[nomagic]
+private predicate hasWithoutContentSummary(


Does this work with compound WithoutContents (see #10691)?

If at most one of those WithoutContent has an associated ContentFilter then yes, because the rest are ignored and treated as ordinary flow (in evaluateSummaryComponentStackLocal).

hvitved · 2022-10-05T11:51:06Z

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll

+  )
+}
+
+/**


Copy-paste error (mentions WithoutContent).

hvitved · 2022-10-05T11:52:27Z

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll

+}
+
+pragma[nomagic]
+private predicate hasWithContentSummary(


Same question as for hasWithoutContentSummary.

No, doesn't handle compound WithContent.

hvitved · 2022-10-05T11:53:30Z

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll

+    filter = getFilterFromWithoutContentStep(content) and
+    not isNonLocal(input.head()) and
+    not isNonLocal(output.head()) and
+    input != output


This restriction is super important, as otherwise use-use flow will ignore it (that's what prohibitsUseUseFlow is there for in data flow). Perhaps highlight that with a comment.

Type-tracking doesn't depend on prohibitsUseUseFlow. It's simply here to clean up some steps that would have no effect for type-tracking.

asgerf · 2022-10-05T17:13:12Z

It seems the large number of extra call edges is a result of merging with @aibaars's PR #10559. The added summaries can now be used by type-tracking to great effect. 💪

github-actions bot added the Ruby label Sep 30, 2022

github-advanced-security bot found potential problems Sep 30, 2022

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll (fixed)

asgerf force-pushed the rb/summarize-more branch 3 times, most recently from 3c799e7 to d1d2fae Compare October 3, 2022 10:40

github-advanced-security bot found potential problems Oct 3, 2022

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll (fixed)

ruby/ql/lib/codeql/ruby/typetracking/TypeTrackerSpecific.qll (fixed)

asgerf force-pushed the rb/summarize-more branch 3 times, most recently from cc09b73 to 31378ec Compare October 3, 2022 12:25

github-actions bot added the Python label Oct 3, 2022

asgerf added the no-change-note-required This PR does not need a change note label Oct 3, 2022

asgerf marked this pull request as ready for review October 3, 2022 14:25

asgerf requested review from a team as code owners October 3, 2022 14:25

asgerf added 16 commits October 4, 2022 11:06

Ruby: strip trailing whitespace in calls.rb test

ab672de

Ruby: add some calls to .each in call graph test

1c484d8

Ruby: Summarize level steps in type tracking

a4d4e40

Ruby: Summarize load-store steps in type-tracking

0000a7d

fixup to LoadStore

Ruby: Evaluate longer summary component stacks

fbab0f5

Ruby: make Array.each a simple summary

5b2d8b0

Ruby: update test

f75f27d

Ruby: update benign test updates

c06743a

Ruby: use getACallSimple in more Array methods

74c3886

Ruby: use getACallSimple in more Hash methods

8b389fe

Ruby: go to local source in load-store steps

8c43ab6

Ruby: Improve join order when generating edges

a7d764d

Ruby: Speed up evaluateSummaryComponentStackLocal

323abf4

Ruby: support WithoutContent steps in restricted cases

bd11946

fixup ContentFilter fixup basicWith(out)contentstep

Ruby: Hack special-casing of hash literals

9302271

Ruby: add type-tracking variant of hash-flow test

00e52ad

Ruby: fixup type-tracking hash flow test Fixup! type-tracking hash flow test result

asgerf added 6 commits October 4, 2022 11:14

Ruby: improve join order in trackInstanceRec

96711b2

Ruby: add hook for adding type-tracking steps

94d41b9

fixup docs fixup docs fixup TypeTrackingStep

Ruby: move special treatment of Hash.[] into Hash.qll

3ccc3a2

Ruby: do not treat WithoutElement[0..!] as a type filter

b6231e8

Python: sync

28f4dff

Ruby: share type-tracking test with array test

9485940

asgerf force-pushed the rb/summarize-more branch from 4a20227 to 9485940 Compare October 4, 2022 09:15

hvitved reviewed Oct 4, 2022

asgerf added 5 commits October 5, 2022 09:32

Ruby: make getAStaticHashCall private again

4c19d2d

Ruby: nomagic on unary hasAdjacentTypeCheckedReads

a9a99c5

Ruby: make flowsToLoadStoreStep private

f5f351e

Ruby: fix content restriction in type trackers

93e8434

Merge branch 'main' into rb/summarize-more

8b7ec20

asgerf added 4 commits October 5, 2022 09:55

Merge branch 'main' into rb/summarize-more

6f74a52

Ruby: remove mention of PairValueContent

7cf969f

Ruby: ensure Hash flow works again

f664a77

Python: sync

ab6e488

hvitved reviewed Oct 5, 2022

Ruby: address review comments

c9c3698

hvitved previously approved these changes Oct 5, 2022

Ruby: update type tracking test

decd4c9

asgerf dismissed hvitved’s stale review via decd4c9 October 5, 2022 13:15

hvitved approved these changes Oct 5, 2022

asgerf merged commit 387e575 into github:main Oct 5, 2022
