Ruby: type-tracking and API edges through simple library callables #10375

Merged: 41 commits, Sep 30, 2022

Commits (41)
53ef054 Ruby: Add getACallSimple and use it for arrays and hashes (asgerf, Aug 30, 2022)
f1b99e8 Ruby: use IPA type for type tracker contents (asgerf, Aug 30, 2022)
cd9cddf Ruby: generate type-tracking steps from simple summary specs (asgerf, Aug 30, 2022)
e104b65 Python: sync TypeTracker.qll and adapt accordingly (asgerf, Aug 30, 2022)
d5e2b93 Ruby: add API graph label for content (asgerf, Aug 30, 2022)
a51a540 Ruby: add content edges to API graph (asgerf, Aug 30, 2022)
a64f7cd Ruby: simplify getSetterCallAttributeName (asgerf, Sep 1, 2022)
ac1b7eb Remove SetterMethodCall in MkAttribute (asgerf, Sep 1, 2022)
497258e Ruby: reuse Content type (asgerf, Sep 1, 2022)
3498a04 Ruby: associate ContentSets with store/load edges in type tracker (asgerf, Sep 1, 2022)
b13b2ce Ruby: fix join order when building append relation (asgerf, Sep 1, 2022)
cbf1657 Ruby: tweak pipeline a bit (asgerf, Sep 5, 2022)
576e320 Python: sync (asgerf, Sep 12, 2022)
7737e75 Update some QLDoc comments (asgerf, Sep 8, 2022)
e47deaf Ruby: More QLDoc police (asgerf, Sep 13, 2022)
a5ed3d7 Ruby: expand test case to reveal mismatching forward/backward flow (asgerf, Sep 15, 2022)
85d0c63 Ruby: store a ContentSet on type tracker instances (asgerf, Sep 15, 2022)
6abf77d Factor comparison into compatibleContents (asgerf, Sep 15, 2022)
dd23e12 Rename TypeTrackerContentSet -> TypeTrackerContent (asgerf, Sep 15, 2022)
9c93ad9 Python: sync (asgerf, Sep 15, 2022)
7dfa58b Remove Content::NoContent (asgerf, Sep 15, 2022)
a7b9229 Ruby: fix a typo (asgerf, Sep 22, 2022)
588b31d Ruby: fix another typo (asgerf, Sep 22, 2022)
e09a5e8 Ruby: clarify what getAnElement() does (asgerf, Sep 22, 2022)
032847f Ruby: inline getContents (asgerf, Sep 22, 2022)
665ee81 Ruby: revert trackUseNode to idiomatic type-tracking (asgerf, Sep 4, 2022)
ce3665d Ruby: remove unneeded qualified AST import (asgerf, Sep 22, 2022)
14e384a Ruby: remove unneeded import (asgerf, Sep 22, 2022)
e1dfed0 Ruby: move OptionalContentSet to TypeTrackerSpecific.qll (asgerf, Sep 22, 2022)
e56630a Ruby: add missing qldoc (asgerf, Sep 26, 2022)
ee7dea1 Merge branch 'main' into rb/summarize-loads-v2 (asgerf, Sep 28, 2022)
ce1c258 Ruby: Update TypeTracker.expected (asgerf, Sep 28, 2022)
9716572 Ruby: update API graph inline test to match output (asgerf, Sep 28, 2022)
fea47c8 Ruby: expand on type-tracking test a bit (asgerf, Sep 28, 2022)
8704cce Ruby: mention TNoContentSet is only used by type-tracking (asgerf, Sep 28, 2022)
76cab23 Ruby: reuse argumentPositionMatch (asgerf, Sep 28, 2022)
3af3772 Ruby: Include `With(out)Element` in `isElementBody` (hvitved, Sep 28, 2022)
dc03557 Merge branch 'main' into rb/summarize-loads-v2 (asgerf, Sep 29, 2022)
f1de5a2 Ruby: Restrict summaries and type trackers to relevant contents (asgerf, Sep 28, 2022)
ae60b0a Ruby: ensure pruning works with startInContent (asgerf, Sep 29, 2022)
ed36f19 Python: sync TypeTracker.qll (asgerf, Sep 29, 2022)

5 changes: 3 additions & 2 deletions python/ql/lib/semmle/python/dataflow/new/TypeTracker.qll
@@ -5,12 +5,13 @@

 private import python
 private import internal.TypeTracker as Internal
+private import internal.TypeTrackerSpecific as InternalSpecific

 /** A string that may appear as the name of an attribute or access path. */
-class AttributeName = Internal::ContentName;
+class AttributeName = InternalSpecific::TypeTrackerContent;

 /** An attribute name, or the empty string (representing no attribute). */
-class OptionalAttributeName = Internal::OptionalContentName;
+class OptionalAttributeName = InternalSpecific::OptionalTypeTrackerContent;

 /**
  * The summary of the steps needed to track a value to a given dataflow node.
153 changes: 96 additions & 57 deletions python/ql/lib/semmle/python/dataflow/new/internal/TypeTracker.qll
@@ -2,27 +2,6 @@

 private import TypeTrackerSpecific

-/**
- * A string that may appear as the name of a piece of content. This will usually include things like:
- * - Attribute names (in Python)
- * - Property names (in JavaScript)
- *
- * In general, this can also be used to model things like stores to specific list indices. To ensure
- * correctness, it is important that
- *
- * - different types of content do not have overlapping names, and
- * - the empty string `""` is not a valid piece of content, as it is used to indicate the absence of
- *   content instead.
- */
-class ContentName extends string {
-  ContentName() { this = getPossibleContentName() }
-}
-
-/** A content name, or the empty string (representing no content). */
-class OptionalContentName extends string {
-  OptionalContentName() { this instanceof ContentName or this = "" }
-}
-
 cached
 private module Cached {
   /**
@@ -33,48 +12,78 @@ private module Cached {
     LevelStep() or
     CallStep() or
     ReturnStep() or
-    StoreStep(ContentName content) or
-    LoadStep(ContentName content) or
+    StoreStep(TypeTrackerContent content) { basicStoreStep(_, _, content) } or
+    LoadStep(TypeTrackerContent content) { basicLoadStep(_, _, content) } or
     JumpStep()

+  pragma[nomagic]
+  private TypeTracker noContentTypeTracker(boolean hasCall) {
+    result = MkTypeTracker(hasCall, noContent())
+  }
+
   /** Gets the summary resulting from appending `step` to type-tracking summary `tt`. */
   cached
   TypeTracker append(TypeTracker tt, StepSummary step) {
-    exists(Boolean hasCall, OptionalContentName content | tt = MkTypeTracker(hasCall, content) |
+    exists(Boolean hasCall, OptionalTypeTrackerContent currentContents |
+      tt = MkTypeTracker(hasCall, currentContents)
+    |
       step = LevelStep() and result = tt
       or
-      step = CallStep() and result = MkTypeTracker(true, content)
+      step = CallStep() and result = MkTypeTracker(true, currentContents)
       or
       step = ReturnStep() and hasCall = false and result = tt
       or
-      step = LoadStep(content) and result = MkTypeTracker(hasCall, "")
-      or
-      exists(string p | step = StoreStep(p) and content = "" and result = MkTypeTracker(hasCall, p))
-      or
       step = JumpStep() and
-      result = MkTypeTracker(false, content)
+      result = MkTypeTracker(false, currentContents)
     )
+    or
+    exists(TypeTrackerContent storeContents, boolean hasCall |
+      exists(TypeTrackerContent loadContents |
+        step = LoadStep(pragma[only_bind_into](loadContents)) and
+        tt = MkTypeTracker(hasCall, storeContents) and
+        compatibleContents(storeContents, loadContents) and
+        result = noContentTypeTracker(hasCall)
+      )
+      or
+      step = StoreStep(pragma[only_bind_into](storeContents)) and
+      tt = noContentTypeTracker(hasCall) and
+      result = MkTypeTracker(hasCall, storeContents)
+    )
   }

+  pragma[nomagic]
+  private TypeBackTracker noContentTypeBackTracker(boolean hasReturn) {
+    result = MkTypeBackTracker(hasReturn, noContent())
+  }
+
   /** Gets the summary resulting from prepending `step` to this type-tracking summary. */
   cached
   TypeBackTracker prepend(TypeBackTracker tbt, StepSummary step) {
-    exists(Boolean hasReturn, string content | tbt = MkTypeBackTracker(hasReturn, content) |
+    exists(Boolean hasReturn, OptionalTypeTrackerContent content |
+      tbt = MkTypeBackTracker(hasReturn, content)
+    |
       step = LevelStep() and result = tbt
       or
       step = CallStep() and hasReturn = false and result = tbt
       or
       step = ReturnStep() and result = MkTypeBackTracker(true, content)
       or
-      exists(string p |
-        step = LoadStep(p) and content = "" and result = MkTypeBackTracker(hasReturn, p)
-      )
-      or
-      step = StoreStep(content) and result = MkTypeBackTracker(hasReturn, "")
-      or
       step = JumpStep() and
       result = MkTypeBackTracker(false, content)
     )
+    or
+    exists(TypeTrackerContent loadContents, boolean hasReturn |
+      exists(TypeTrackerContent storeContents |
+        step = StoreStep(pragma[only_bind_into](storeContents)) and
+        tbt = MkTypeBackTracker(hasReturn, loadContents) and
+        compatibleContents(storeContents, loadContents) and
+        result = noContentTypeBackTracker(hasReturn)
+      )
+      or
+      step = LoadStep(pragma[only_bind_into](loadContents)) and
+      tbt = noContentTypeBackTracker(hasReturn) and
+      result = MkTypeBackTracker(hasReturn, loadContents)
+    )
   }

   /**
@@ -114,9 +123,9 @@ class StepSummary extends TStepSummary {
     or
     this instanceof ReturnStep and result = "return"
     or
-    exists(string content | this = StoreStep(content) | result = "store " + content)
+    exists(TypeTrackerContent content | this = StoreStep(content) | result = "store " + content)
     or
-    exists(string content | this = LoadStep(content) | result = "load " + content)
+    exists(TypeTrackerContent content | this = LoadStep(content) | result = "load " + content)
     or
     this instanceof JumpStep and result = "jump"
   }
@@ -130,7 +139,7 @@ private predicate smallstepNoCall(Node nodeFrom, TypeTrackingNode nodeTo, StepSu
   levelStep(nodeFrom, nodeTo) and
   summary = LevelStep()
   or
-  exists(string content |
+  exists(TypeTrackerContent content |
     StepSummary::localSourceStoreStep(nodeFrom, nodeTo, content) and
     summary = StoreStep(content)
     or
@@ -180,7 +189,7 @@ module StepSummary {
   }

   /**
-   * Holds if `nodeFrom` is being written to the `content` content of the object in `nodeTo`.
+   * Holds if `nodeFrom` is being written to the `content` of the object in `nodeTo`.
    *
    * Note that `nodeTo` will always be a local source node that flows to the place where the content
    * is written in `basicStoreStep`. This may lead to the flow of information going "back in time"
@@ -204,12 +213,23 @@ module StepSummary {
    * function. This means we will track the fact that `x.attr` can have the type of `y` into the
    * assignment to `z` inside `bar`, even though this attribute write happens _after_ `bar` is called.
    */
-  predicate localSourceStoreStep(Node nodeFrom, TypeTrackingNode nodeTo, string content) {
+  predicate localSourceStoreStep(Node nodeFrom, TypeTrackingNode nodeTo, TypeTrackerContent content) {
     exists(Node obj | nodeTo.flowsTo(obj) and basicStoreStep(nodeFrom, obj, content))
   }
 }

-private newtype TTypeTracker = MkTypeTracker(Boolean hasCall, OptionalContentName content)
+private newtype TTypeTracker =
+  MkTypeTracker(Boolean hasCall, OptionalTypeTrackerContent content) {
+    content = noContent()
+    or
+    // Restrict `content` to those that might eventually match a load.
+    // We can't rely on `basicStoreStep` since `startInContent` might be used with
+    // a content that has no corresponding store.
+    exists(TypeTrackerContent loadContents |
+      basicLoadStep(_, _, loadContents) and
+      compatibleContents(content, loadContents)
+    )
+  }

 /**
  * A summary of the steps needed to track a value to a given dataflow node.
@@ -240,7 +260,7 @@ private newtype TTypeTracker = MkTypeTracker(Boolean hasCall, OptionalContentNam
  */
 class TypeTracker extends TTypeTracker {
   Boolean hasCall;
-  OptionalContentName content;
+  OptionalTypeTrackerContent content;

   TypeTracker() { this = MkTypeTracker(hasCall, content) }

@@ -251,32 +271,38 @@ class TypeTracker extends TTypeTracker {
   string toString() {
     exists(string withCall, string withContent |
       (if hasCall = true then withCall = "with" else withCall = "without") and
-      (if content != "" then withContent = " with content " + content else withContent = "") and
+      (
+        if content != noContent()
+        then withContent = " with content " + content
+        else withContent = ""
+      ) and
       result = "type tracker " + withCall + " call steps" + withContent
     )
   }

   /**
    * Holds if this is the starting point of type tracking.
    */
-  predicate start() { hasCall = false and content = "" }
+  predicate start() { hasCall = false and content = noContent() }

   /**
    * Holds if this is the starting point of type tracking, and the value starts in the content named `contentName`.
    * The type tracking only ends after the content has been loaded.
    */
-  predicate startInContent(ContentName contentName) { hasCall = false and content = contentName }
+  predicate startInContent(TypeTrackerContent contentName) {
+    hasCall = false and content = contentName
+  }

   /**
    * Holds if this is the starting point of type tracking
    * when tracking a parameter into a call, but not out of it.
    */
-  predicate call() { hasCall = true and content = "" }
+  predicate call() { hasCall = true and content = noContent() }

   /**
    * Holds if this is the end point of type tracking.
    */
-  predicate end() { content = "" }
+  predicate end() { content = noContent() }

   /**
    * INTERNAL. DO NOT USE.
@@ -290,15 +316,15 @@ class TypeTracker extends TTypeTracker {
    *
    * Gets the content associated with this type tracker.
    */
-  string getContent() { result = content }
+  OptionalTypeTrackerContent getContent() { result = content }

   /**
    * Gets a type tracker that starts where this one has left off to allow continued
    * tracking.
    *
    * This predicate is only defined if the type is not associated to a piece of content.
    */
-  TypeTracker continue() { content = "" and result = this }
+  TypeTracker continue() { content = noContent() and result = this }

   /**
    * Gets the summary that corresponds to having taken a forwards
@@ -356,7 +382,16 @@ module TypeTracker {
   TypeTracker end() { result.end() }
 }

-private newtype TTypeBackTracker = MkTypeBackTracker(Boolean hasReturn, OptionalContentName content)
+private newtype TTypeBackTracker =
+  MkTypeBackTracker(Boolean hasReturn, OptionalTypeTrackerContent content) {
+    content = noContent()
+    or
+    // As in MkTypeTracker, restrict `content` to those that might eventually match a store.
+    exists(TypeTrackerContent storeContent |
+      basicStoreStep(_, _, storeContent) and
+      compatibleContents(storeContent, content)
+    )
+  }

 /**
  * A summary of the steps needed to back-track a use of a value to a given dataflow node.
@@ -390,7 +425,7 @@ private newtype TTypeBackTracker = MkTypeBackTracker(Boolean hasReturn, Optional
  */
 class TypeBackTracker extends TTypeBackTracker {
   Boolean hasReturn;
-  string content;
+  OptionalTypeTrackerContent content;

   TypeBackTracker() { this = MkTypeBackTracker(hasReturn, content) }

@@ -401,20 +436,24 @@ class TypeBackTracker extends TTypeBackTracker {
   string toString() {
     exists(string withReturn, string withContent |
       (if hasReturn = true then withReturn = "with" else withReturn = "without") and
-      (if content != "" then withContent = " with content " + content else withContent = "") and
+      (
+        if content != noContent()
+        then withContent = " with content " + content
+        else withContent = ""
+      ) and
       result = "type back-tracker " + withReturn + " return steps" + withContent
     )
   }

   /**
    * Holds if this is the starting point of type tracking.
    */
-  predicate start() { hasReturn = false and content = "" }
+  predicate start() { hasReturn = false and content = noContent() }

   /**
    * Holds if this is the end point of type tracking.
    */
-  predicate end() { content = "" }
+  predicate end() { content = noContent() }

   /**
    * INTERNAL. DO NOT USE.
@@ -429,7 +468,7 @@ class TypeBackTracker extends TTypeBackTracker {
    *
    * This predicate is only defined if the type has not been tracked into a piece of content.
    */
-  TypeBackTracker continue() { content = "" and result = this }
+  TypeBackTracker continue() { content = noContent() and result = this }

   /**
    * Gets the summary that corresponds to having taken a backwards
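
The restriction added to `MkTypeTracker` in this file keeps only those tracker states whose content could still be matched by some load. The effect can be pictured with a small Python model. This is an illustrative sketch only, not the QL implementation: `tracker_states`, `candidate_contents`, and `load_contents` are hypothetical names standing in for the newtype branch, the set of possible content names, and the contents seen by `basicLoadStep`. It assumes contents are plain strings compared by equality, as in the Python library.

```python
from itertools import product

NO_CONTENT = ""  # assumption: mirrors noContent(), where contents are strings


def compatible_contents(store_content, load_content):
    # Assumption: compatibility is plain equality on content names.
    return store_content == load_content


def tracker_states(candidate_contents, load_contents):
    """Return the (has_call, content) pairs the restricted newtype would keep.

    A state survives if it carries no content, or if its content is
    compatible with the content of at least one load step.
    """
    states = set()
    for has_call, content in product((False, True), [NO_CONTENT, *candidate_contents]):
        matches_a_load = any(
            compatible_contents(content, load) for load in load_contents
        )
        if content == NO_CONTENT or matches_a_load:
            states.add((has_call, content))
    return states


# "unused" is never loaded, so no tracker state is materialised for it.
states = tracker_states(candidate_contents=["attr", "unused"], load_contents=["attr"])
print(sorted(states))
```

Because the check is against loads rather than stores, a content passed to `startInContent` still gets a state even when nothing stores to it, which is the case the QL comment calls out.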
@@ -11,6 +11,28 @@ class Node = DataFlowPublic::Node;

 class TypeTrackingNode = DataFlowPublic::TypeTrackingNode;

+/** A content name for use by type trackers, or the empty string. */
+class OptionalTypeTrackerContent extends string {
+  OptionalTypeTrackerContent() {
+    this = ""
+    or
+    this = getPossibleContentName()
+  }
+}
+
+/** A content name for use by type trackers. */
+class TypeTrackerContent extends OptionalTypeTrackerContent {
+  TypeTrackerContent() { this != "" }
+}
+
+/** Gets the content string representing no value. */
+OptionalTypeTrackerContent noContent() { result = "" }
+
+pragma[inline]
+predicate compatibleContents(TypeTrackerContent storeContent, TypeTrackerContent loadContent) {
+  storeContent = loadContent
+}
+
 predicate simpleLocalFlowStep = DataFlowPrivate::simpleLocalFlowStepForTypetracking/2;

 predicate jumpStep = DataFlowPrivate::jumpStepSharedWithTypeTracker/2;
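
With `compatibleContents` defined as equality for Python, the rewritten `append` relation behaves like a small state machine over `(hasCall, content)` pairs. The following Python sketch models those transitions under stated assumptions: `Tracker`, the tuple encoding of steps, and returning `None` for "no result" are all hypothetical conventions chosen for illustration, and the newtype pruning of unreachable states is ignored. It is not the QL implementation.

```python
from typing import NamedTuple, Optional

NO_CONTENT = ""  # assumption: mirrors noContent() for string-based contents


class Tracker(NamedTuple):
    has_call: bool
    content: str


def compatible_contents(store_content: str, load_content: str) -> bool:
    # Assumption: equality, as in the Python TypeTrackerSpecific.qll.
    return store_content == load_content


def append(tt: Tracker, step: tuple) -> Optional[Tracker]:
    """Model of appending one step to a forward type tracker.

    `step` is ("level",), ("call",), ("return",), ("jump",),
    ("store", name) or ("load", name). None means the relation has no result.
    """
    kind = step[0]
    if kind == "level":
        return tt
    if kind == "call":
        return Tracker(True, tt.content)
    if kind == "return":
        return tt if not tt.has_call else None
    if kind == "jump":
        return Tracker(False, tt.content)
    if kind == "store":
        # A store is only possible when the tracker carries no content yet.
        return Tracker(tt.has_call, step[1]) if tt.content == NO_CONTENT else None
    if kind == "load":
        # A load needs stored content that is compatible with the loaded one.
        if tt.content != NO_CONTENT and compatible_contents(tt.content, step[1]):
            return Tracker(tt.has_call, NO_CONTENT)
        return None
    raise ValueError(f"unknown step kind: {kind}")


start = Tracker(False, NO_CONTENT)
stored = append(start, ("store", "attr"))
print(stored, append(stored, ("load", "attr")), append(stored, ("load", "other")))
```

Routing the comparison through `compatible_contents` rather than inlining `==` reflects the design choice in the diff: a language whose contents are not plain strings can supply a looser compatibility check without touching the shared transition logic.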