Skip to content

DataFlow: Add language-specific predicate for ignoring steps in flow-through calculation #14799

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Dec 6, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -20,4 +20,6 @@ module CppDataFlow implements InputSig {
Node exprNode(DataFlowExpr e) { result = Public::exprNode(e) }

predicate getAdditionalFlowIntoCallNodeTerm = Private::getAdditionalFlowIntoCallNodeTerm/2;

predicate validParameterAliasStep = Private::validParameterAliasStep/2;
}
Original file line number Diff line number Diff line change
Expand Up @@ -1149,3 +1149,55 @@ private int countNumberOfBranchesUsingParameter(SwitchInstruction switch, Parame
)
)
}

/**
* Holds if the data-flow step from `node1` to `node2` can be used to
* determine where side-effects may return from a callable.
* For C/C++, this means that the step from `node1` to `node2` not only
* preserves the value, but also preserves the identity of the value.
* For example, the assignment to `x` that reads the value of `*p` in
* ```cpp
* int* p = ...
* int x = *p;
* ```
* does not preserve the identity of `*p`.
*/
bindingset[node1, node2]
pragma[inline_late]
predicate validParameterAliasStep(Node node1, Node node2) {
// When flow-through summaries are computed we track which parameters flow to out-going parameters.
// In an example such as:
// ```
// modify(int* px) { *px = source(); }
// void modify_copy(int* p) {
// int x = *p;
// modify(&x);
// }
// ```
// since dataflow tracks each indirection as a separate SSA variable dataflow
// sees the above roughly as
// ```
// modify(int* px, int deref_px) { deref_px = source(); }
// void modify_copy(int* p, int deref_p) {
// int x = deref_p;
// modify(&x, x);
// }
// ```
// and when dataflow computes flow from a parameter to a post-update node to
// conclude which parameters are "updated" by the call to `modify_copy` it
// finds flow from `x [post update]` to `deref_p [post update]`.
// To prevent this we exclude steps that don't preserve identity. We do this
// by excluding flow from the right-hand side of `StoreInstruction`s to the
// `StoreInstruction`. This is sufficient because, for flow-through summaries,
// we're only interested in indirect parameters such as `deref_p` in the
// exampe above (i.e., the parameters with a non-zero indirection index), and
// if that ever flows to the right-hand side of a `StoreInstruction` then
// there must have been a dereference to reduce its indirection index down to
// 0.
not exists(Operand operand |
node1.asOperand() = operand and
node2.asInstruction().(StoreInstruction).getSourceValueOperand() = operand
)
// TODO: Also block flow through models that don't preserve identity such
// as `strdup`.
}
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ uniquePostUpdate
postIsInSameCallable
reverseRead
argHasPostUpdate
| flowOut.cpp:55:14:55:16 | * ... | ArgumentNode is missing PostUpdateNode. |
| lambdas.cpp:18:7:18:7 | a | ArgumentNode is missing PostUpdateNode. |
| lambdas.cpp:25:2:25:2 | b | ArgumentNode is missing PostUpdateNode. |
| lambdas.cpp:32:2:32:2 | c | ArgumentNode is missing PostUpdateNode. |
Expand Down Expand Up @@ -51,7 +52,18 @@ postWithInFlow
| example.c:28:23:28:25 | pos [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:5:5:5:12 | * ... [post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:5:6:5:12 | toTaint [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:8:5:8:12 | * ... [post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:8:6:8:12 | toTaint [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:18:17:18:17 | x [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:30:12:30:12 | x [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:37:5:37:6 | p2 [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:37:5:37:9 | access to array [post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:84:3:84:7 | call to deref [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:84:3:84:14 | access to array [post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:84:10:84:10 | p [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:90:3:90:4 | * ... [post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:90:4:90:4 | q [inner post update] | PostUpdateNode should not be the target of local flow. |
| flowOut.cpp:101:14:101:14 | p [inner post update] | PostUpdateNode should not be the target of local flow. |
| globals.cpp:13:5:13:19 | flowTestGlobal1 [post update] | PostUpdateNode should not be the target of local flow. |
| globals.cpp:23:5:23:19 | flowTestGlobal2 [post update] | PostUpdateNode should not be the target of local flow. |
| lambdas.cpp:23:3:23:14 | v [post update] | PostUpdateNode should not be the target of local flow. |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,13 +12,15 @@ compatibleTypesReflexive
unreachableNodeCCtx
localCallNodes
postIsNotPre
| flowOut.cpp:84:3:84:14 | access to array indirection | PostUpdateNode should not equal its pre-update node. |
postHasUniquePre
uniquePostUpdate
| example.c:24:13:24:18 | coords indirection | Node has multiple PostUpdateNodes. |
postIsInSameCallable
reverseRead
argHasPostUpdate
postWithInFlow
| flowOut.cpp:84:3:84:14 | access to array indirection | PostUpdateNode should not be the target of local flow. |
| test.cpp:384:10:384:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:391:10:391:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:400:10:400:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
Expand Down
99 changes: 91 additions & 8 deletions cpp/ql/test/library-tests/dataflow/dataflow-tests/flowOut.cpp
Original file line number Diff line number Diff line change
@@ -1,20 +1,103 @@
int source();
void sink(int);
int source(); char source(bool);
void sink(int); void sink(char);

void source_ref(int *toTaint) { // $ ir-def=*toTaint ast-def=toTaint
*toTaint = source();
}



void source_ref(char *toTaint) { // $ ir-def=*toTaint ast-def=toTaint
*toTaint = source();
}
void modify_copy(int* ptr) { // $ ast-def=ptr
int deref = *ptr;
int* other = &deref;
source_ref(other);
}

void test_output() {
void test_output_copy() {
int x = 0;
modify_copy(&x);
sink(x); // $ SPURIOUS: ir
}
sink(x); // clean
}

void modify(int* ptr) { // $ ast-def=ptr
int* deref = ptr;
int* other = &*deref;
source_ref(other);
}

void test_output() {
int x = 0;
modify(&x);
sink(x); // $ ir MISSING: ast
}

void modify_copy_of_pointer(int* p, unsigned len) { // $ ast-def=p
int* p2 = new int[len];
for(unsigned i = 0; i < len; ++i) {
p2[i] = p[i];
}

source_ref(p2);
}

void test_modify_copy_of_pointer() {
int x[10];
modify_copy_of_pointer(x, 10);
sink(x[0]); // $ SPURIOUS: ast // clean
}

void modify_pointer(int* p, unsigned len) { // $ ast-def=p
int** p2 = &p;
for(unsigned i = 0; i < len; ++i) {
*p2[i] = p[i];
}

source_ref(*p2);
}

void test_modify_of_pointer() {
int x[10];
modify_pointer(x, 10);
sink(x[0]); // $ ast,ir
}

char* strdup(const char* p);

void modify_copy_via_strdup(char* p) { // $ ast-def=p
char* p2 = strdup(p);
source_ref(p2);
}

void test_modify_copy_via_strdup(char* p) { // $ ast-def=p
modify_copy_via_strdup(p);
sink(*p); // $ SPURIOUS: ir
}

int* deref(int** p) { // $ ast-def=p
int* q = *p;
return q;
}

void test1() {
int x = 0;
int* p = &x;
deref(&p)[0] = source();
sink(*p); // $ ir MISSING: ast
}


void addtaint1(int* q) { // $ ast-def=q ir-def=*q
*q = source();
}

void addtaint2(int** p) { // $ ast-def=p
int* q = *p;
addtaint1(q);
}

void test2() {
int x = 0;
int* p = &x;
addtaint2(&p);
sink(*p); // $ ir MISSING: ast
}
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
WARNING: Module DataFlow has been deprecated and may be removed in future (has-parameter-flow-out.ql:5,18-61)
failures
testFailures
failures
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,8 @@ astFlow
| dispatch.cpp:10:37:10:42 | call to source | dispatch.cpp:44:15:44:24 | call to notSource2 |
| dispatch.cpp:37:19:37:24 | call to source | dispatch.cpp:11:38:11:38 | x |
| dispatch.cpp:45:18:45:23 | call to source | dispatch.cpp:11:38:11:38 | x |
| flowOut.cpp:44:7:44:7 | x | flowOut.cpp:46:8:46:11 | access to array |
| flowOut.cpp:59:7:59:7 | x | flowOut.cpp:61:8:61:11 | access to array |
| globals.cpp:5:17:5:22 | call to source | globals.cpp:6:10:6:14 | local |
| lambdas.cpp:8:10:8:15 | call to source | lambdas.cpp:14:3:14:6 | t |
| lambdas.cpp:8:10:8:15 | call to source | lambdas.cpp:18:8:18:8 | call to operator() |
Expand Down Expand Up @@ -170,7 +172,11 @@ irFlow
| dispatch.cpp:107:17:107:22 | call to source | dispatch.cpp:96:8:96:8 | x |
| dispatch.cpp:140:8:140:13 | call to source | dispatch.cpp:96:8:96:8 | x |
| dispatch.cpp:144:8:144:13 | call to source | dispatch.cpp:96:8:96:8 | x |
| flowOut.cpp:5:16:5:21 | call to source | flowOut.cpp:19:9:19:9 | x |
| flowOut.cpp:5:16:5:21 | call to source | flowOut.cpp:31:9:31:9 | x |
| flowOut.cpp:5:16:5:21 | call to source | flowOut.cpp:61:8:61:11 | access to array |
| flowOut.cpp:8:16:8:23 | call to source | flowOut.cpp:73:8:73:9 | * ... |
| flowOut.cpp:84:18:84:23 | call to source | flowOut.cpp:85:8:85:9 | * ... |
| flowOut.cpp:90:8:90:13 | call to source | flowOut.cpp:102:8:102:9 | * ... |
| globals.cpp:5:17:5:22 | call to source | globals.cpp:6:10:6:14 | local |
| globals.cpp:13:23:13:28 | call to source | globals.cpp:12:10:12:24 | flowTestGlobal1 |
| globals.cpp:23:23:23:28 | call to source | globals.cpp:19:10:19:24 | flowTestGlobal2 |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
| flowOut.cpp:44:7:44:7 | x | flowOut.cpp:45:26:45:26 | x |
| flowOut.cpp:44:7:44:7 | x | flowOut.cpp:46:8:46:8 | x |
| flowOut.cpp:59:7:59:7 | x | flowOut.cpp:60:18:60:18 | x |
| flowOut.cpp:59:7:59:7 | x | flowOut.cpp:61:8:61:8 | x |
| ref.cpp:53:9:53:10 | x1 | ref.cpp:55:19:55:20 | x1 |
| ref.cpp:53:9:53:10 | x1 | ref.cpp:56:10:56:11 | x1 |
| ref.cpp:53:13:53:14 | x2 | ref.cpp:58:15:58:16 | x2 |
Expand Down
7 changes: 7 additions & 0 deletions shared/dataflow/codeql/dataflow/DataFlow.qll
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,13 @@ signature module InputSig {

predicate simpleLocalFlowStep(Node node1, Node node2);

/**
* Holds if the data-flow step from `node1` to `node2` can be used to
* determine where side-effects may return from a callable.
*/
bindingset[node1, node2]
default predicate validParameterAliasStep(Node node1, Node node2) { any() }

/**
* Holds if data can flow from `node1` to `node2` through a non-local step
* that does not follow a call edge. For example, a step through a global
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -551,7 +551,8 @@ module MakeImplCommon<InputSig Lang> {
// local flow
exists(Node mid |
parameterValueFlowCand(p, mid, read) and
simpleLocalFlowStep(mid, node)
simpleLocalFlowStep(mid, node) and
validParameterAliasStep(mid, node)
)
or
// read
Expand Down Expand Up @@ -670,7 +671,8 @@ module MakeImplCommon<InputSig Lang> {
// local flow
exists(Node mid |
parameterValueFlow(p, mid, read) and
simpleLocalFlowStep(mid, node)
simpleLocalFlowStep(mid, node) and
validParameterAliasStep(mid, node)
)
or
// read
Expand Down