ATM: undo unsound performance optimizations #8470

esbena · 2022-03-16T21:22:16Z

An ordinary path-problem query requires at least one sink; otherwise there is nothing to alert on.
For such queries, pruning on `isSink(_, this, _)` in `Configuration::hasFlow` is therefore sound.

When ATM is used, the program likely has no relevant known sinks, so the `Configuration::hasFlow` calls never hold because of that pruning step.
The optimizations removed in this commit are thus unsound for programs without any relevant sinks.
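
To make the argument concrete, here is a toy TypeScript model of the pruning guard. It is only a sketch of the reasoning and not the CodeQL implementation: `ToyConfig`, `reaches`, `hasFlowPruned`, `hasFlowUnpruned`, and the node names are all hypothetical and exist purely for illustration.

```typescript
// Toy model only: none of these names come from the CodeQL libraries.
type FlowNode = string;

interface ToyConfig {
  sources: Set<FlowNode>;
  // Sinks the configuration already knows about (the role played by isSink).
  knownSinks: Set<FlowNode>;
  // Directed flow steps between nodes.
  edges: Record<FlowNode, FlowNode[]>;
}

// Plain graph reachability from any source to `target`.
function reaches(cfg: ToyConfig, target: FlowNode): boolean {
  const seen = new Set<FlowNode>();
  const work: FlowNode[] = Array.from(cfg.sources);
  while (work.length > 0) {
    const node = work.pop() as FlowNode;
    if (node === target) return true;
    if (seen.has(node)) continue;
    seen.add(node);
    for (const next of cfg.edges[node] || []) work.push(next);
  }
  return false;
}

// Pruned variant: skip all work when the configuration has no known sink.
// Sound when results are only ever reported at known sinks.
function hasFlowPruned(cfg: ToyConfig, target: FlowNode): boolean {
  if (cfg.knownSinks.size === 0) return false;
  return reaches(cfg, target);
}

// Unpruned variant: always compute reachability.
function hasFlowUnpruned(cfg: ToyConfig, target: FlowNode): boolean {
  return reaches(cfg, target);
}

// A program with a source and a flow step, but no known sinks.
const cfg: ToyConfig = {
  sources: new Set(["source"]),
  knownSinks: new Set<FlowNode>(),
  edges: { source: ["predictedSink"] },
};

// "predictedSink" stands in for an endpoint that a model would score as a sink.
const prunedResult = hasFlowPruned(cfg, "predictedSink");
const unprunedResult = hasFlowUnpruned(cfg, "predictedSink");
```

In this model the pruned variant reports no flow to the predicted sink because the guard fires before any reachability is computed, while the unpruned variant finds the flow.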

Relevant pings: @henrymercer @adityasharad

Hopefully performance works out without this 🤞🏻🤞🏻.

Impact

This means that it is now possible to predict sinks in programs without any known sinks(!!!).

Testing

Testing this is presumably tricky at the moment because it requires an mlmodel to be present, so I include my own manual test case for completeness and posterity.

Example database sources

// danger_sink_call.js
danger(sink)

// flow_to_predicted_sink.ts
// carefully selected file content that presently predicts `req.params.id` to be a sink
import { Request } from "express";

module.exports = function retrieveBasket() {
  return (req: Request) => {
    models.Basket.findOne({
      where: { id: req.params.id }
    });
  };
};

Example query that produces no results without this PR and some results with it.

import DataFlow::PathGraph
import experimental.adaptivethreatmodeling.TaintedPathATM
import experimental.adaptivethreatmodeling.EndpointFeatures
import experimental.adaptivethreatmodeling.EndpointScoring

from DataFlow::Node scored, EndpointType type, string basename, float scoreNum, string description
where
  description = type.getDescription() and
  ModelScoring::endpointScores(scored, type.getEncoding(), scoreNum) and
  scored.getFile().getBaseName() = basename and
  basename = ["danger_sink_call.js", "flow_to_predicted_sink.ts"]
select scored, basename, description, scoreNum

Confirming that `hasFlow(...)` works for the full ATM query without any known sinks

Run TaintedPathATM.ql.


adityasharad · 2022-03-16T22:20:56Z

I think this makes sense. The goal is to discover new sinks, so we cannot restrict the analysis to only consider known sinks.

esbena · 2022-05-23T12:09:54Z

Marking as draft due to many other in-flight ATM changes.

esbena added the "Awaiting evaluation" label (Do not merge yet, this PR is waiting for an evaluation to finish) Mar 16, 2022

esbena requested a review from a team March 16, 2022 21:22

github-actions bot added the JS label Mar 16, 2022

re-optimize without being unsound

e94e1bb

esbena marked this pull request as draft May 23, 2022 12:10
