Skip to content

JS: Refactor the XSS / Client-side-url queries #8304

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Mar 17, 2022

Conversation

erik-krogh
Copy link
Contributor

@erik-krogh erik-krogh commented Mar 1, 2022

This PR replaces #6632

See this comment by Asger for what this PR tries to fix.

This is a deep fix where flow-labels are used to hopefully correctly model how url sinks can cause XSS.

There are now 3 flow labels in the XSS query:

  • TaintedUrlSuffix: a URL where the attacker only controls a suffix.
  • Taint: a tainted value where the attacker controls part of the value.
  • PrefixLabel: a tainted value where the attacker controls the prefix

All tests pass on every commit (the expected files are updated on each commit).
This should help in doing a commit-by-commit review.


An evaluation looks good (I think).
I like the new alerts, and I don't miss the alerts that got removed.
(I didn't triage the xss-through-dom results).

(Sidenote: If you think a user-controlled src=".." attribute on an <img/> tag is safe, then checkout SVG parsing in browsers).

There is a small performance penalty, but I think we can live with that.

@github-actions github-actions bot added the JS label Mar 1, 2022
@erik-krogh erik-krogh marked this pull request as ready for review March 2, 2022 15:51
@erik-krogh erik-krogh requested a review from a team as a code owner March 2, 2022 15:52
@erik-krogh erik-krogh added the no-change-note-required This PR does not need a change note label Mar 2, 2022
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I'm very happy to see progress on this issue.

This result seems to be a FP due to not recognizing the call url.startsWith('https://3p.ampproject.net/') as a sanitizer guard.

The URL sinks could be further divided into cases where

  • tainted scheme leads to XSS (<a href=...>). Use the Prefix label.
  • tainted hostname leads to XSS (<script src=...>). Currently no flow label for this.

We could add another flow label for the hostname URL sinks, and use sanitizers from UrlConcatenation for that label. Would be interesting to see how much it affects performance.

abstract class Sink extends DataFlow::Node { }
abstract class Sink extends DataFlow::Node {
/** Holds if the sink can execute JavaScript code in the current context. */
predicate isXSSSink() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
predicate isXSSSink() {
predicate isXssSink() {

Acronyms should be PascalCase/camelCase

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ql-for-ql: four upper-case characters in a row are suspicious...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ql-for-ql: four upper-case characters in a row are suspicious...)

That rabbit hole was deep.

ClientSideRemoteFlowSource source() {
result.getKind().isFragment()
or
result.getKind().isQuery()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my impression that location.search and location.hash are the only sources that actually include the ? or # character. Sources from library models, like Angular, tend not to have this leading character in it.

We could make the ClientSideRemoteFlowSource model richer, or just use locationRef().getAPropertyRead(["search", "hash"]).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we need to include all sources of kind url.
In the QLDoc for ClientSideRemoteFlowKind::isUrl is states: the untrusted part of the URL is prefixed by trusted data.

/**
* A sanitizer that blocks the `PrefixString` label when the start of the string is being tested as being of a particular prefix.
*/
class PrefixStringSanitizer extends SanitizerGuard instanceof StringOps::StartsWith {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to put this sanitizer guard in the Query.qll file instead? There's this annoying problem where the SanitizerGuard#blocks override will affect other sanitizer guards based on StringOps::StartsWith, even if PrefixStringSanitizer wasn't explicitly mentioned in isSanitizerGuard.

Also, could this extend LabeledSanitizerGuardNode?

)
or
// we assume that `.join()` calls have a prefix, and thus block the prefix label.
node = any(DataFlow::MethodCallNode call | call.getMethodName() = "join") and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

join calls are also modelled as string concatenations. Is the above case not enough?

Ideally [taint, "constant"].join() would remain prefix-tainted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

join calls are also modelled as string concatenations. Is the above case not enough?

Yes join calls are modeled as string concats, but only when join is called with the empty string.
So e.g. [taint, "constant"].join("/") is not modeled as a string concatenation.

So no, the above case is not enough.

@erik-krogh
Copy link
Contributor Author

This result seems to be a FP due to not recognizing the call url.startsWith('https://3p.ampproject.net/') as a sanitizer guard.

It's actually from url being a global variable, and that causes us to not recognize the use within the if as sanitized.
The sanitizer works if you place the entire thing inside a function.

@erik-krogh
Copy link
Contributor Author

The URL sinks could be further divided into cases where

If possible I would prefer waiting with that.
Your suggested change requires changing HostnameSanitizerGuard so that it's somehow configurable which labels should be sanitized (possibly make the class abstract and instantiate the class in all the queries that use it).

@asgerf
Copy link
Contributor

asgerf commented Mar 15, 2022

It's actually from url being a global variable, and that causes us to not recognize the use within the if as sanitized.
The sanitizer works if you place the entire thing inside a function.

Ah, in that case I'm OK with going ahead with this.

Some tests are currently red, but otherwise LGTM.

@erik-krogh
Copy link
Contributor Author

The latest CI failure was due to a semantic merge conflict with #8323

I rebased on main and updated updated a name-reference to a non-deprecated name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
JS no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants