Skip to content

Ruby: Add YAML unsafe deserialization sinks #13307

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jun 12, 2023
Merged
Show file tree
Hide file tree
Changes from 17 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* Additional sinks for `rb/unsafe-deserialization` have been added. This includes various methods from the `yaml` and `plist` gems, which deserialize YAML and Property List data, respectively.
1 change: 1 addition & 0 deletions ruby/ql/lib/codeql/ruby/Frameworks.qll
Original file line number Diff line number Diff line change
Expand Up @@ -33,3 +33,4 @@ private import codeql.ruby.frameworks.Sinatra
private import codeql.ruby.frameworks.Twirp
private import codeql.ruby.frameworks.Sqlite3
private import codeql.ruby.frameworks.Pg
private import codeql.ruby.frameworks.Yaml
35 changes: 35 additions & 0 deletions ruby/ql/lib/codeql/ruby/frameworks/Yaml.qll
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
/**
* Provides modeling for the `YAML` and `Psych` libraries.
*/

private import codeql.ruby.dataflow.FlowSteps
private import codeql.ruby.DataFlow
private import codeql.ruby.ApiGraphs

/**
* A taint step related to the result of `YAML.parse` calls, or similar.
* In the following example, this step will propagate taint from
* `source` to `sink`:
*
* ```rb
* x = source
* result = YAML.parse(x)
* sink result.to_ruby # Unsafe call
* ```
*/
private class YamlParseStep extends AdditionalTaintStep {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to #12301 (comment)
I didn't write following predicate correctly, I think it should be two separate steps:
1.

 x = source
sink = YAML.parse(x)
sink = source.to_ruby 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that - the second step should only apply when the return value of YAML.parse(x) flows to source, right? We could add to_ruby as a global taint-preserving step but I think that could result in false positives where some other class defines a to_ruby method. I've requested a review from others in the team; we'll see what they say.

Copy link
Contributor

@am0o0 am0o0 Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes second one can have FP,
please consider the following example too, if you like I can try to write an additional step for such a pattern too:
https://github.com/Shopify/krane/blob/0e1a3a0651ca4a0b234722ea98dae74ca8c65351/lib/krane/renderer.rb#L61-L67

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding a separate taint step for the to_rubymethod makes sense. However, let's leave that for a followup pull request in which we can investigate the impact a bit. I would expect to_ruby methods from other libraries to be mostly taint-preserving too, but we might be mistaken.

override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
exists(DataFlow::CallNode yamlParserMethod |
succ = yamlParserMethod.getAMethodCall("to_ruby") and
(
yamlParserMethod = yamlNode().getAMethodCall(["parse", "parse_stream"]) and
pred = [yamlParserMethod.getArgument(0), yamlParserMethod.getKeywordArgument("yaml")]
or
yamlParserMethod = yamlNode().getAMethodCall("parse_file") and
pred = [yamlParserMethod.getArgument(0), yamlParserMethod.getKeywordArgument("filename")]
)
)
}
}

private API::Node yamlNode() { result = API::getTopLevelMember(["YAML", "Psych"]) }
Original file line number Diff line number Diff line change
Expand Up @@ -75,13 +75,34 @@ module UnsafeDeserialization {
}

/**
* An argument in a call to `YAML.load`, considered a sink
* An argument in a call to `YAML.unsafe_*` and `YAML.load_stream` , considered a sink
* for unsafe deserialization. The `YAML` module is an alias of `Psych` in
* recent versions of Ruby.
*/
class YamlLoadArgument extends Sink {
YamlLoadArgument() {
this = API::getTopLevelMember(["YAML", "Psych"]).getAMethodCall("load").getArgument(0)
// Note: this is safe in psych/yaml >= 4.0.0.
this = yamlNode().getAMethodCall("load").getArgument(0)
or
this =
yamlNode().getAMethodCall(["unsafe_load_file", "unsafe_load", "load_stream"]).getArgument(0)
or
this = yamlNode().getAMethodCall(["unsafe_load", "load_stream"]).getKeywordArgument("yaml")
or
this = yamlNode().getAMethodCall("unsafe_load_file").getKeywordArgument("filename")
}
}

private API::Node yamlNode() { result = API::getTopLevelMember(["YAML", "Psych"]) }

/**
* An argument in a call to `YAML.parse*`, considered a sink for unsafe deserialization
* if there is a call to `to_ruby` on the returned value.
*/
class YamlParseArgument extends Sink {
YamlParseArgument() {
this =
yamlNode().getAMethodCall(["parse", "parse_stream", "parse_file"]).getAMethodCall("to_ruby")
}
}

Expand Down Expand Up @@ -208,4 +229,23 @@ module UnsafeDeserialization {
)
}
}

/**
* An argument in a call to `Plist.parse_xml` where `marshal` is `true` (which is
* the default), considered a sink for unsafe deserialization.
*/
class UnsafePlistParsexmlArgument extends Sink {
UnsafePlistParsexmlArgument() {
exists(DataFlow::CallNode plistParseXml |
plistParseXml = API::getTopLevelMember("Plist").getAMethodCall("parse_xml")
|
this = [plistParseXml.getArgument(0), plistParseXml.getKeywordArgument("filename_or_xml")] and
(
plistParseXml.getKeywordArgument("marshal").getConstantValue().isBoolean(true)
or
plistParseXml.getNumberOfArguments() = 1
)
)
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,21 @@ libraries that support it, such as the Ruby standard library's <code>JSON</code>
module, ensure that the parser is configured to disable
deserialization of arbitrary objects.
</p>

<p>
If deserializing an untrusted YAML document using the <code>psych</code> gem,
prefer the <code>safe_load</code> and <code>safe_load_file</code> methods over
<code>load</code> and <code>load_file</code>, as the former will safely
handle untrusted data. Avoid passing untrusted data to the <code>load_stream</code>
method. In <code>psych</code> version 4.0.0 and above, the <code>load</code> method can
safely be used.
</p>

<p>
To safely deserialize <a href="https://en.wikipedia.org/wiki/Property_list">Property List</a>
files using the <code>plist</code> gem, ensure that you pass <code>marshal: false</code>
when calling <code>Plist.parse_xml</code>.
</p>
</recommendation>

<example>
Expand All @@ -27,6 +42,7 @@ on data from an HTTP request. Since these methods are capable of deserializing
to arbitrary objects, this is inherently unsafe.
</p>
<sample src="examples/UnsafeDeserializationBad.rb"/>

<p>
Using <code>JSON.parse</code> and <code>YAML.safe_load</code> instead, as in the
following example, removes the vulnerability. Similarly, calling
Expand Down Expand Up @@ -55,6 +71,10 @@ Ruby documentation: <a href="https://ruby-doc.org/stdlib-3.0.2/libdoc/json/rdoc/
<li>
Ruby documentation: <a href="https://ruby-doc.org/stdlib-3.0.2/libdoc/yaml/rdoc/YAML.html#module-YAML-label-Security">security guidance on the YAML library</a>.
</li>
<li>
You can read that how unsafe yaml load methods can lead to code executions:
<a href="https://devcraft.io/2021/01/07/universal-deserialisation-gadget-for-ruby-2-x-3-x.html">Universal Deserialisation Gadget for Ruby 2.x-3.x </a>.
</li>
</references>

</qhelp>
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
require 'yaml'
class UsersController < ActionController::Base
def example
# not safe
result = Plist.parse_xml(params[:yaml_string])
result = Plist.parse_xml(params[:yaml_string], marshal: true)

# safe
result = Plist.parse_xml(params[:yaml_string], marshal: false)
end
end


Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
edges
| PlistUnsafeDeserialization.rb:5:30:5:35 | call to params | PlistUnsafeDeserialization.rb:5:30:5:49 | ...[...] |
| PlistUnsafeDeserialization.rb:6:30:6:35 | call to params | PlistUnsafeDeserialization.rb:6:30:6:49 | ...[...] |
| UnsafeDeserialization.rb:10:5:10:19 | serialized_data | UnsafeDeserialization.rb:11:27:11:41 | serialized_data |
| UnsafeDeserialization.rb:10:23:10:50 | call to decode64 | UnsafeDeserialization.rb:10:5:10:19 | serialized_data |
| UnsafeDeserialization.rb:10:39:10:44 | call to params | UnsafeDeserialization.rb:10:39:10:50 | ...[...] |
Expand Down Expand Up @@ -29,7 +31,21 @@ edges
| UnsafeDeserialization.rb:87:5:87:13 | yaml_data | UnsafeDeserialization.rb:88:25:88:33 | yaml_data |
| UnsafeDeserialization.rb:87:17:87:22 | call to params | UnsafeDeserialization.rb:87:17:87:28 | ...[...] |
| UnsafeDeserialization.rb:87:17:87:28 | ...[...] | UnsafeDeserialization.rb:87:5:87:13 | yaml_data |
| YAMLUnsafeDeserialization.rb:5:16:5:21 | call to params | YAMLUnsafeDeserialization.rb:5:16:5:35 | ...[...] |
| YAMLUnsafeDeserialization.rb:11:23:11:28 | call to params | YAMLUnsafeDeserialization.rb:11:23:11:42 | ...[...] |
| YAMLUnsafeDeserialization.rb:12:28:12:33 | call to params | YAMLUnsafeDeserialization.rb:12:28:12:45 | ...[...] |
| YAMLUnsafeDeserialization.rb:13:23:13:28 | call to params | YAMLUnsafeDeserialization.rb:13:23:13:42 | ...[...] |
| YAMLUnsafeDeserialization.rb:14:39:14:44 | call to params | YAMLUnsafeDeserialization.rb:14:39:14:58 | ...[...] |
| YAMLUnsafeDeserialization.rb:14:39:14:58 | ...[...] | YAMLUnsafeDeserialization.rb:15:5:15:24 | call to to_ruby |
| YAMLUnsafeDeserialization.rb:16:17:16:22 | call to params | YAMLUnsafeDeserialization.rb:16:17:16:36 | ...[...] |
| YAMLUnsafeDeserialization.rb:16:17:16:36 | ...[...] | YAMLUnsafeDeserialization.rb:16:5:16:45 | call to to_ruby |
| YAMLUnsafeDeserialization.rb:17:22:17:27 | call to params | YAMLUnsafeDeserialization.rb:17:22:17:39 | ...[...] |
| YAMLUnsafeDeserialization.rb:17:22:17:39 | ...[...] | YAMLUnsafeDeserialization.rb:17:5:17:48 | call to to_ruby |
nodes
| PlistUnsafeDeserialization.rb:5:30:5:35 | call to params | semmle.label | call to params |
| PlistUnsafeDeserialization.rb:5:30:5:49 | ...[...] | semmle.label | ...[...] |
| PlistUnsafeDeserialization.rb:6:30:6:35 | call to params | semmle.label | call to params |
| PlistUnsafeDeserialization.rb:6:30:6:49 | ...[...] | semmle.label | ...[...] |
| UnsafeDeserialization.rb:10:5:10:19 | serialized_data | semmle.label | serialized_data |
| UnsafeDeserialization.rb:10:23:10:50 | call to decode64 | semmle.label | call to decode64 |
| UnsafeDeserialization.rb:10:39:10:44 | call to params | semmle.label | call to params |
Expand Down Expand Up @@ -74,8 +90,27 @@ nodes
| UnsafeDeserialization.rb:98:24:98:32 | call to read | semmle.label | call to read |
| UnsafeDeserialization.rb:101:24:101:27 | call to gets | semmle.label | call to gets |
| UnsafeDeserialization.rb:104:24:104:32 | call to readlines | semmle.label | call to readlines |
| YAMLUnsafeDeserialization.rb:5:16:5:21 | call to params | semmle.label | call to params |
| YAMLUnsafeDeserialization.rb:5:16:5:35 | ...[...] | semmle.label | ...[...] |
| YAMLUnsafeDeserialization.rb:11:23:11:28 | call to params | semmle.label | call to params |
| YAMLUnsafeDeserialization.rb:11:23:11:42 | ...[...] | semmle.label | ...[...] |
| YAMLUnsafeDeserialization.rb:12:28:12:33 | call to params | semmle.label | call to params |
| YAMLUnsafeDeserialization.rb:12:28:12:45 | ...[...] | semmle.label | ...[...] |
| YAMLUnsafeDeserialization.rb:13:23:13:28 | call to params | semmle.label | call to params |
| YAMLUnsafeDeserialization.rb:13:23:13:42 | ...[...] | semmle.label | ...[...] |
| YAMLUnsafeDeserialization.rb:14:39:14:44 | call to params | semmle.label | call to params |
| YAMLUnsafeDeserialization.rb:14:39:14:58 | ...[...] | semmle.label | ...[...] |
| YAMLUnsafeDeserialization.rb:15:5:15:24 | call to to_ruby | semmle.label | call to to_ruby |
| YAMLUnsafeDeserialization.rb:16:5:16:45 | call to to_ruby | semmle.label | call to to_ruby |
| YAMLUnsafeDeserialization.rb:16:17:16:22 | call to params | semmle.label | call to params |
| YAMLUnsafeDeserialization.rb:16:17:16:36 | ...[...] | semmle.label | ...[...] |
| YAMLUnsafeDeserialization.rb:17:5:17:48 | call to to_ruby | semmle.label | call to to_ruby |
| YAMLUnsafeDeserialization.rb:17:22:17:27 | call to params | semmle.label | call to params |
| YAMLUnsafeDeserialization.rb:17:22:17:39 | ...[...] | semmle.label | ...[...] |
subpaths
#select
| PlistUnsafeDeserialization.rb:5:30:5:49 | ...[...] | PlistUnsafeDeserialization.rb:5:30:5:35 | call to params | PlistUnsafeDeserialization.rb:5:30:5:49 | ...[...] | Unsafe deserialization depends on a $@. | PlistUnsafeDeserialization.rb:5:30:5:35 | call to params | user-provided value |
| PlistUnsafeDeserialization.rb:6:30:6:49 | ...[...] | PlistUnsafeDeserialization.rb:6:30:6:35 | call to params | PlistUnsafeDeserialization.rb:6:30:6:49 | ...[...] | Unsafe deserialization depends on a $@. | PlistUnsafeDeserialization.rb:6:30:6:35 | call to params | user-provided value |
| UnsafeDeserialization.rb:11:27:11:41 | serialized_data | UnsafeDeserialization.rb:10:39:10:44 | call to params | UnsafeDeserialization.rb:11:27:11:41 | serialized_data | Unsafe deserialization depends on a $@. | UnsafeDeserialization.rb:10:39:10:44 | call to params | user-provided value |
| UnsafeDeserialization.rb:17:30:17:44 | serialized_data | UnsafeDeserialization.rb:16:39:16:44 | call to params | UnsafeDeserialization.rb:17:30:17:44 | serialized_data | Unsafe deserialization depends on a $@. | UnsafeDeserialization.rb:16:39:16:44 | call to params | user-provided value |
| UnsafeDeserialization.rb:23:24:23:32 | json_data | UnsafeDeserialization.rb:22:17:22:22 | call to params | UnsafeDeserialization.rb:23:24:23:32 | json_data | Unsafe deserialization depends on a $@. | UnsafeDeserialization.rb:22:17:22:22 | call to params | user-provided value |
Expand All @@ -91,3 +126,10 @@ subpaths
| UnsafeDeserialization.rb:98:24:98:32 | call to read | UnsafeDeserialization.rb:98:24:98:32 | call to read | UnsafeDeserialization.rb:98:24:98:32 | call to read | Unsafe deserialization depends on a $@. | UnsafeDeserialization.rb:98:24:98:32 | call to read | value from stdin |
| UnsafeDeserialization.rb:101:24:101:27 | call to gets | UnsafeDeserialization.rb:101:24:101:27 | call to gets | UnsafeDeserialization.rb:101:24:101:27 | call to gets | Unsafe deserialization depends on a $@. | UnsafeDeserialization.rb:101:24:101:27 | call to gets | value from stdin |
| UnsafeDeserialization.rb:104:24:104:32 | call to readlines | UnsafeDeserialization.rb:104:24:104:32 | call to readlines | UnsafeDeserialization.rb:104:24:104:32 | call to readlines | Unsafe deserialization depends on a $@. | UnsafeDeserialization.rb:104:24:104:32 | call to readlines | value from stdin |
| YAMLUnsafeDeserialization.rb:5:16:5:35 | ...[...] | YAMLUnsafeDeserialization.rb:5:16:5:21 | call to params | YAMLUnsafeDeserialization.rb:5:16:5:35 | ...[...] | Unsafe deserialization depends on a $@. | YAMLUnsafeDeserialization.rb:5:16:5:21 | call to params | user-provided value |
| YAMLUnsafeDeserialization.rb:11:23:11:42 | ...[...] | YAMLUnsafeDeserialization.rb:11:23:11:28 | call to params | YAMLUnsafeDeserialization.rb:11:23:11:42 | ...[...] | Unsafe deserialization depends on a $@. | YAMLUnsafeDeserialization.rb:11:23:11:28 | call to params | user-provided value |
| YAMLUnsafeDeserialization.rb:12:28:12:45 | ...[...] | YAMLUnsafeDeserialization.rb:12:28:12:33 | call to params | YAMLUnsafeDeserialization.rb:12:28:12:45 | ...[...] | Unsafe deserialization depends on a $@. | YAMLUnsafeDeserialization.rb:12:28:12:33 | call to params | user-provided value |
| YAMLUnsafeDeserialization.rb:13:23:13:42 | ...[...] | YAMLUnsafeDeserialization.rb:13:23:13:28 | call to params | YAMLUnsafeDeserialization.rb:13:23:13:42 | ...[...] | Unsafe deserialization depends on a $@. | YAMLUnsafeDeserialization.rb:13:23:13:28 | call to params | user-provided value |
| YAMLUnsafeDeserialization.rb:15:5:15:24 | call to to_ruby | YAMLUnsafeDeserialization.rb:14:39:14:44 | call to params | YAMLUnsafeDeserialization.rb:15:5:15:24 | call to to_ruby | Unsafe deserialization depends on a $@. | YAMLUnsafeDeserialization.rb:14:39:14:44 | call to params | user-provided value |
| YAMLUnsafeDeserialization.rb:16:5:16:45 | call to to_ruby | YAMLUnsafeDeserialization.rb:16:17:16:22 | call to params | YAMLUnsafeDeserialization.rb:16:5:16:45 | call to to_ruby | Unsafe deserialization depends on a $@. | YAMLUnsafeDeserialization.rb:16:17:16:22 | call to params | user-provided value |
| YAMLUnsafeDeserialization.rb:17:5:17:48 | call to to_ruby | YAMLUnsafeDeserialization.rb:17:22:17:27 | call to params | YAMLUnsafeDeserialization.rb:17:5:17:48 | call to to_ruby | Unsafe deserialization depends on a $@. | YAMLUnsafeDeserialization.rb:17:22:17:27 | call to params | user-provided value |
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
require 'yaml'
class UsersController < ActionController::Base
def example
# safe
Psych.load(params[:yaml_string])
Psych.load_file(params[:yaml_file])
Psych.parse_stream(params[:yaml_string])
Psych.parse(params[:yaml_string])
Psych.parse_file(params[:yaml_file])
# unsafe
Psych.unsafe_load(params[:yaml_string])
Psych.unsafe_load_file(params[:yaml_file])
Psych.load_stream(params[:yaml_string])
parse_output = Psych.parse_stream(params[:yaml_string])
parse_output.to_ruby
Psych.parse(params[:yaml_string]).to_ruby
Psych.parse_file(params[:yaml_file]).to_ruby

end
end