Skip to content

Ruby: Decompression Bombs #13556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
May 16, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>Extracting Compressed files with any compression algorithm like gzip can cause to denial of service attacks.</p>
<p>Attackers can compress a huge file which created by repeated similiar byte and convert it to a small compressed file.</p>

</overview>
<recommendation>

<p>When you want to decompress a user-provided compressed file you must be careful about the decompression ratio or read these files within a loop byte by byte to be able to manage the decompressed size in each cycle of the loop.</p>
<p>Please read official RubyZip Documentation <a href="https://github.com/rubyzip/rubyzip/#size-validation">here</a></p>
</recommendation>
<example>
<p>Rubyzip: According to <a href="https://github.com/rubyzip/rubyzip/#reading-a-zip-file">official</a> Documentation</p>
<sample src="example_good.rb" />
<sample src="example_bad.rb" />
</example>
<references>

<li>
<a href="https://www.cvedetails.com/cve/CVE-2022-3759/">CVE-2023-22898</a>
<a href="https://gitlab.com/gitlab-org/gitlab/-/issues/379633">Gitlab issue</a>
</li>
<li>
<a href="https://www.bamsoftware.com/hacks/zipbomb/">A great research to gain more impact by this kind of attack</a>
</li>

</references>
</qhelp>
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
/**
* @name User-controlled file decompression
* @description User-controlled data that flows into decompression library APIs without checking the compression rate is dangerous
* @kind path-problem
* @problem.severity error
* @security-severity 7.8
* @precision high
* @id rb/user-controlled-data-decompression
* @tags security
* experimental
* external/cwe/cwe-409
*/

import ruby
private import codeql.ruby.Concepts
private import codeql.ruby.DataFlow

Check warning

Code scanning / CodeQL

Redundant import Warning

Redundant import, the module is already imported inside
DecompressionBombs
.
private import codeql.ruby.TaintTracking
import DecompressionBombs

/**
* A call to `IO.copy_stream`
*/
class IoCopyStream extends DataFlow::CallNode {
IoCopyStream() { this = API::getTopLevelMember("IO").getAMethodCall("copy_stream") }

DataFlow::Node getAPathArgument() { result = this.getArgument(0) }
}

module BombsConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

predicate isSink(DataFlow::Node sink) { sink instanceof DecompressionBomb::Sink }

predicate isAdditionalFlowStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
any(DecompressionBomb::AdditionalTaintStep ats).isAdditionalTaintStep(nodeFrom, nodeTo)
or
exists(API::Node n | n = API::getTopLevelMember("File").getMethod("open") |
nodeFrom = n.getParameter(0).asSink() and
nodeTo = n.getReturn().asSource()
)
or
nodeFrom = nodeTo.(File::FileOpen).getAPathArgument()
or
exists(API::Node n | n = API::getTopLevelMember("StringIO").getMethod("new") |
nodeFrom = n.getParameter(0).asSink() and
nodeTo = n.getReturn().asSource()
)
or
nodeFrom = nodeTo.(IoCopyStream).getAPathArgument()
or
// following can be a global additional step
exists(DataFlow::CallNode cn |
cn.getMethodName() = "open" and
cn.getReceiver().getExprNode().getExpr() instanceof Ast::SelfVariableAccess
|
nodeFrom = cn.getArgument(0) and
nodeTo = cn
)
}
}

module Bombs = TaintTracking::Global<BombsConfig>;

import Bombs::PathGraph

from Bombs::PathNode source, Bombs::PathNode sink
where Bombs::flowPath(source, sink)
select sink.getNode(), source, sink, "This file Decompression depends on a $@.", source.getNode(),
"potentially untrusted source"
Original file line number Diff line number Diff line change
@@ -0,0 +1,215 @@
import codeql.ruby.AST
import codeql.ruby.frameworks.Files
import codeql.ruby.ApiGraphs
import codeql.ruby.DataFlow
import codeql.ruby.dataflow.RemoteFlowSources

module DecompressionBomb {
/**
* A abstract class responsible for extending new decompression sinks
*
* can be a path, stream of compressed data,
* or a call to function that use pipe
*/
abstract class Sink extends DataFlow::Node { }

/**
* The additional taint steps that need for creating taint tracking or dataflow.
*/
abstract class AdditionalTaintStep extends string {
AdditionalTaintStep() { this = "AdditionalTaintStep" }

/**
* Holds if there is a additional taint step between pred and succ.
*/
abstract predicate isAdditionalTaintStep(DataFlow::Node pred, DataFlow::Node succ);
}
}

module Zlib {
/**
* Gets a node of `Zlib::GzipReader` member
*
* Note that if you use the lower level Zip::InputStream interface, rubyzip does not check the entry sizes.
*/
private API::Node gzipReaderInstance() {
result = API::getTopLevelMember("Zlib").getMember("GzipReader")
}

/**
* A return values of following methods
* `Zlib::GzipReader.open`
* `Zlib::GzipReader.zcat`
* `Zlib::GzipReader.new`
*/
class DecompressionBombSink extends DecompressionBomb::Sink {
DecompressionBombSink() {
this = gzipReaderInstance().getMethod(["open", "new", "zcat"]).getReturn().asSource()
}
}

/**
* The additional taint steps that need for creating taint tracking or dataflow for `Zlib`.
*/
class AdditionalTaintStep extends DecompressionBomb::AdditionalTaintStep {
AdditionalTaintStep() { this = "AdditionalTaintStep" }

override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
exists(API::Node openOrNewOrZcat |
openOrNewOrZcat = gzipReaderInstance().getMethod(["open", "new", "zcat"])
|
nodeFrom = openOrNewOrZcat.getParameter(0).asSink() and
nodeTo = openOrNewOrZcat.getReturn().asSource()
)
}
}
}

module ZipInputStream {
/**
* Gets a node of `Zip::InputStream` member
*
* Note that if you use the lower level Zip::InputStream interface, rubyZip does not check the entry sizes.
*/
private API::Node zipInputStream() {
result = API::getTopLevelMember("Zip").getMember("InputStream")
}

/**
* The methods
* `Zip::InputStream.read`
* `Zip::InputStream.extract`
*
* as source of decompression bombs, they need an additional taint step for a dataflow or taint tracking query
*/
class DecompressionBombSink extends DecompressionBomb::Sink {
DecompressionBombSink() {
this = zipInputStream().getMethod(["open", "new"]).getReturn().asSource()
}
}

/**
* The additional taint steps that need for creating taint tracking or dataflow for `Zip`.
*/
class AdditionalTaintStep extends DecompressionBomb::AdditionalTaintStep {
AdditionalTaintStep() { this = "AdditionalTaintStep" }

override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
exists(API::Node openOrNew | openOrNew = zipInputStream().getMethod(["open", "new"]) |
nodeFrom = openOrNew.getParameter(0).asSink() and
nodeTo = openOrNew.getReturn().asSource()
)
}
}
}

module ZipFile {
/**
* Gets a node of `Zip::File` member
*/
API::Node zipFile() { result = API::getTopLevelMember("Zip").getMember("File") }

/**
* Gets nodes that have a `base` at the beginning
*/
API::Node rubyZipNode(API::Node base) {
result = base
or
result = rubyZipNode(base).getMethod(_)
or
result = rubyZipNode(base).getReturn()
or
result = rubyZipNode(base).getParameter(_)
or
result = rubyZipNode(base).getAnElement()
or
result = rubyZipNode(base).getBlock()
}
Comment on lines +115 to +127
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I can't use a predicate with a better performance like the following here:

API::Node rubyZipNode() {
    result = zipFile()
    or
    result = rubyZipNode().getMethod(_)
    or
    result = rubyZipNode().getReturn()
    or
    result = rubyZipNode().getParameter(_)
    or
    result = rubyZipNode().getAnElement()
    or
    result = rubyZipNode().getBlock()
  }

because lines 164 and 165 won't work for this then.


/**
* The return values of following methods
* `ZipIO.read`
* `ZipEntry.extract`
* A sanitizer exists inside the nodes which have `entry.size > someOBJ`
*/
class DecompressionBombSink extends DecompressionBomb::Sink {
DecompressionBombSink() {
exists(API::Node rubyZip | rubyZip = rubyZipNode(zipFile()) |
this = rubyZip.getMethod(["extract", "read"]).getReturn().asSource() and
not exists(
rubyZip.getMethod("size").getReturn().getMethod([">", " <", "<=", " >="]).getParameter(0)
)
)
}
}

/**
* The additional taint steps that need for creating taint tracking or dataflow for `Zip::File`.
*/
class AdditionalTaintStep extends DecompressionBomb::AdditionalTaintStep {
AdditionalTaintStep() { this = "AdditionalTaintStep" }

override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
exists(API::Node zipFile | zipFile = zipFile().getMethod("open") |
nodeFrom = zipFile.getParameter(0).asSink() and
nodeTo = rubyZipNode(zipFile).getMethod(["extract", "read"]).getReturn().asSource()
)
}
}

predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
exists(API::Node zipFile | zipFile = zipFile().getMethod("open") |
nodeFrom = zipFile.getParameter(0).asSink() and
nodeTo = rubyZipNode(zipFile).getMethod(["extract", "read"]).getReturn().asSource()
)
}
// /**
// * The additional taint steps that need for creating taint tracking or dataflow for `Zip::File`.
// */
// class AdditionalTaintStep1 extends DecompressionBomb::AdditionalTaintStep {
// AdditionalTaintStep1() { this = "AdditionalTaintStep" }
// override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
// exists(API::Node zipFile | zipFile = zipFile().getMethod("open") |
// nodeFrom = zipFile.getParameter(0).asSink() and
// nodeTo = zipFile.asSource()
// )
// }
// }
// API::Node rubyZipNode2() {
// result = zipFile().getMethod("open")
// or
// result = rubyZipNode2().getMethod(_)
// or
// result = rubyZipNode2().getReturn()
// or
// result = rubyZipNode2().getParameter(_)
// or
// result = rubyZipNode2().getAnElement()
// or
// result = rubyZipNode2().getBlock()
// }
// /**
// * The additional taint steps that need for creating taint tracking or dataflow for `Zip::File`.
// */
// class AdditionalTaintStep2 extends DecompressionBomb::AdditionalTaintStep {
// AdditionalTaintStep2() { this = "AdditionalTaintStep" }
// override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
// exists(API::Node zipFileOpen | zipFileOpen = rubyZipNode2() |
// nodeFrom = zipFileOpen.getReturn().asSource() and
// nodeTo = zipFileOpen.getMethod(["extract", "read"]).getReturn().asSource()
// )
// }
// }
// /**
// * The additional taint steps that need for creating taint tracking or dataflow for `Zip::File`.
// */
// class AdditionalTaintStep3 extends DecompressionBomb::AdditionalTaintStep {
// AdditionalTaintStep3() { this = "AdditionalTaintStep" }
// override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
// exists(API::Node zipFileOpen | zipFileOpen = rubyZipNode2() |
// nodeFrom = zipFileOpen.asCall() and
// nodeTo = zipFileOpen.getMethod(["extract", "read"]).getReturn().asSource()
// )
// }
// }
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# "Note that if you use the lower level Zip::InputStream interface, rubyzip does not check the entry sizes"
zip_stream = Zip::InputStream.new(File.open('file.zip'))
while entry = zip_stream.get_next_entry
# All required operations on `entry` go here.
end
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
MAX_FILE_SIZE = 10 * 1024**2 # 10MiB
MAX_FILES = 100
Zip::File.open('foo.zip') do |zip_file|
num_files = 0
zip_file.each do |entry|
num_files += 1 if entry.file?
raise 'Too many extracted files' if num_files > MAX_FILES
raise 'File too large when extracted' if entry.size > MAX_FILE_SIZE
entry.extract
end
end
Loading
Loading