-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
DOC: Add docs for read_sql to avoid sql injection #56546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: Add docs for read_sql to avoid sql injection #56546
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really matter for any other drivers? I think it is the responsibility of the driver to properly escape strings |
Thanks @WillAyd. Yeah, I also think it is the responsibility of the driver to properly escape strings. When I test with other drivers, I do see that some of them will raise exceptions if there are multiple statements in one query. But I think it would be nice to have a note/warning along with this API. |
I think a note is fine but directing users to use |
Removed the reference directing users to use |
pandas/io/sql.py
Outdated
@@ -644,6 +644,30 @@ def read_sql( | |||
read_sql_table : Read SQL database table into a DataFrame. | |||
read_sql_query : Read SQL query into a DataFrame. | |||
|
|||
Notes | |||
----- | |||
Using string interpolation (e.g. ``f-strings``, ``%-formatting``, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking something along the lines of
pandas does not attempt to sanitize SQL statements; instead it simply forwards the statement you are executing to the underlying driver, which may or may not sanitize from there. Please refer to the underlying driver documentation for any details. Generally, be wary when accepting statements from arbitrary sources
Is all that we need to say
@WillAyd I do think we should be pointing users to use |
My hesitation in doing that is that we are making promises about params doing something that isn't true. I don't see anything in PEP 249 about params helping in this case and our code base does nothing to change that. Is the promise of that avoiding injection made somewhere else? I am still under the impression that that would be driver-specific behavior |
My latest feedback was that we should get rid of the |
I agree with Will's comment by just adding #56546 (comment) and showing an example of using Since pandas essentially uses |
Makes sense. I guess pandas users are still expected to use |
Even though I still think it would be useful to add some examples to tell users what kind of actions are not safe, and suggest them to use I changed the docs to the general warning mentioned in #56546 (comment) |
Thank you for the changes and adding the warning. I think adding an example of Maybe it would even be helpful to mention examples of drivers tbat do and don't sanitize params. I expected all them to do, I think it may be helpful for users to have an idea. In any case, great changes, if you are happy with the PR as is, happy to get it merged, but I think you can add what you mention about params. |
Yeah I would support an example in the |
Do you want to add an example with |
Yes, I added an example with In the example, I only say that using |
Thanks @erichxchen |
* add docs for read_sql to avoid sql injection * resolve formatting * re-word the docs * formatting * refactor the example due to the length constraints * remove reference directing users to use params * changed to the general warning * add the example for using params --------- Co-authored-by: Hanxin Chen <[email protected]>
Added examples and notes to the documents of
pandas.read_sql
(https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql.html) as a warning to avoid SQL injection.