BUG: DataFrame from query result should not give SettingWithCopyWarning #55451

Closed
@limwz01

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

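import pandas as pd

def main():
    # Case 1: the query result is bound to a new name; the original stays in scope.
    df = pd.DataFrame({"v": [0, 1, 2]})
    df2 = df.query("v < 2")
    df2["v"] = 0  # emits SettingWithCopyWarning
    print(df)     # the original is unchanged

    # Case 2: the query result rebinds the same name; the original is no longer referenced.
    df = pd.DataFrame({"v": [0, 1, 2]})
    df = df.query("v < 2")
    df["v"] = 0   # still emits SettingWithCopyWarning
    print(df)

main()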

Issue Description

The query method already seems to always return a copy, and that is what callers would expect. Writing to the result should therefore not trigger a SettingWithCopyWarning, especially when the original DataFrame is already out of scope.

The first example shows the unexpected behaviour while the original DataFrame is still in scope: the assignment statement triggers a SettingWithCopyWarning. But as the output shows, the changes to the resulting DataFrame do not affect the original anyway, and the behaviour is consistent, so why should there be a warning?
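To make the first case easier to check, here is a minimal sketch that records the warnings instead of printing them and confirms that the original is untouched. The function name `write_to_query_result` is made up for illustration. Whether a warning is actually recorded depends on the pandas version and on the copy-on-write setting, so the sketch only reports what it saw.

```python
import warnings

import pandas as pd


def write_to_query_result():
    """Write to a query() result and report what happened (illustrative helper)."""
    df = pd.DataFrame({"v": [0, 1, 2]})
    df2 = df.query("v < 2")

    # Record every warning raised by the assignment instead of printing it.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df2["v"] = 0

    warning_names = [w.category.__name__ for w in caught]
    return df, df2, warning_names


original, result, warning_names = write_to_query_result()
print("original values:", original["v"].tolist())
print("result values:  ", result["v"].tolist())
print("warnings seen:  ", warning_names)
```

The point of the comparison is that the write lands only in the query result, which is why the warning looks unnecessary here.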

In the second example, reference counting does not seem to work properly: the weakref to the original DataFrame is still alive, so we still get a SettingWithCopyWarning. This is definitely a bug, but I propose solving it by setting _is_copy to None for all query() results.
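The effect of the proposal can be approximated from user code, as a rough sketch only. It assumes that `_is_copy` is the private attribute that drives the warning, as described above; the attribute is an internal detail and may not exist in every pandas version, so it is read with `getattr`. The helper `copy_warnings_on_write` is hypothetical, and resetting the attribute by hand is shown purely to illustrate the idea, not as a recommended workaround.

```python
import warnings

import pandas as pd


def copy_warnings_on_write(frame):
    """Assign to column "v" and return any SettingWithCopyWarning names recorded."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame["v"] = 0
    return [
        w.category.__name__
        for w in caught
        if w.category.__name__ == "SettingWithCopyWarning"
    ]


df = pd.DataFrame({"v": [0, 1, 2]})
result = df.query("v < 2")

# Private attribute, read defensively: assumed to hold a weakref to the parent.
marker = getattr(result, "_is_copy", None)
print("marker before reset:", marker)

# Mimic the proposal by hand: clear the marker on the query result.
result._is_copy = None
after_reset = copy_warnings_on_write(result)
print("copy warnings after reset:", after_reset)
```

If the assumption about `_is_copy` holds, the write after the reset should no longer produce the warning, while the original DataFrame stays as it was.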

Expected Behavior

No SettingWithCopyWarning should be produced.

Installed Versions

INSTALLED VERSIONS

commit : e86ed37
python : 3.11.2.final.0
python-bits : 64
OS : Linux

pandas : 2.1.1
numpy : 1.26.0
