CoW warning mode: enable chained assignment warning for DataFrame setitem in default mode #56230

Merged
17 changes: 8 additions & 9 deletions pandas/core/frame.py
@@ -4206,15 +4206,14 @@ def __setitem__(self, key, value) -> None:
                 warnings.warn(
                     _chained_assignment_msg, ChainedAssignmentError, stacklevel=2
                 )
-        # elif not PYPY and not using_copy_on_write():
-        elif not PYPY and warn_copy_on_write():
-            if sys.getrefcount(self) <= 3:  # and (
-                #     warn_copy_on_write()
-                #     or (
-                #         not warn_copy_on_write()
-                #         and self._mgr.blocks[0].refs.has_reference()
-                #     )
-                # ):
+        elif not PYPY and not using_copy_on_write():
+            if sys.getrefcount(self) <= 3 and (
+                warn_copy_on_write()
+                or (
+                    not warn_copy_on_write()
+                    and any(b.refs.has_reference() for b in self._mgr.blocks)
Member

I wouldn't special-case the warning and non-warning modes; just use:

any(b.refs.has_reference() for b in self._mgr.blocks)

(another PR of mine has a utility for this).

Member Author

I copied this from what I did for Series setitem (https://github.com/pandas-dev/pandas/pull/55522/files#diff-a5257444a1b322d619680fc77361cc6ea11ef36b363b4bb2289fdef0f41feb70R1231-R1233). I don't remember exactly why I did it there, other than that it was needed to get some test passing. I will have another look.

Member Author

I quickly checked which tests fail if I remove the special case for the warning mode. For example, a case like df[mask]["col"] = 10 then doesn't raise any warning. In the default mode, it also raises the SettingWithCopyWarning to alert you, but that SettingWithCopyWarning isn't present in the warning mode. And since indexing with a mask gives a copy, there is no reference, so with only the has_reference() check we wouldn't warn here.

I think it is good to keep at least one warning in this case. Of course, it's not super important, as in theory no one should be developing in the warning mode (it should only be used to check existing code). But since it's easy to always warn in CoW-warn mode, I thought it better to do so (it's also a bit easier to test, otherwise I would have to special-case the warning mode as the only one that does not raise the warning ;))

Member

Ah OK, I didn't consider that the SettingWithCopyWarning is missing in the warning mode. Fine to keep it in, then.
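The outcome of this thread can be sketched with stand-in objects. This is only an illustration of the shape of the condition, not the pandas implementation: `FakeBlock`, `should_warn`, and the `is_temporary` flag (standing in for `sys.getrefcount(self) <= 3`) are hypothetical names, and the assumption that a mask selection carries no reference is taken from the comment above.

```python
from dataclasses import dataclass


@dataclass
class FakeBlock:
    """Hypothetical stand-in for a block; it only records whether data is shared."""

    shared: bool

    def has_reference(self) -> bool:
        return self.shared


def should_warn(is_temporary: bool, warn_mode: bool, blocks: list) -> bool:
    """Mirror the shape of the condition in the hunk (not the pandas code itself).

    is_temporary stands in for `sys.getrefcount(self) <= 3`,
    warn_mode for `warn_copy_on_write()`.
    """
    any_reference = any(b.has_reference() for b in blocks)
    return is_temporary and (warn_mode or (not warn_mode and any_reference))


slice_view = [FakeBlock(shared=True)]  # like df[0:3]: shares data with df
mask_copy = [FakeBlock(shared=False)]  # like df[mask]: a fresh copy, no reference

print(should_warn(True, warn_mode=False, blocks=slice_view))  # True
print(should_warn(True, warn_mode=False, blocks=mask_copy))   # False
print(should_warn(True, warn_mode=True, blocks=mask_copy))    # True
```

In this model the mask case stays silent in the default mode, where per the discussion SettingWithCopyWarning already alerts the user, and it warns in the warning mode, where that other warning is absent.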

+                )
+            ):
                 warnings.warn(
                     _chained_assignment_warning_msg, FutureWarning, stacklevel=2
                 )
24 changes: 22 additions & 2 deletions pandas/tests/copy_view/test_chained_assignment_deprecation.py
@@ -1,9 +1,15 @@
 import numpy as np
 import pytest

-from pandas.errors import ChainedAssignmentError
+from pandas.errors import (
+    ChainedAssignmentError,
+    SettingWithCopyWarning,
+)

-from pandas import DataFrame
+from pandas import (
+    DataFrame,
+    option_context,
+)
 import pandas._testing as tm


@@ -85,3 +91,17 @@ def test_series_setitem(indexer, using_copy_on_write):
     else:
         assert record[0].category == FutureWarning
         assert "ChainedAssignmentError" in record[0].message.args[0]
+
+
+@pytest.mark.filterwarnings("ignore::pandas.errors.SettingWithCopyWarning")
+@pytest.mark.parametrize(
+    "indexer", ["a", ["a", "b"], slice(0, 2), np.array([True, False, True])]
+)
+def test_frame_setitem(indexer, using_copy_on_write):
+    df = DataFrame({"a": [1, 2, 3, 4, 5], "b": 1})
+
+    extra_warnings = () if using_copy_on_write else (SettingWithCopyWarning,)
+
+    with option_context("chained_assignment", "warn"):
+        with tm.raises_chained_assignment_error(extra_warnings=extra_warnings):
+            df[0:3][indexer] = 10
6 changes: 4 additions & 2 deletions pandas/tests/indexing/multiindex/test_setitem.py
@@ -548,7 +548,8 @@ def test_frame_setitem_copy_raises(
     else:
         msg = "A value is trying to be set on a copy of a slice from a DataFrame"
         with pytest.raises(SettingWithCopyError, match=msg):
-            df["foo"]["one"] = 2
+            with tm.raises_chained_assignment_error():
+                df["foo"]["one"] = 2


def test_frame_setitem_copy_no_write(
@@ -563,7 +564,8 @@ def test_frame_setitem_copy_no_write(
     else:
         msg = "A value is trying to be set on a copy of a slice from a DataFrame"
         with pytest.raises(SettingWithCopyError, match=msg):
-            df["foo"]["one"] = 2
+            with tm.raises_chained_assignment_error():
+                df["foo"]["one"] = 2

result = df
tm.assert_frame_equal(result, expected)
3 changes: 2 additions & 1 deletion pandas/tests/indexing/test_chaining_and_caching.py
@@ -452,7 +452,8 @@ def test_detect_chained_assignment_undefined_column(
                 df.iloc[0:5]["group"] = "a"
         else:
             with pytest.raises(SettingWithCopyError, match=msg):
-                df.iloc[0:5]["group"] = "a"
+                with tm.raises_chained_assignment_error():
+                    df.iloc[0:5]["group"] = "a"
Member Author

These few test changes illustrate cases where you now get both SettingWithCopyWarning and the new chained assignment warning.

But for example the one above doesn't actually change df (so nobody should be doing that right now), so let's not care about the double warning?

(in theory I could specifically not raise the warning when the key is a single column name, because assigning to a column is never inplace and thus can never change behaviour)

Member

Nah, let's just live with both warnings; as you said, that shouldn't be something users are doing anyway.
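The "double warning" situation discussed here can be illustrated with the standard library alone. In this minimal sketch, `FakeSettingWithCopyWarning`, `FakeChainedAssignmentWarning`, and `chained_setitem` are hypothetical stand-ins rather than pandas names; the point is only that one operation may emit two categories, and a test that wants to tolerate both has to account for each.

```python
import warnings


class FakeSettingWithCopyWarning(Warning):
    """Hypothetical stand-in for the existing set-on-copy warning."""


class FakeChainedAssignmentWarning(FutureWarning):
    """Hypothetical stand-in for the new chained assignment warning."""


def chained_setitem() -> None:
    # One operation that emits both categories, as in the tests changed here.
    warnings.warn("set on a copy", FakeSettingWithCopyWarning, stacklevel=2)
    warnings.warn("chained assignment", FakeChainedAssignmentWarning, stacklevel=2)


with warnings.catch_warnings(record=True) as record:
    warnings.simplefilter("always")  # make sure neither warning is suppressed
    chained_setitem()

categories = [w.category for w in record]
print([c.__name__ for c in categories])
```

A check that expects exactly one warning would fail on this record, which is presumably why the new test in this PR passes `extra_warnings` in the default mode.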


@pytest.mark.arm_slow
def test_detect_chained_assignment_changing_dtype(