Skip to content

TST: added test for pd.where overflow error GH31687 #49926

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 28, 2022
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 16 additions & 1 deletion pandas/tests/frame/indexing/test_where.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,10 @@
isna,
)
import pandas._testing as tm
from pandas._testing._hypothesis import OPTIONAL_ONE_OF_ALL
from pandas._testing._hypothesis import (
DATETIME_JAN_1_1900_OPTIONAL_TZ,
OPTIONAL_ONE_OF_ALL,
)


@pytest.fixture(params=["default", "float_string", "mixed_float", "mixed_int"])
Expand Down Expand Up @@ -1017,3 +1020,15 @@ def test_where_producing_ea_cond_for_np_dtype():
{"a": Series([pd.NA, pd.NA, 2], dtype="Int64"), "b": [np.nan, 2, 3]}
)
tm.assert_frame_equal(result, expected)


@pytest.mark.parametrize(
"replacement", [0.001, True, "snake", DATETIME_JAN_1_1900_OPTIONAL_TZ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DATETIME_JAN_1_1900_OPTIONAL_TZ is gonna be a LazyStrategy object here, which I guess isn't quite what you wanted, I think you'll need @given to use hypothesis tests
Not sure we need that though, just some date should be fine if you want to test using a date as a replacement

Also, the original issue used None, shall we include that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I didn't want a LazyStrategy object! I'll change that to a date, and also include None to align better with the original issue. I was also wondering if there was already a quick way to test all scalar types rather than having to use a long parametrized list. Or would that be overkill?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think just keeping the original snippet from the issue as a test should be fine

)
def test_where_int_overflow(replacement):
# GH 31687
df = DataFrame([[1.0, 2e19, "nine"], [np.nan, 0.1, None]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally for regression tests it's better to keep them as close as possible to the original - any reason to use 2e19 instead of 2e25?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I reduced it down to the smallest order of magnitude that still had the problem in pandas 1.4.1, but I don't have strong feelings about using that instead of the 2e25 from the original.

result = df.where(pd.notnull(df), replacement)
expected = DataFrame([[1.0, 2e19, "nine"], [replacement, 0.1, replacement]])

tm.assert_frame_equal(result, expected)