TST: groupby apply called multiple times #34897

Merged
merged 7 commits into from
Jun 20, 2020
Changes from 2 commits
18 changes: 18 additions & 0 deletions pandas/tests/groupby/test_apply.py
@@ -974,3 +974,21 @@ def test_apply_function_with_indexing_return_column():
    result = df.groupby("foo1", as_index=False).apply(lambda x: x.mean())
    expected = DataFrame({"foo1": ["one", "three", "two"], "foo2": [3.0, 4.0, 4.0]})
    tm.assert_frame_equal(result, expected)


def test_apply_function_called_count(capsys):
Contributor

Move this near test_group_apply_once_per_group and cite the same issue numbers as well.

Contributor

Also rename this test to test_group_apply_once_per_group2.

Contributor Author

Alright, @jreback. I will change it accordingly.

Contributor Author

Should I remove GH: 31111 and use # GH2936, GH7739, GH10519, GH2656, GH12155, GH20084, GH21417 instead, @jreback?

Contributor

It is OK to add this issue number as well.

Contributor Author

Alright. Fixed accordingly.

    # GH: 31111
    # groupby-apply need to execute len(set(group_by_columns)) times
    # `https://github.com/pandas-dev/pandas/issues/31111`
Member

If you already have the issue number, you don't need the link as well.

Contributor Author

Alright, understood. I will make sure not to repeat this.

Contributor Author

Changed accordingly, @MarcoGorelli.



    function_called_count = 2  # Number of times `apply` should call a function for the current test
Member

Perhaps call this expected, and call capsys.readouterr().out.count("function_called") result, and at the end do assert result == expected?

Contributor Author

Good point. I will make it more readable.


    df = pd.DataFrame({"group_by_column": [0, 0, 0, 0, 1, 1, 1, 1],
                       "test_column": ["0", "2", "4", "6", "8", "10", "12", "14"]},
                      index=["0", "2", "4", "6", "8", "10", "12", "14"])

    df.groupby('group_by_column').apply(lambda df:print("function_called"))

    # If `groupby` behaves unexpectedly, this test will break
    assert capsys.readouterr().out.count("function_called") == function_called_count
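
For illustration only, here is a rough sketch of how the check might read with the `result` / `expected` naming suggested in the review. It is not the code that was merged. The helper name `count_apply_calls` is hypothetical, and `contextlib.redirect_stdout` stands in for pytest's `capsys` fixture so the snippet can run outside a test session. It assumes a pandas version in which `groupby(...).apply` calls the function exactly once per group.

```python
import contextlib
import io

import pandas as pd


def count_apply_calls() -> int:
    """Hypothetical helper: count how often groupby.apply invokes the function."""
    df = pd.DataFrame(
        {
            "group_by_column": [0, 0, 0, 0, 1, 1, 1, 1],
            "test_column": ["0", "2", "4", "6", "8", "10", "12", "14"],
        },
        index=["0", "2", "4", "6", "8", "10", "12", "14"],
    )

    # Stand-in for capsys (an assumption of this sketch): capture stdout in a buffer.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        df.groupby("group_by_column").apply(lambda group: print("function_called"))

    # Each call prints the marker once, so counting markers counts calls.
    return buffer.getvalue().count("function_called")


# One call is expected per distinct value in group_by_column (0 and 1).
expected = 2
result = count_apply_calls()
assert result == expected
```

Inside the pandas test suite the buffer would presumably be replaced by `capsys.readouterr().out`, as in the diff above.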