BUG: Groupby with as_index=True causes incorrect summarization #34906

@vivikelapoutre vivikelapoutre commented Jun 20, 2020

Given that this is my first commit to pandas, I'm not quite sure whether what I'm doing is what is expected. I just added the example from the bug report as a test; is that OK? Should additional things be tested, or tested differently? Please let me know.

@jreback changed the title from "add test" to "BUG: Groupby with as_index=True causes incorrect summarization" Jun 20, 2020
@jreback added the Groupby and Testing (pandas testing functions or related to the test suite) labels Jun 20, 2020
@jreback jreback left a comment

this needs to go with the other min tests; put it in test_function.py, next to a similar test.

)

tm.assert_series_equal(
df.groupby("b")["c"].min(),

pls use

result=
expected=
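A minimal sketch of the `result` / `expected` pattern requested here. The frame and its values are made up for illustration and are not the data from the PR; the public `pandas.testing` module is aliased as `tm` so the snippet runs outside the pandas test suite.

```python
import pandas as pd
import pandas.testing as tm  # public assertion helpers, aliased as tm for this sketch

# Hypothetical input, not the frame used in the PR.
df = pd.DataFrame({"b": [1, 1, 2], "c": [3.0, 1.0, 2.0]})

# Compute the value under test first ...
result = df.groupby("b")["c"].min()

# ... then spell out the expected object explicitly, index and name included.
expected = pd.Series([1.0, 2.0], index=pd.Index([1, 2], name="b"), name="c")

tm.assert_series_equal(result, expected)
```

Naming the two sides keeps the assertion line short and makes it obvious which object is being checked against which.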

tm.assert_series_equal(
df.groupby("b")["c"].min(),
df.groupby("b", as_index=False)["c"].min()["c"],
check_index_type=False,

if you have to set check_index_type or check_names then something is wrong with the expected here.
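One way to read this comment: the two `groupby` forms return differently shaped objects, so each deserves its own explicitly built `expected` rather than a comparison with relaxed checks. The sketch below uses made-up data and variable names (`result_series`, `result_frame`) that do not appear in the PR.

```python
import pandas as pd
import pandas.testing as tm  # public assertion helpers, aliased as tm for this sketch

# Hypothetical input, not the frame used in the PR.
df = pd.DataFrame({"b": [10, 10, 20], "c": [3, 1, 2]})

# Default as_index=True: the group keys become the index, named "b".
result_series = df.groupby("b")["c"].min()
expected_series = pd.Series([1, 2], index=pd.Index([10, 20], name="b"), name="c")
tm.assert_series_equal(result_series, expected_series)

# as_index=False: the keys stay as a regular column and the index is a RangeIndex.
result_frame = df.groupby("b", as_index=False)["c"].min()
expected_frame = pd.DataFrame({"b": [10, 20], "c": [1, 2]})
tm.assert_frame_equal(result_frame, expected_frame)
```

Both assertions run with their default strictness, so a regression in the index type or name would be caught rather than masked.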

date_series = pd.Series(dates)
date_series_parsed = pd.to_datetime(date_series, format="%Y-%m-%d").dt.date

df = pd.DataFrame(

pls simplify this construction, e.g. directly construct the inputs
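As an illustration of what constructing the inputs directly could look like, the sketch below builds a column of `datetime.date` objects by hand instead of parsing strings and converting with `.dt.date`. The dates and group keys are invented for the example, and this is only one possible simplification, not necessarily the one adopted in the PR.

```python
import datetime as dt

import pandas as pd
import pandas.testing as tm  # public assertion helpers, aliased as tm for this sketch

# Roundabout construction: parse strings, then convert to date objects.
parsed = pd.to_datetime(
    pd.Series(["2020-01-02", "2020-01-01", "2020-01-03"]), format="%Y-%m-%d"
).dt.date

# Direct construction: build the date objects up front.
direct = pd.Series([dt.date(2020, 1, 2), dt.date(2020, 1, 1), dt.date(2020, 1, 3)])

# Both routes yield the same object-dtype Series of datetime.date values.
tm.assert_series_equal(parsed, direct)

# Hypothetical frame for a groupby min over the date column.
df = pd.DataFrame({"b": [1, 1, 2], "c": direct})
result = df.groupby("b")["c"].min()
```

Building the values directly keeps the test focused on the `groupby` behavior rather than on date parsing.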

@vivikelapoutre

I addressed the comments, could you please have a look again?

@vivikelapoutre vivikelapoutre requested a review from jreback June 28, 2020 20:25
@vivikelapoutre

I don't know why there are suddenly test failures. The particular test that failed (pandas/tests/base/test_unique.py::test_unique_bad_unicode) passes on my setup and is nowhere near the test that I added.

@TomAugspurger

@vivikelapoutre can you try merging master & repushing to see if that fixes it?

@vivikelapoutre

> @vivikelapoutre can you try merging master & repushing to see if that fixes it?

Thank you, that worked indeed.

@TomAugspurger

Thanks @vivikelapoutre!

@TomAugspurger TomAugspurger added this to the 1.1 milestone Jul 16, 2020
@vivikelapoutre vivikelapoutre deleted the GH26321-groupby_with_as_index_summarization branch July 16, 2020 18:02
fangchenli pushed a commit to fangchenli/pandas that referenced this pull request Jul 16, 2020
…s-dev#34906)

* add test

* PR comments

* attempt to make the code cleaner
Labels
Groupby Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: Groupby with as_index=True causes incorrect summarization
3 participants