Skip to content

Move GroupBy to Submodule and Add FutureWarning #20506

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Apr 2, 2018

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Mar 27, 2018

"Grouper", "groupby", "BinGrouper", "Grouper", "_GroupBy", "GroupBy",
"SeriesGroupBy", "_pipe_template", "PanelGroupBy", "Grouping",
"SpecificationError", "DataError", "generate_bins_generic"])
def test_groupby_move(self, attr):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be straightforward to combine with the method above if we wanted one test accepting a module / list of objects to test the deprecation for

@WillAyd
Copy link
Member Author

WillAyd commented Mar 28, 2018

There was a failure in one of the "Allowed Failures" as a result of a ResourceWarning for an open TextIOBuffer (?). I don't believe that is related to this change and it didn't appear in any of the previous CI runs

stacklevel=2)

from pandas.core.groupby.groupby import Grouper # noqa: F401
from pandas.core.groupby.groupby import groupby # noqa: F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u import these in 1 statement

from pandas.core.groupby.groupby import GroupBy # noqa: F401
from pandas.core.groupby.groupby import SeriesGroupBy # noqa: F401
from pandas.core.groupby.groupby import _pipe_template # noqa: F401
from pandas.core.groupby.groupby import PanelGroupBy # noqa: F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are all these imports actually needed?

just change where they r used

@jreback
Copy link
Contributor

jreback commented Mar 28, 2018

There was a failure in one of the "Allowed Failures" as a result of a ResourceWarning for an open TextIOBuffer (?). I don't believe that is related to this change and it didn't appear in any of the previous CI runs

FYI there are NO allowed failures, meaning that yes these will allow the build to complete earlier, but generally we don't merge things if they are causing issues.


from pandas.core.groupby.groupby import Grouper # noqa: F401
from pandas.core.groupby.groupby import groupby # noqa: F401
from pandas.core.groupby.groupby import BinGrouper # noqa: F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only put public functions here. If something is needed internally then change the reference to it.

@jreback jreback added Refactor Internal refactoring of code Groupby labels Mar 28, 2018
@codecov
Copy link

codecov bot commented Mar 28, 2018

Codecov Report

Merging #20506 into master will increase coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20506      +/-   ##
==========================================
+ Coverage   91.82%   91.84%   +0.02%     
==========================================
  Files         152      153       +1     
  Lines       49255    49256       +1     
==========================================
+ Hits        45226    45239      +13     
+ Misses       4029     4017      -12
Flag Coverage Δ
#multiple 90.23% <100%> (+0.02%) ⬆️
#single 41.9% <66.66%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/groupby/groupby.py 92.55% <ø> (ø)
pandas/core/reshape/pivot.py 96.97% <100%> (ø) ⬆️
pandas/core/resample.py 96.43% <100%> (ø) ⬆️
pandas/core/api.py 100% <100%> (ø) ⬆️
pandas/core/groupby/__init__.py 100% <100%> (ø)
pandas/core/generic.py 95.94% <100%> (ø) ⬆️
pandas/core/panel.py 97.29% <100%> (ø) ⬆️
pandas/plotting/_converter.py 66.81% <0%> (+1.73%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4efb39f...7624313. Read the comment docs.

@TomAugspurger
Copy link
Contributor

FYI, merge conflict with master.

@WillAyd
Copy link
Member Author

WillAyd commented Mar 29, 2018

Hmm so I'm not sure that the FutureWarning being in the __init__.py of the groupby package is ideal here because the internal references point to pandas.core.groupby.groupby. I think warning in that init will always get thrown, because the new module is part of the package where that __init__.py exists, so it would always get loaded?

pandas.core.categorical didn't have this problem because the updated reference was higher in the package hierarchy. The deprecated module doesn't get loaded when using the newer reference. Thinking through what our options here are but open to ideas

@jorisvandenbossche
Copy link
Member

Can we just import what is needed in __init__.py? Then no deprecation warning is needed (python doesn't care if it is a single file module or a package directory, so we can make use of that).
Personally I would even also use those imports internally. Then you don't need to change all the imports in the rest of pandas, if you reorganize things internally within the groupby subpackage.

@WillAyd
Copy link
Member Author

WillAyd commented Mar 29, 2018

@jorisvandenbossche I already updated all the internal refs per Jeff's previous comment, but could undo if required. That said, maybe we just shouldn't put the warning on the init given the change could be transparent from an API perspective since all we are doing is moving from a single module to a like-named package?

"'pandas.core.groupby.groupby' instead", FutureWarning,
stacklevel=2)

from pandas.core.groupby.groupby import (Grouper, GroupBy, # noqa: F401
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GroupBy, SeriesGroupBy and DataFrameGroupBy were all added here as they were causing downstream failures for Dask, even though they aren't explicitly referenced internally. Once we settle on a go-forward here we'll have to follow up with the Dask team to let them know of impacts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is fine. These are basically downstream dev dependencies (e.g. you may want to do something based on the object if its a SeriesGroupBy or a DataFrameGroupBy) so ok with this

@jorisvandenbossche
Copy link
Member

That said, maybe we just shouldn't put the warning on the init given the change could be transparent from an API perspective since all we are doing is moving from a single module to a like-named package?

Yes, that's exactly what I tried to say :-)

IMO, this is much cleaner (also internally, but certainly for external packages)

"'pandas.core.groupby.groupby' instead", FutureWarning,
stacklevel=2)

from pandas.core.groupby.groupby import (Grouper, GroupBy, # noqa: F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is fine. These are basically downstream dev dependencies (e.g. you may want to do something based on the object if its a SeriesGroupBy or a DataFrameGroupBy) so ok with this

from pandas.core.groupby import (BinGrouper, Grouper, _GroupBy, GroupBy,
SeriesGroupBy, groupby, PanelGroupBy,
_pipe_template)
from pandas.core.groupby.groupby import (BinGrouper, Grouper, _GroupBy,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit, I like to see a newline after the parens so these are wrapped a bit

@jreback
Copy link
Contributor

jreback commented Mar 30, 2018

very tiny style issue, ping on green.

@jorisvandenbossche
Copy link
Member

@jreback did you see our discussion above?

So -1 on merge on green. There is no need to deprecate this (and we can still move to a subpackage)

@jreback
Copy link
Contributor

jreback commented Mar 30, 2018

@jorisvandenbossche sorry, I did see that. yeah we can not deprecate is fine.

@WillAyd
Copy link
Member Author

WillAyd commented Mar 31, 2018

Does anyone develop on Windows? Trying to figure out why CircleCI is failing but can't reproduce the issue locally.

An easy way to solve it would be to add BinGrouper back to the API in the __init__.py but that doesn't seem required for anything else internal, so figure it's best to figure out what is different about the Windows import mechanism here

@jorisvandenbossche
Copy link
Member

CircleCI is also linux, not windows. But why it is failing there and not in the others, I don't know.
But you can always import from pandas. .... import BinGrouper and then test against that, that should certainly work.

I would also like to advocate to keep the internal imports the same as before (so import into __init__.py everything that is "public" for the rest of pandas)

@WillAyd
Copy link
Member Author

WillAyd commented Apr 1, 2018

@jreback what do you think about Joris' comment? Should I put everything back into the __init__.py and undo the changes to internal references or simply update this last failing test case?

@jreback
Copy link
Contributor

jreback commented Apr 1, 2018

i would move non public this out of the init

@jreback jreback merged commit 5edc5c4 into pandas-dev:master Apr 2, 2018
@jreback
Copy link
Contributor

jreback commented Apr 2, 2018

thanks @WillAyd

@WillAyd WillAyd deleted the grpby-sub-mod branch May 14, 2018 21:11
casperdcl pushed a commit to tqdm/tqdm that referenced this pull request May 21, 2018
`pandas.core.groupby` has been moved to `pandas.core.groupby.groupby` for pandas 0.23.0.  See pandas pull request [#20506](pandas-dev/pandas#20506).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Groupby Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants