Skip to content

CLN: DataFrameGroupBy._cython_agg_general #30384

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

topper-123
Copy link
Contributor

Some small cleanups that make the code easier to read IMO.

@@ -63,7 +64,7 @@
)
from pandas.core.indexes.api import Index, MultiIndex, all_indexes_same
import pandas.core.indexes.base as ibase
from pandas.core.internals import BlockManager, make_block
from pandas.core.internals import Block, BlockManager, make_block
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if Block is only used for annotations, can you put it in a if TYPE_CHECKING

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@@ -1751,7 +1752,7 @@ def count(self):
]
blk = map(make_block, counted, loc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you're at it, can you rename blk to blks or something else clearly plural

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@@ -1750,9 +1755,9 @@ def count(self):
counted = [
lib.count_level_2d(x, labels=ids, max_bin=ngroups, axis=1) for x in val
]
blk = map(make_block, counted, loc)
blocks = map(make_block, counted, loc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd is getting Block/BlockManager stuff out of here still something you're working on?

@jbrockmendel
Copy link
Member

LGTM

@gfyoung gfyoung added Clean Typing type annotations, mypy/pyright type checking labels Dec 22, 2019
@jbrockmendel
Copy link
Member

cc @WillAyd for a second pair of eyes

@@ -1691,17 +1697,17 @@ def _wrap_transformed_output(

return result

def _wrap_agged_blocks(self, items, blocks):
def _agg_blocks_to_frame(self, items: Index, blocks: "List[Block]") -> DataFrame:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually prefer to keep this named _wrap_agged_blocks since it offers the same functionality as the rest of the _wrap_* functions in groupby, albeit with different input types

Copy link
Contributor Author

@topper-123 topper-123 Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point, I wasn't aware of this systematic naming, I'll change it back. With the added typing we can now see better that a frame is returned

@WillAyd WillAyd added this to the 1.0 milestone Dec 23, 2019
@@ -1691,17 +1697,17 @@ def _wrap_transformed_output(

return result

def _wrap_agged_blocks(self, items, blocks):
def _wrap_agged_blocks(self, blocks: "List[Block]", items: Index) -> DataFrame:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've inverted the argument order. I think it makes more sense to have the blocks come first, like in BlockManager.

]
blk = map(make_block, counted, loc)
blocks = [make_block(val, placement=loc) for val, loc in zip(counted, locs)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor cleanups above: pluralizing names + use list comprehension instead of map.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i tend to prefer these too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be a generator expression or does that fail? I see this is equivalent to existing code but maybe adds unnecessary overhead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would work, but wouldn't make any difference because it's passed into the BlockManager where it's stored.

@WillAyd
Copy link
Member

WillAyd commented Dec 23, 2019

lgtm @jreback if you care to look

@topper-123 topper-123 force-pushed the cleaupn_DataFrameGroupBy._cython_agg_general branch from a20386b to e9e6b56 Compare December 23, 2019 22:44
@jreback jreback merged commit 7a36c79 into pandas-dev:master Dec 24, 2019
@jreback
Copy link
Contributor

jreback commented Dec 24, 2019

thanks!

AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Dec 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants