-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
CLN: DataFrameGroupBy._cython_agg_general #30384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: DataFrameGroupBy._cython_agg_general #30384
Conversation
pandas/core/groupby/generic.py
Outdated
@@ -63,7 +64,7 @@ | |||
) | |||
from pandas.core.indexes.api import Index, MultiIndex, all_indexes_same | |||
import pandas.core.indexes.base as ibase | |||
from pandas.core.internals import BlockManager, make_block | |||
from pandas.core.internals import Block, BlockManager, make_block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if Block is only used for annotations, can you put it in a if TYPE_CHECKING
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
pandas/core/groupby/generic.py
Outdated
@@ -1751,7 +1752,7 @@ def count(self): | |||
] | |||
blk = map(make_block, counted, loc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while you're at it, can you rename blk
to blks
or something else clearly plural
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
pandas/core/groupby/generic.py
Outdated
@@ -1750,9 +1755,9 @@ def count(self): | |||
counted = [ | |||
lib.count_level_2d(x, labels=ids, max_bin=ngroups, axis=1) for x in val | |||
] | |||
blk = map(make_block, counted, loc) | |||
blocks = map(make_block, counted, loc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WillAyd is getting Block/BlockManager stuff out of here still something you're working on?
LGTM |
cc @WillAyd for a second pair of eyes |
pandas/core/groupby/generic.py
Outdated
@@ -1691,17 +1697,17 @@ def _wrap_transformed_output( | |||
|
|||
return result | |||
|
|||
def _wrap_agged_blocks(self, items, blocks): | |||
def _agg_blocks_to_frame(self, items: Index, blocks: "List[Block]") -> DataFrame: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would actually prefer to keep this named _wrap_agged_blocks
since it offers the same functionality as the rest of the _wrap_*
functions in groupby, albeit with different input types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point, I wasn't aware of this systematic naming, I'll change it back. With the added typing we can now see better that a frame is returned
pandas/core/groupby/generic.py
Outdated
@@ -1691,17 +1697,17 @@ def _wrap_transformed_output( | |||
|
|||
return result | |||
|
|||
def _wrap_agged_blocks(self, items, blocks): | |||
def _wrap_agged_blocks(self, blocks: "List[Block]", items: Index) -> DataFrame: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've inverted the argument order. I think it makes more sense to have the blocks come first, like in BlockManager
.
] | ||
blk = map(make_block, counted, loc) | ||
blocks = [make_block(val, placement=loc) for val, loc in zip(counted, locs)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor cleanups above: pluralizing names + use list comprehension instead of map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i tend to prefer these too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this just be a generator expression or does that fail? I see this is equivalent to existing code but maybe adds unnecessary overhead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would work, but wouldn't make any difference because it's passed into the BlockManager where it's stored.
lgtm @jreback if you care to look |
a20386b
to
e9e6b56
Compare
thanks! |
Some small cleanups that make the code easier to read IMO.