-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: Fix Grouper with a datetime key in conjunction with another key #51414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cafdd17
to
46b1f15
Compare
@jbrockmendel @rhshadrach Can you please have a look and/or advise? Thanks. |
@@ -1085,6 +1085,9 @@ def groups(self): | |||
} | |||
return result | |||
|
|||
def __iter__(self) -> Iterator[Hashable]: | |||
return iter(self.groupings[0].grouping_vector) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the parent class method wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I forgot to add description. Please, see the description of the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be better to refactor the indices here to align better with the base class rather than override iter. But I'm not certain if that's a good idea and I'm okay with this fix. cc @jbrockmendel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel - friendly ping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. i had in mind something more like @rhshadrach's suggestion, that needing this here might indicate that the self.indices (which the parent class method iterates over) might be wrong
ffeab75
to
4269cf4
Compare
4269cf4
to
53777cd
Compare
@jbrockmendel I fixed all your comments. Can you please review the PR? Thanks. |
@jbrockmendel I am wondering whether you had time to have a look the PR? If not, should I add someone else to the PR to review it? Also, am I missing something or the tests a quite flaky?
|
Please be patient, there are a lot of PRs to review. The segfault is unrelated and is affecting all PRs at the moment. You can ignore it. |
@jbrockmendel have you had time to have a look? |
@jbrockmendel Any news, please? |
6ad38ad
to
66bf4c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm; I've opened #52468 to track a possible followup.
Thanks @juleek! |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Description of the bug
The crux of the issue was in
BaseGrouper::groups
:We are going to iterate over
to_groupby
, which is azip
ofping.grouping_vector
.groupings
are populated inget_grouper
, below is an excerpt of relevant code that we executed in case of grouping by date:after this,
Grouping::grouping_vector
isBinGrouper
, and this is whatzip()
will try to iterate over.Before the fix, when we were iterating over
BinGrouper
, its base class__iter__
was used, which was usingself.indices
, which was:When you iterate over a dictionary in Python, you iterate over the keys, and more importantly we have only two of them here. Second
Grouping
hasarray
in its grouping vector consisting of 6 elements:so, the
zip()
function just truncated it.Fix
By adding
__iter__
inBinGrouper
, thezip()
function now iterate over the elements fromself.groupings[0].grouping_vector
, which is, in our case: