-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
PERF: cache _get_cython_function in groupby ops #40178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: cache _get_cython_function in groupby ops #40178
Conversation
|
||
|
||
@functools.lru_cache(maxsize=None) | ||
def _get_cython_function(kind: str, how: str, dtype: np.dtype, is_numeric: bool): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is a straight cut&paste of the existing method. But moving it to a standalone function instead of method to be able to use lru_cache on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
LGTM |
# only valid for numeric | ||
f = getattr(libgroupby, ftype, None) | ||
if f is not None and is_numeric: | ||
return f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any idea how many cases make it past here? might be we can make some of this unnecessary on the cython side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite some I think. Eg for sum, we have group_add_..
which only has type-specific functions exposed (the same for mean, prod, var). Not directly sure why we expose some functions with numeric
fused type, and others we expose the type specific ones (because those are also coded with fused types). That's a bigger change for another PR, though ;)
(now, an expensive part of this function is dtype.name
, which is above this line anyway, so won't change that it's useful to cache this I think)
Looking up the exact cython function is relatively costly, and is also something that is done repeatedly for a limited number of possible cases, so a good target for caching.
Using the
GroupManyLabels
benchmark we have:pandas/asv_bench/benchmarks/groupby.py
Lines 133 to 145 in b835ca2
I get
(the difference is not huge, but I am going to do several PRs with small changes that give a few ms improvement. And with all those together I hope to get the total time down to around 20ms, and in that case 5ms is significant)