-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
REF: Fuse all the types #23022
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: Fuse all the types #23022
Conversation
Hello @jbrockmendel! Thanks for submitting the PR.
|
pandas/core/internals/blocks.py
Outdated
@@ -1153,7 +1153,7 @@ def check_int_bool(self, inplace): | |||
inplace=inplace, limit=limit, | |||
fill_value=fill_value, | |||
coerce=coerce, | |||
downcast=downcast, mgr=mgr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edits here are unrelated, should be removed from this PR.
int64_t lab | ||
|
||
N, K = (<object> values).shape | ||
accum = np.empty_like(values) | ||
accum.fill({{inf_val}}) | ||
if groupby_t is int64_t: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u make this more generic ? she what if we expand this to other int types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably. The MO with these PRs is to keep the logic unchanged.
I think there is also a cost in compile-time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a small cost of compile time (actually maybe nothing as cython is pretty smart). but i suppose can handle later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. Easy to implement if/when its actually needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see some comments above
Codecov Report
@@ Coverage Diff @@
## master #23022 +/- ##
=======================================
Coverage 92.19% 92.19%
=======================================
Files 169 169
Lines 50959 50959
=======================================
Hits 46980 46980
Misses 3979 3979
Continue to review full report at Codecov.
|
Just pushed, moving some more functions over. I think I should stop here for this PR before the diff gets out of hand. Made a bug report to cython about a mysterious failure. |
Thoughts here? |
ndarray[int64_t] indexer, Py_ssize_t loc, | ||
ndarray[{{dest_type2}}] out): | ||
def put2d_{{name}}_{{dest_type[:-2]}}(ndarray[{{c_type}}, ndim=2, cast=True] values, | ||
ndarray[int64_t] indexer, Py_ssize_t loc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a it obtuse can u make it more explicit (the slice)
@@ -638,7 +623,12 @@ def group_max_{{name}}(ndarray[{{dest_type2}}, ndim=2] out, | |||
nobs = np.zeros_like(out) | |||
|
|||
maxx = np.empty_like(out) | |||
maxx.fill(-{{inf_val}}) | |||
if groupby_t is int64_t: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u comment / add a Todo here
maxx[lab, j] = val | ||
else: | ||
if val == val and val != nan_val: | ||
nobs[lab, j] += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should have a function that does the null comparisons though with the template code it is slightly
tricky, maybe have a suite of isna_int, isna_float functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should have a function that does the null comparisons though with the template code
I've been thinking about something similar, will give it a go in the next pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
else: | ||
out[i, j] = maxx[i, j] | ||
|
||
|
||
group_max_float64 = group_max["float64_t"] | ||
group_max_float32 = group_max["float32_t"] | ||
group_max_int64 = group_max["int64_t"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we DO need to expand these to all int/unit dtypes
FYI
int64_t lab | ||
|
||
N, K = (<object> values).shape | ||
accum = np.empty_like(values) | ||
accum.fill({{inf_val}}) | ||
if groupby_t is int64_t: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see some comments above
pandas/_libs/sparse_op_helper.pxi.in
Outdated
'mod': '__mod_{2}({0}, {1})', | ||
'truediv': '__truediv_{2}({0}, {1})', | ||
'floordiv': '__floordiv_{2}({0}, {1})', | ||
'div': '__div({0}, {1})', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these names are odd
maybe just call them div and so on
@datapythonista I don’t understand what is causing the travis failure. Do you recognize it? |
I'mc checking from my cell phone which makes it tricky to check, but if I see it correctly, this is the error: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small question, other lgtm.
thanks! |
Everything is passing locally, just want to run this through the CI for good measure before continuing down this path.