ENH: Add numba engine to rolling/expanding.std/var #44461
Merged
26 commits
76aa92e  ENH: Add numba engine to rolling.var  (mroeschke)
450e601  Fix typing  (mroeschke)
d9391fe  Add std, support multiple versions in numba args docstring  (mroeschke)
5b7b448  Fix tests for std  (mroeschke)
f3e7e69  Replace issue number in whatsnew  (mroeschke)
91bd851  Add benchmarks  (mroeschke)
443d21e  Ensure args are keyword only  (mroeschke)
205576e  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
d718280  Split calls  (mroeschke)
ec2664a  Fix ordering of parameters  (mroeschke)
c20e6ae  fix doc ordering in expanding  (mroeschke)
0f4ed4f  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
0b42f61  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
3b0138c  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
afe6ae4  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
568cc06  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
eef2290  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
5b659d7  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
99d44cf  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
699cc8e  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
0e24636  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
63751f9  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
0ba4077  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
a128a78  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
6202d65  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
c8ca055  Merge remote-tracking branch 'upstream/master' into enh/numba_var  (mroeschke)
@@ -1,4 +1,5 @@
 from pandas.core._numba.kernels.mean_ import sliding_mean
 from pandas.core._numba.kernels.sum_ import sliding_sum
+from pandas.core._numba.kernels.var_ import sliding_var

-__all__ = ["sliding_mean", "sliding_sum"]
+__all__ = ["sliding_mean", "sliding_sum", "sliding_var"]
@@ -0,0 +1,116 @@
+"""
+Numba 1D var kernels that can be shared by
+* Dataframe / Series
+* groupby
+* rolling / expanding
+
+Mirrors pandas/_libs/window/aggregation.pyx
+"""
+from __future__ import annotations
+
+import numba
+import numpy as np
+
+from pandas.core._numba.kernels.shared import is_monotonic_increasing
+
+
+@numba.jit(nopython=True, nogil=True, parallel=False)
+def add_var(
+    val: float, nobs: int, mean_x: float, ssqdm_x: float, compensation: float
+) -> tuple[int, float, float, float]:
+    if not np.isnan(val):
+        nobs += 1
+        prev_mean = mean_x - compensation
+        y = val - compensation
+        t = y - mean_x
+        compensation = t + mean_x - y
+        delta = t
+        if nobs:
+            mean_x += delta / nobs
+        else:
+            mean_x = 0
+        ssqdm_x += (val - prev_mean) * (val - mean_x)
+    return nobs, mean_x, ssqdm_x, compensation
+
+
+@numba.jit(nopython=True, nogil=True, parallel=False)
+def remove_var(
+    val: float, nobs: int, mean_x: float, ssqdm_x: float, compensation: float
+) -> tuple[int, float, float, float]:
+    if not np.isnan(val):
+        nobs -= 1
+        if nobs:
+            prev_mean = mean_x - compensation
+            y = val - compensation
+            t = y - mean_x
+            compensation = t + mean_x - y
+            delta = t
+            mean_x -= delta / nobs
+            ssqdm_x -= (val - prev_mean) * (val - mean_x)
+        else:
+            mean_x = 0
+            ssqdm_x = 0
+    return nobs, mean_x, ssqdm_x, compensation
+
+
+@numba.jit(nopython=True, nogil=True, parallel=False)
+def sliding_var(
+    values: np.ndarray,
+    start: np.ndarray,
+    end: np.ndarray,
+    min_periods: int,
+    ddof: int = 1,
+) -> np.ndarray:
+    N = len(start)
+    nobs = 0
+    mean_x = 0.0
+    ssqdm_x = 0.0
+    compensation_add = 0.0
+    compensation_remove = 0.0
+
+    min_periods = max(min_periods, 1)
+    is_monotonic_increasing_bounds = is_monotonic_increasing(
+        start
+    ) and is_monotonic_increasing(end)
+
+    output = np.empty(N, dtype=np.float64)
+
+    for i in range(N):
+        s = start[i]
+        e = end[i]
+        if i == 0 or not is_monotonic_increasing_bounds:
+            for j in range(s, e):
+                val = values[j]
+                nobs, mean_x, ssqdm_x, compensation_add = add_var(
+                    val, nobs, mean_x, ssqdm_x, compensation_add
+                )
+        else:
+            for j in range(start[i - 1], s):
+                val = values[j]
+                nobs, mean_x, ssqdm_x, compensation_remove = remove_var(
+                    val, nobs, mean_x, ssqdm_x, compensation_remove
+                )
+
+            for j in range(end[i - 1], e):
+                val = values[j]
+                nobs, mean_x, ssqdm_x, compensation_add = add_var(
+                    val, nobs, mean_x, ssqdm_x, compensation_add
+                )
+
+        if nobs >= min_periods and nobs > ddof:
+            if nobs == 1:
+                result = 0.0
+            else:
+                result = ssqdm_x / (nobs - ddof)
+        else:
+            result = np.nan
+
+        output[i] = result
+
+        if not is_monotonic_increasing_bounds:
+            nobs = 0
+            mean_x = 0.0
+            ssqdm_x = 0.0
+            compensation_remove = 0.0
+
+    return output
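For readers following the kernel above, here is a minimal pure-Python sketch of the same add/remove idea: keep a running count, mean, and sum of squared deviations from the mean, update them as values enter and leave a fixed-size window, and divide by `nobs - ddof` at the end. It is an illustration only, not the pandas implementation: the names `add_obs`, `remove_obs`, and `rolling_var` are hypothetical, the compensation terms from the diff are left out for brevity, and it assumes a fixed integer window rather than the `start`/`end` bounds arrays.

```python
import math


def add_obs(val, nobs, mean, ssqdm):
    """Update the running state when `val` enters the window (NaN is skipped)."""
    if not math.isnan(val):
        nobs += 1
        delta = val - mean
        mean += delta / nobs
        ssqdm += delta * (val - mean)
    return nobs, mean, ssqdm


def remove_obs(val, nobs, mean, ssqdm):
    """Update the running state when `val` leaves the window (NaN is skipped)."""
    if not math.isnan(val):
        nobs -= 1
        if nobs:
            delta = val - mean
            mean -= delta / nobs
            ssqdm -= delta * (val - mean)
        else:
            # The window is empty again, so reset the accumulators.
            mean, ssqdm = 0.0, 0.0
    return nobs, mean, ssqdm


def rolling_var(values, window, min_periods=None, ddof=1):
    """Fixed-window rolling variance (hypothetical helper for illustration)."""
    if min_periods is None:
        min_periods = window
    min_periods = max(min_periods, 1)
    nobs, mean, ssqdm = 0, 0.0, 0.0
    out = []
    for i, val in enumerate(values):
        if i >= window:
            # Drop the value that just fell out of the window.
            nobs, mean, ssqdm = remove_obs(values[i - window], nobs, mean, ssqdm)
        nobs, mean, ssqdm = add_obs(val, nobs, mean, ssqdm)
        if nobs >= min_periods and nobs > ddof:
            out.append(ssqdm / (nobs - ddof))
        else:
            out.append(math.nan)
    return out


result = rolling_var([1.0, 2.0, 4.0, 7.0, 11.0], window=3)
```

Each step costs O(1) per value entering or leaving the window, which is why the kernel only recomputes from scratch when the window bounds are not monotonically increasing.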
At some point we should group these docstring changes together (a separate section would be OK too, but it's not required).
Yeah once I add median I can group these better