Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
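import numpy as np
import pandas as pd

class SimpleIndexer(pd.api.indexers.BaseIndexer):
    ''' Custom `Indexer` duplicating basic fixed length windowing '''
    def get_window_bounds(self, num_values=0, min_periods=None, center=None, closed=None):
        # Lower bound used for windows that would start before the first row
        # (0 whenever `min_periods` is passed, as in the call below).
        min_periods = self.window_size if min_periods is None else 0
        # Window i covers rows [start[i], end[i]); `end` is exclusive.
        end = np.arange(num_values, dtype=np.int64) + 1
        start = end.copy() - self.window_size
        #---- Clip to `min_periods`
        start[start < 0] = min_periods
        return (start, end)
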
x = pd.DataFrame({'a': [1.0,2.0,3.0,4.0,5.0] * 3}, index=[0]*5+[1]*5+[2]*5)
x
Out:
     a
0  1.0
0  2.0
0  3.0
0  4.0
0  5.0
1  1.0
1  2.0
1  3.0
1  4.0
1  5.0
2  1.0
2  2.0
2  3.0
2  4.0
2  5.0
x.groupby(x.index).rolling(SimpleIndexer(window_size=3), min_periods=1).sum()
Out:
       a
0 0  1.0
  0  2.0
  0  3.0
  0  4.0
  0  5.0
1 1  1.0
  1  2.0
  1  3.0
  1  4.0
  1  5.0
2 2  1.0
  2  2.0
  2  3.0
  2  4.0
  2  5.0
Problem description
groupby().rolling() does not use the custom indexer supplied in window=, likely as a result of #34052, where a GroupbyRollingIndexer is used instead.
The output is always identical to the input, regardless of the window bounds the custom indexer returns.
pandas 1.0.5 does not have this behavior.
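To rule out the indexer itself as the cause, its window bounds can be inspected directly, without going through rolling at all. The sketch below is a simplified variant of the class from the report: it clips negative starts to 0 (which is what the original does whenever min_periods is passed), and it accepts an extra `step` keyword, which is an assumption added so the same class also works on newer pandas releases that pass it.

```python
import numpy as np
import pandas as pd


class SimpleIndexer(pd.api.indexers.BaseIndexer):
    """Fixed-length trailing window, simplified from the report above."""

    def get_window_bounds(self, num_values=0, min_periods=None,
                          center=None, closed=None, step=None):
        # `step` is an assumption for newer pandas releases; it is ignored here.
        end = np.arange(num_values, dtype=np.int64) + 1
        # Windows that would start before the first row are clipped to row 0.
        start = np.clip(end - self.window_size, 0, None)
        return start, end


# Bounds for one group of five rows with a window of three.
start, end = SimpleIndexer(window_size=3).get_window_bounds(num_values=5, min_periods=1)
print(start)  # [0 0 0 1 2]
print(end)    # [1 2 3 4 5]
```

Each window spans up to three rows, so a sum over these bounds should not reproduce the input column.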
Expected Output
x.groupby(x.index).rolling(window=3, min_periods=1).sum()
Out:
        a
0 0   1.0
  0   3.0
  0   6.0
  0   9.0
  0  12.0
1 1   1.0
  1   3.0
  1   6.0
  1   9.0
  1  12.0
2 2   1.0
  2   3.0
  2   6.0
  2   9.0
  2  12.0
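As a cross-check on the expected values, the per-group trailing sums can be computed by hand and compared with the built-in integer window. This is only a sketch: `trailing_sums` is a hypothetical helper written for this comparison, not part of pandas.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0, 5.0] * 3},
                 index=[0] * 5 + [1] * 5 + [2] * 5)


def trailing_sums(values, window_size):
    """Sum each trailing window of at most `window_size` values by hand."""
    out = []
    for i in range(len(values)):
        lo = max(0, i + 1 - window_size)  # first row inside the window
        out.append(values[lo:i + 1].sum())
    return np.array(out)


# Reference result, computed per group without any rolling machinery.
reference = np.concatenate(
    [trailing_sums(g["a"].to_numpy(), 3) for _, g in x.groupby(x.index)]
)

# Built-in integer window, which the report uses as the expected output.
builtin = x.groupby(x.index).rolling(window=3, min_periods=1).sum()["a"].to_numpy()

print(reference)
print(np.allclose(reference, builtin))
```

The same `reference` array could be compared against the custom-indexer result to confirm whether a given pandas version is affected.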
Output of pd.show_versions()
pandas : 1.1.0
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2
setuptools : 49.2.1.post20200802
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.14.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.17.0
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : 0.50.1