I don't have a lot of familiarity with window ops, but thought this was an interesting question from our Reddit AMA:
import numpy as np
import pandas as pd

# example DataFrame with 3 columns: date, id and a random value
dates = list(pd.date_range(start="2019-01-01", end="2019-12-01", freq="MS"))
length = len(dates)
n = 2
ids = sorted(list(range(n)) * length)
values = np.random.randint(low=0, high=10, size=length).tolist()
df = pd.DataFrame({"date": dates * n, "id": ids, "value": values * n})
print(df.head())
#         date  id  value
# 0 2019-01-01   0      8
# 1 2019-02-01   0      8
# 2 2019-03-01   0      7
# 3 2019-04-01   0      2
# 4 2019-05-01   0      1
print(df.rolling(3, on="date")["value"].max().head())
# 0 NaN
# 1 NaN
# 2 8.0
# 3 8.0
# 4 7.0
# Name: value, dtype: float64
print(df.groupby("id").rolling(3, on="date")["value"].max().head())
# id  date
# 0   2019-01-01    NaN
#     2019-02-01    NaN
#     2019-03-01    8.0
#     2019-04-01    8.0
#     2019-05-01    7.0
# Name: value, dtype: float64
When doing `DataFrame.rolling` with a reduction, the operation is a transform: the result has the same index as the input. Given that, should `DataFrameGroupBy.rolling` also be treated like a transform, i.e. not include the groupers or the `on` column in the result, but rather return the index of the original frame? As with other transforms, users can then use the original object if they want to recover the corresponding `on` values or grouping data.
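For illustration, here is a minimal sketch of the transform-like result described above. It is not a proposed implementation: it only emulates the shape of the result with existing calls (`groupby(...).transform` and `droplevel`). The fixed `values` list is an assumption made for reproducibility; its first five entries match the printed example and the rest are made up.

```python
import numpy as np
import pandas as pd

# Same frame layout as the example above, but with fixed (made-up) values
# instead of random ones so the result is reproducible.
dates = list(pd.date_range(start="2019-01-01", end="2019-12-01", freq="MS"))
n = 2
ids = sorted(list(range(n)) * len(dates))
values = [8, 8, 7, 2, 1, 5, 3, 9, 4, 6, 0, 2]
df = pd.DataFrame({"date": dates * n, "id": ids, "value": values * n})

# Emulation 1: apply the rolling reduction per group via transform.
# The result is aligned to df's original index, with no "id" or "date" level.
transformed = df.groupby("id")["value"].transform(lambda s: s.rolling(3).max())

# Emulation 2: use groupby-rolling, then drop the group level that is
# prepended to the index, leaving only the original row labels.
dropped = df.groupby("id")["value"].rolling(3).max().droplevel(0)

print(transformed.head())
# 0    NaN
# 1    NaN
# 2    8.0
# 3    8.0
# 4    7.0
# Name: value, dtype: float64

# Because the index matches df, the grouping and "date" data can be
# recovered from the original object by plain column assignment.
result = df.assign(rolling_max=transformed)
```

With this shape, `result` keeps `date` and `id` as ordinary columns next to the rolling maximum, which is the recovery path the question alludes to.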
cc @mroeschke