
API: Table-wise rolling / expanding / EWM function application #15095

Closed
@TomAugspurger

In #11603 (comment) (the main PR implementing the deferred API for rolling / expanding / ewm), we discussed how to specify table-wise applies. Groupby.apply(f) feeds the entire group (all columns) to f. For backwards compatibility, .rolling(n).apply(f) needed to stay column-wise.

#11603 (comment) mentions a possible API like the one I added for .style:

  • axis=0: apply to each column independently
  • axis=1: apply to each row independently
  • axis=None: apply the supplied function to the entire table

So it'd be df.rolling(n).apply(f, axis=None).
Do people like the axis=0 / 1 / None idiom? Is it obvious enough?
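For comparison, NumPy reductions already use the same three-way axis convention, which may help in judging how obvious the idiom is. The sketch below applies it to a single 4-row window; the array is just the first four rows of the example data, and nothing here is part of the proposed rolling API.

```python
import numpy as np

# One 4-row window of a two-column table (columns A and B).
window = np.array([[5, 0],
                   [3, 3],
                   [7, 9],
                   [3, 5]])

per_column = window.max(axis=0)   # one value per column
per_row = window.max(axis=1)      # one value per row
whole = window.max(axis=None)     # one value for the entire window
```

Under the proposal, axis=None on .rolling(n).apply would correspond to the last case: one result per window, computed from all columns at once.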

This is prompted by @josef-pkt's post on the mailing list about needing a rolling OLS.

An example:

In [2]: import numpy as np
   ...: import pandas as pd
   ...:
   ...: np.random.seed(0)
   ...: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=["A", "B"])
   ...: df
   ...:
Out[2]:
   A  B
0  5  0
1  3  3
2  7  9
3  3  5
4  2  4
5  7  6
6  8  8
7  1  6
8  7  7
9  8  1

For a concrete example, get the table-wise max (this is equivalent to df.rolling(4).max().max(1)):

In [10]: df.rolling(4).apply(np.max, axis=None)
Out[10]:
0    NaN
1    NaN
2    NaN
3    9.0
4    9.0
5    9.0
6    8.0
7    8.0
8    8.0
9    8.0
dtype: float64
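Since apply(..., axis=None) is only a proposal here, the same result can be approximated with an explicit loop over windows. This is a minimal sketch, not an implementation of the proposed API: rolling_table_apply is a hypothetical helper name, and it assumes f takes a 2-D array and returns a scalar.

```python
import numpy as np
import pandas as pd


def rolling_table_apply(df, window, func):
    """Hypothetical helper: call func on each full window of rows, all columns at once."""
    out = pd.Series(np.nan, index=df.index, dtype="float64")
    for end in range(window, len(df) + 1):
        block = df.iloc[end - window:end]          # `window` rows, every column
        out.iloc[end - 1] = func(block.to_numpy())  # result is labelled by the window's last row
    return out


np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=["A", "B"])

result = rolling_table_apply(df, 4, np.max)
expected = df.rolling(4).max().max(axis=1)  # the column-wise equivalent mentioned above
```

The loop is slow compared with the built-in rolling reductions, but it makes the intended semantics explicit: the first window - 1 rows are NaN, and every later row holds f applied to the whole window.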

A real example is something like a rolling OLS:

import statsmodels.api as sm
f = lambda x: sm.OLS.from_formula('A ~ B', data=x).fit()  # wrong, but w/e

df.rolling(5).apply(f, axis=None)
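Until something like axis=None exists, a rolling regression can be sketched with a plain loop. The version below swaps statsmodels for np.linalg.lstsq to stay dependency-light and returns only the slope of A ~ 1 + B for each window, on the assumption that a table-wise apply would need one scalar per window. rolling_ols_slope and its parameters are hypothetical names used for illustration.

```python
import numpy as np
import pandas as pd


def rolling_ols_slope(df, window, y="A", x="B"):
    """Hypothetical helper: slope of y ~ 1 + x, fitted by least squares on each full window."""
    slopes = pd.Series(np.nan, index=df.index, dtype="float64")
    for end in range(window, len(df) + 1):
        block = df.iloc[end - window:end]
        # Design matrix with an intercept column and the regressor.
        design = np.column_stack([np.ones(window), block[x].to_numpy(dtype=float)])
        coef, *_ = np.linalg.lstsq(design, block[y].to_numpy(dtype=float), rcond=None)
        slopes.iloc[end - 1] = coef[1]  # coef[0] is the intercept, coef[1] the slope
    return slopes


np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=["A", "B"])
slopes = rolling_ols_slope(df, 5)
```

Returning a single coefficient sidesteps the question of what apply should do when f returns a full results object; a table-wise API would still need to decide how to handle multi-valued outputs.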
