ENH: .pipe on Resampler #17905

Closed
@topper-123

Description

I've made a PR (#17871) to add pipe functionality to GroupBy objects.

Resampler should IMO have the same .pipe behaviour as GroupBy objects. Then you could do the below in a single pass:

df.resample('3M').pipe(lambda resampled: resampled.open.first() - resampled.close.last())

That is, for each resampled period, you could reuse a single Resampler object multiple times inside a pipe. The alternatives are to spread the computation over several lines, which is less readable, or to use apply, which is slower.
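As a rough illustration of the alternatives mentioned above, the sketch below computes "first open minus last close" per period both the multi-line way and by handing the Resampler to a function, which is what a proper pipe would amount to. The data, the column names `open` and `close`, the `'2D'` frequency and the helper `open_minus_close` are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical price data; column names and values are assumptions.
idx = pd.date_range("2017-01-01", periods=4)
df = pd.DataFrame(
    {"open": [10.0, 11.0, 12.0, 13.0], "close": [10.5, 11.5, 12.5, 13.5]},
    index=idx,
)


def open_minus_close(resampled):
    """First open minus last close for each resampled period."""
    return resampled["open"].first() - resampled["close"].last()


# Multi-line alternative: name the Resampler, then reuse it.
resampled = df.resample("2D")
multi_line = resampled["open"].first() - resampled["close"].last()

# What a proper pipe should be equivalent to: calling the function
# directly on the Resampler, with no implicit aggregation in between.
direct = open_minus_close(df.resample("2D"))
```

Both forms reuse the same Resampler for two aggregations; the piped form would simply keep that in one chained expression.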

Currently, however, Resampler.pipe/DatetimeResampler.pipe implicitly converts the Resampler to a DataFrame of means before piping, which seems wrong and unintuitive to me:

>>> df.resample('3M').pipe(lambda x: x.max() - x.min())
C:\Users\TP\Anaconda3\envs\pandasdev\Scripts\ipython-script.py:1: FutureWarning:
.resample() is now a deferred operation
You called pipe(...) on this deferred object which materialized it into a dataframe
by implicitly taking the mean.  Use .resample(...).mean() instead

To demonstrate:

First set-up:

>>> d = pd.date_range('2017-01-01', periods=4)
>>> df = pd.DataFrame(dict(B=[1, 2, 3, 4]), index=d)
>>> r = df.resample('2D')

If we call pipe, we get an unexpected result:

>>> r.pipe(lambda x: x.max() - x.min())
B    2.0
dtype: float64

The reason for the unexpected result is that, under the hood, the above is equivalent to:

>>> r.mean().pipe(lambda x: x.max() - x.min())
B    2.0
dtype: float64

The expected behaviour would be:

>>> def diff(r):
...     return r.max() - r.mean()
>>> diff(r)   # r.pipe(diff) should give the same result as this
              B
2017-01-01  0.5
2017-01-03  0.5
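The difference between the two behaviours can be sketched without calling Resampler.pipe at all, so the snippet does not depend on which pandas version is installed. It reuses the set-up above; the variable names `materialized`, `direct` and `expected` are chosen here for illustration.

```python
import pandas as pd

d = pd.date_range("2017-01-01", periods=4)
df = pd.DataFrame(dict(B=[1, 2, 3, 4]), index=d)
r = df.resample("2D")


def diff(r):
    return r.max() - r.mean()


# Behaviour described in the warning: the Resampler is first reduced to
# a DataFrame of means, and only that DataFrame is piped.
materialized = r.mean().pipe(lambda x: x.max() - x.min())

# Proposed behaviour: the function receives the Resampler itself, so
# max and min are taken within each 2-day period.
direct = (lambda x: x.max() - x.min())(r)

# The diff example from above, applied directly to the Resampler.
expected = diff(r)
```

Here `materialized` is a Series reduced over the two period means, while `direct` and `expected` are DataFrames with one row per resampled period.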

IMO, if users want the mean before piping, they should call mean themselves before calling pipe.

Adding a pipe was very easy for GroupBy and has very good use cases. I propose adding a (proper) pipe to Resampler also.
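To make the proposed semantics concrete, here is a minimal pure-Python sketch, assuming pipe should do nothing more than pass the deferred object to the function. `ToyResampler` and `value_range` are hypothetical stand-ins invented for this example; this is not the pandas implementation.

```python
class ToyResampler:
    """Hypothetical stand-in for a Resampler: values bucketed by period."""

    def __init__(self, buckets):
        self.buckets = buckets  # e.g. {"2017-01-01": [1, 2], ...}

    def max(self):
        return {period: max(vals) for period, vals in self.buckets.items()}

    def min(self):
        return {period: min(vals) for period, vals in self.buckets.items()}

    def pipe(self, func, *args, **kwargs):
        # Hand over the deferred object itself; no implicit aggregation.
        return func(self, *args, **kwargs)


def value_range(r):
    """Per-period max minus min, reusing the same object twice."""
    highs, lows = r.max(), r.min()
    return {period: highs[period] - lows[period] for period in highs}


toy = ToyResampler({"2017-01-01": [1, 2], "2017-01-03": [3, 4]})
piped = toy.pipe(value_range)
```

With this definition, `toy.pipe(value_range)` and `value_range(toy)` are interchangeable, which is the property asked for above.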
