rolling_max on a TimeSeries with freq='D' returns incorrect results #6297

Closed
@sleibman

rolling_max on a TimeSeries with freq='D' appears to average the data points that fall on the same day instead of taking their maximum, so the result is not the rolling max.

In [117]: import datetime

In [118]: import pandas

In [119]: indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]

In [120]: indices.append(datetime.datetime(1975, 1, 3, 6, 0))  # So that we can have 2 datapoints on one of the days

In [121]: series = pandas.Series(range(1, 7), index=indices)

In [122]: series = series.map(lambda x: float(x))  # Use floats instead of ints as values

In [123]: series = series.sort_index()  # Sort chronologically

In [124]: expected_result = pandas.Series([1.0, 2.0, 6.0, 4.0, 5.0], index=[datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)])

In [125]: actual_result = pandas.rolling_max(series, window=1, freq='D')

In [126]: assert((actual_result==expected_result).all())
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-126-cc436c4798a7> in <module>()
----> 1 assert((actual_result==expected_result).all())

AssertionError: 

In [127]: expected_result
Out[127]: 
1975-01-01 12:00:00    1
1975-01-02 12:00:00    2
1975-01-03 12:00:00    6
1975-01-04 12:00:00    4
1975-01-05 12:00:00    5
dtype: float64

In [128]: actual_result
Out[128]: 
1975-01-01    1.0
1975-01-02    2.0
1975-01-03    4.5
1975-01-04    4.0
1975-01-05    5.0
Freq: D, dtype: float64
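
For comparison, here is a minimal sketch of the same check against the newer pandas API. It assumes a recent pandas release, where `pandas.rolling_max` no longer exists and `Series.resample` plus `Series.rolling` are used instead; it is not the code path the report exercised. It shows that 4.5 is the mean of the two values recorded on 1975-01-03, while the expected 6.0 is their maximum.

```python
import datetime

import pandas as pd

# Rebuild the series from the report: five noon readings plus one extra
# reading at 06:00 on Jan 3, so that one day holds two data points.
indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]
indices.append(datetime.datetime(1975, 1, 3, 6, 0))
series = pd.Series(range(1, 7), index=indices, dtype=float).sort_index()

# Downsample to one value per calendar day in two different ways.
daily_mean = series.resample("D").mean()
daily_max = series.resample("D").max()

# A window of one day leaves the daily values unchanged.
window1_of_mean = daily_mean.rolling(window=1).max()
window1_of_max = daily_max.rolling(window=1).max()

print(window1_of_mean.tolist())
print(window1_of_max.tolist())
```

Only the second variant, which takes the per-day maximum before rolling, yields 6.0 for 1975-01-03.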

With a window of size 2 days, the value for 1975-01-03 is still 4.5 rather than 6.0:

In [130]: pandas.rolling_max(series, window=2, freq='D')
Out[130]: 
1975-01-01    NaN
1975-01-02    2.0
1975-01-03    4.5
1975-01-04    4.5
1975-01-05    5.0
Freq: D, dtype: float64
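
The window=2 output can be checked by hand. The sketch below uses plain Python with no pandas; the helper names `rolling` and `mean` are made up for illustration. It compares three candidate computations. Under these assumptions, the reported output matches a rolling max taken over per-day means, which suggests the max itself is applied but the same-day data points are averaged first.

```python
import datetime
from collections import defaultdict

# Same data as in the report: (timestamp, value) pairs, sorted chronologically.
points = sorted(
    [(datetime.datetime(1975, 1, d, 12, 0), float(d)) for d in range(1, 6)]
    + [(datetime.datetime(1975, 1, 3, 6, 0), 6.0)]
)

# Group the values by calendar day.
by_day = defaultdict(list)
for stamp, value in points:
    by_day[stamp.date()].append(value)

daily_mean = [sum(v) / len(v) for _, v in sorted(by_day.items())]
daily_max = [max(v) for _, v in sorted(by_day.items())]


def rolling(values, window, func):
    """Apply func over a trailing window; None where the window is incomplete."""
    return [
        func(values[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]


def mean(values):
    return sum(values) / len(values)


rolling_mean_of_daily_mean = rolling(daily_mean, 2, mean)
rolling_max_of_daily_mean = rolling(daily_mean, 2, max)
rolling_max_of_daily_max = rolling(daily_max, 2, max)

print(rolling_mean_of_daily_mean)
print(rolling_max_of_daily_mean)
print(rolling_max_of_daily_max)
```

The second list reproduces the reported values (2.0, 4.5, 4.5, 5.0 after the leading gap), while the third is what a rolling max over the raw data points would be expected to give.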
