BUG: Resampling PeriodIndex-ed to multiple of frequencies not working as expected #15944

Closed
@winklerand

Code Sample, a copy-pastable example if possible

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: s = pd.Series([2017, 2018], index=pd.period_range('2017', freq='A', periods=2))


In [4]: s.resample('2Q', kind='period').ffill()
Warning: multiple of frequency -> timestamps
Out[4]: 
2017-03-31    2017
2017-09-30    2017
2018-03-31    2018
Freq: 2Q-DEC, dtype: int64

In [7]: s2 = pd.Series(np.arange(12), index=pd.period_range('2017-01', freq='M', periods=12))


In [8]: s2.resample('2Q', kind='period').mean()
Warning: multiple of frequency -> timestamps
Out[8]: 
2017-03-31     1.0
2017-09-30     5.5
2018-03-31    10.0
Freq: 2Q-DEC, dtype: float64

For comparison, resampling to the base frequency (no multiple) returns a PeriodIndex-ed result that covers the full original time span:

In [5]: s.resample('Q', kind='period').ffill()
Out[5]: 
2017Q1    2017
2017Q2    2017
2017Q3    2017
2017Q4    2017
2018Q1    2018
2018Q2    2018
2018Q3    2018
2018Q4    2018
Freq: Q-DEC, dtype: int64

In [9]: s2.resample('Q', kind='period').mean()
Out[9]: 
2017Q1     1
2017Q2     4
2017Q3     7
2017Q4    10
Freq: Q-DEC, dtype: int64

Problem description

  • I'd expect resampling a PeriodIndex-ed series/dataframe to return a PeriodIndex-ed result by default, and all the more so when given kind='period'.
  • Moreover, I'd expect the resampling result to cover the full original time span (upsampling A->2Q should return 2 periods per year; downsampling M->2Q should return 2 periods per 12 months).

As the warning message indicates, resampling to a multiple of a frequency falls back to timestamp-based resampling. Both expectations above are met when resampling to a "base" frequency without a multiple.
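Until period-based resampling handles multiples, the expected M->2Q downsampling can be approximated by hand. The following is only a sketch of one possible workaround, not how pandas implements resampling: it bins months into pairs of quarters with `groupby`. The names `quarters`, `bin_starts` and `labels` are illustrative, and labelling each bin by its first quarter (with a plain `Q` frequency rather than `2Q`) is an assumption made here.

```python
import numpy as np
import pandas as pd

# Monthly series from the report: values 0..11 over the twelve months of 2017.
s2 = pd.Series(np.arange(12), index=pd.period_range('2017-01', freq='M', periods=12))

# Map every month to the quarter that contains it.
quarters = s2.index.asfreq('Q')

# Count the quarters elapsed since the first one and floor that count to a
# multiple of 2, so each pair of consecutive quarters shares one bin start.
first = quarters[0]
bin_starts = [first + 2 * ((q.ordinal - first.ordinal) // 2) for q in quarters]

# Label each bin by its first quarter (an assumed labelling convention).
labels = pd.PeriodIndex(bin_starts, freq='Q')

# Aggregate within each two-quarter bin; the result keeps a PeriodIndex.
result = s2.groupby(labels).mean()
print(result)
```

The two bins are labelled 2017Q1 and 2017Q3 and hold the means of the values 0 to 5 and 6 to 11, i.e. 2.5 and 8.5, which agrees with the expected output for `s2` apart from the frequency of the index.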

Expected Output

In [4]: s.resample('2Q', kind='period').ffill()
Out[4]:
2017Q1    2017
2017Q3    2017
2018Q1    2018
2018Q3    2018
Freq: 2Q-DEC, dtype: int64

In [8]: s2.resample('2Q', kind='period').mean()
Out[8]:
2017Q1    2.5
2017Q3    8.5
Freq: 2Q-DEC, dtype: float64

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: None
numpy: 1.12.1
scipy: 0.19.0
statsmodels: 0.8.0
xarray: None
IPython: 5.3.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.2
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.3
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: None

Labels: Duplicate Report, Period, Resample