Description
Previously, resampling with a PeriodIndex supported two conventions: 'start' (start -> start) and 'end' (end -> end). This gave something like the following (note: this is not current real code output, but taken from Wes' book):
In [25]: s = pd.Series(np.arange(2), index=pd.period_range('2000-1', periods=2, freq='A'))
In [26]: s
Out[26]:
2000 0
2001 1
Freq: A-DEC, dtype: int32
In [27]: s.resample('Q-DEC', fill_method='ffill', convention='start')
Out[27]:
2000Q1 0
2000Q2 0
2000Q3 0
2000Q4 0
2001Q1 1
Freq: Q-DEC, dtype: int32
In [28]: s.resample('Q-DEC', fill_method='ffill', convention='end')
Out[28]:
2000Q4 0
2001Q1 1
2001Q2 1
2001Q3 1
2001Q4 1
Freq: Q-DEC, dtype: int32
According to Wes' book, the default was 'end'. However, the current behaviour is as follows (this is real output):
In [27]: s.resample('Q-DEC', fill_method='ffill')
Out[27]:
2000Q1 0
2000Q2 0
2000Q3 0
2000Q4 0
2001Q1 1
2001Q2 1
2001Q3 1
2001Q4 1
Freq: Q-DEC, dtype: int32
So in fact this is a third option, 'span' (start -> end). This option is mentioned in #1635, but from that issue it seems it was never implemented: the commit was never merged, and the test added at that time is still commented out (https://github.com/pydata/pandas/blob/master/pandas/tseries/tests/test_resample.py#L1134).
In practice, however, the default behaviour is exactly this 'span' behaviour. The 'start' option has also changed:
In [28]: s.resample('Q-DEC', fill_method='ffill', convention='start')
Out[28]:
2000Q1 0
2000Q2 0
2000Q3 0
2000Q4 0
2001Q1 1
2001Q2 1
2001Q3 1
2001Q4 1
Freq: Q-DEC, dtype: int32
This gives the same result as the default; only 'end' still behaves as before.
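To make the three index ranges discussed above concrete, here is a minimal plain-Python sketch (it does not use pandas and is not how pandas implements resampling). The function name `quarterly_labels` is hypothetical, and 'span' is only the name used in this issue for the start -> end behaviour, not a documented value of `convention`. It only computes which quarterly labels each convention would cover when upsampling annual periods, and says nothing about how values are filled.

```python
def quarterly_labels(first_year, last_year, convention):
    """Return the quarter labels covered when annual periods
    first_year..last_year are upsampled to quarters.

    Hypothetical helper for illustration only; not a pandas API.
    """
    # Each convention picks the quarter that bounds the result
    # in the first year and in the last year.
    bounds = {
        "start": (1, 1),  # start of first year -> start of last year
        "end": (4, 4),    # end of first year -> end of last year
        "span": (1, 4),   # start of first year -> end of last year
    }
    if convention not in bounds:
        raise ValueError(f"unknown convention: {convention!r}")
    first_q, last_q = bounds[convention]

    # Count quarters on a single axis so the range can cross year boundaries.
    lo = first_year * 4 + (first_q - 1)
    hi = last_year * 4 + (last_q - 1)
    return [f"{n // 4}Q{n % 4 + 1}" for n in range(lo, hi + 1)]


for name in ("start", "end", "span"):
    print(name, quarterly_labels(2000, 2001, name))
```

For the two-year series used here, 'start' and 'end' each cover five quarters, while 'span' covers all eight, which matches the index of the current default output.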
Some issues/questions:
- What is the default value for convention? It is nowhere in the docs, and not in the docstring either (apart from the signature, which says 'start').
- I can't find the issue/PR/release note saying that the default for period resampling (upsampling) has changed.
- The default is now a 'spanning' behaviour, but this is the same as 'start'. Shouldn't these differ, so that the 'start' option has its own behaviour (start -> start), distinct from the default spanning behaviour (start -> end)?