Closed
Description
Related to #5172.
Given this DataFrame, with an index containing the moment when DST changes (October 27th in the case of the "Europe/Paris" timezone):
index = pandas.date_range('2013-09-30', '2013-11-02', freq = '30Min', tz = 'UTC').tz_convert('Europe/Paris')
column_a = pandas.np.random.random(index.size)
column_b = pandas.np.random.random(index.size)
df = pandas.DataFrame({ "a": column_a, "b": column_b }, index = index)
Let's say I want the minimum of "a" and the maximum of "b" for each month:
df.resample("MS", how = { "a": "min", "b": "max" })
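Here's the incorrect result (the last row is labelled 2013-10-31 23:00:00+01:00 and contains NaN, even though the index has data for November):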
                                  a         b
2013-09-01 00:00:00+02:00  0.015856  0.979541
2013-10-01 00:00:00+02:00  0.002039  0.999960
2013-10-31 23:00:00+01:00       NaN       NaN
Same problem with a "W-MON" frequency:
                                  a         b
2013-09-30 00:00:00+02:00  0.015856  0.979541
2013-10-07 00:00:00+02:00  0.007961  0.999734
2013-10-14 00:00:00+02:00  0.002614  0.993354
2013-10-21 00:00:00+02:00  0.005655  0.999960
2013-10-27 23:00:00+01:00       NaN       NaN
2013-11-03 23:00:00+01:00       NaN       NaN
However, it works fine with a "D" frequency:
                                  a         b
...
2013-10-26 00:00:00+02:00  0.004645  0.983281
2013-10-27 00:00:00+02:00  0.030151  0.986827
2013-10-28 00:00:00+01:00  0.015891  0.981455
2013-10-29 00:00:00+01:00  0.024176  0.999306
...
If I resample only the "a" column, it also works fine:
df["a"].resample("MS", how = "min")
2013-09-01 00:00:00+02:00 0.015856
2013-10-01 00:00:00+02:00 0.002039
2013-11-01 00:00:00+01:00 0.000747
Freq: MS, dtype: float64
Tested with the latest pandas from Git master.
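For anyone trying this on a newer pandas release, where the `how=` argument of `resample` and the `pandas.np` alias are no longer available, here is a minimal sketch of the same reproduction. The fixed random seed and the names `rng`, `monthly` and `monthly_a` are my own assumptions, not part of the original report, and whether the NaN row still appears will depend on the pandas version in use.

```python
import numpy as np
import pandas as pd

# Same half-hourly index as in the report: built in UTC, then converted to
# Europe/Paris so that it spans the DST change on 2013-10-27.
index = pd.date_range(
    "2013-09-30", "2013-11-02", freq="30min", tz="UTC"
).tz_convert("Europe/Paris")

# Fixed seed (an assumption, for repeatable runs only).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"a": rng.random(index.size), "b": rng.random(index.size)},
    index=index,
)

# Per-column aggregation; stands in for resample("MS", how={...}).
monthly = df.resample("MS").agg({"a": "min", "b": "max"})

# Single-column resample, which the report says works.
monthly_a = df["a"].resample("MS").min()

print(monthly)
print(monthly_a)

# If both code paths bin the data the same way, the labels should match.
print("same bin labels:", monthly.index.equals(monthly_a.index))
```

Comparing the two indexes directly makes it easy to see whether the dict-based aggregation and the single-column path disagree on the bin labels around the DST change.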