Description
Problem description
`pd.Grouper`'s `base` argument is almost undocumented (link). Despite this, the argument is extremely useful when grouping a dataframe with an uneven sampling rate and an arbitrary start time.
Take, for example, a dataframe representing a 60-minute experiment with an uneven sampling rate, starting at '12-01-2019 11:55:23.01938', that we want to split into six 10-minute groups. `base` is our best ally here.
However, in the current implementation `base` represents a floating-point number of minutes. It is hard to understand what this number refers to, it is cumbersome to convert the above '11:55:23.01938' into a number of minutes, and it gives you little flexibility regarding when the grouping will start.
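To illustrate the conversion, here is a minimal sketch using only the standard library. It assumes that `base` is read as the offset in minutes from midnight, modulo the bin width, and that the date string is in month-day-year order; both are my assumptions and are not confirmed by the documentation. The variable names are made up for the example.

```python
from datetime import datetime

# Start of the experiment (assumed month-day-year; only the time of day matters here).
start = datetime.strptime("12-01-2019 11:55:23.01938", "%m-%d-%Y %H:%M:%S.%f")

freq_minutes = 10  # width of each group, in minutes

# Minutes elapsed since midnight, as a float.
midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
minutes_since_midnight = (start - midnight).total_seconds() / 60

# Assumed meaning of `base`: the offset of the bin edges within one bin width.
base_minutes = minutes_since_midnight % freq_minutes

print(base_minutes)  # a non-round float, roughly 5.38 minutes
```

The result is a float with no exact binary representation, which is where rounding problems can come from.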
This can also cause rounding errors, as described here.
I suggest replacing (or extending) this argument by allowing it to be a datetime object, so that the edge of each group is simply
`base + N * freq`
where N is an integer. This is much cleaner and easier to understand.
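A rough sketch of the semantics I have in mind, written with the standard library only. `bin_edge` is a hypothetical helper for illustration, not an existing pandas function, and the sample timestamps are invented.

```python
from datetime import datetime, timedelta


def bin_edge(timestamp, base, freq):
    """Return the left edge base + N * freq of the group containing timestamp."""
    # timedelta // timedelta floors to an integer, so N may also be negative.
    n = (timestamp - base) // freq
    return base + n * freq


base = datetime(2019, 12, 1, 11, 55, 23, 19380)  # proposed datetime-valued base
freq = timedelta(minutes=10)

# Unevenly spaced sample times within the 60-minute experiment (made up).
samples = [
    base + timedelta(seconds=3),
    base + timedelta(minutes=9, seconds=59),
    base + timedelta(minutes=10),
    base + timedelta(minutes=59, seconds=30),
]

for t in samples:
    print(t, "->", bin_edge(t, base, freq))
```

Because the edges are computed with exact datetime arithmetic, no float conversion of the start time is needed, and the first group starts exactly at `base`.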