Extending the grouper base argument #25226

Closed
@LucaAmerio

Problem description

pd.Grouper's base argument is almost undocumented (link). Even so, it is extremely useful when grouping a dataframe that has an uneven sampling rate and an arbitrary start time.

Take, for example, a dataframe representing a 60-minute experiment with an uneven sampling rate, starting at '12-01-2019 11:55:23.01938', that we want to split into six 10-minute groups. base is our best ally here.

However, in the current implementation base is a floating-point number of minutes. It is hard to understand what this number refers to, it is cumbersome to convert the above '11:55:23.01938' into a number of minutes, and it gives little flexibility over when the grouping starts.
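
To illustrate the conversion, here is a minimal sketch using only the standard library. It assumes that, for a 10-minute frequency, base is the offset in minutes past the nearest earlier 10-minute boundary counted from midnight, and that the date reads as 1 December 2019. The names start, freq and base_minutes are purely illustrative.

```python
from datetime import datetime, timedelta

# Start of the experiment: 11:55:23.01938 (0.01938 s = 19380 microseconds).
# Assumption: the date '12-01-2019' is read month-first.
start = datetime(2019, 12, 1, 11, 55, 23, 19380)
freq = timedelta(minutes=10)

# Offset of the start time past the previous 10-minute boundary,
# counting boundaries from midnight of the same day.
midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
offset = (start - midnight) % freq

# Assumption: this is the floating-point number of minutes that would
# have to be passed as base for a '10min' frequency.
base_minutes = offset / timedelta(minutes=1)
print(base_minutes)
```

The result is a fractional number of minutes (5 minutes plus 23.01938 seconds), which is why the rounding concerns arise.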

This can also cause rounding error issues as described here.

I suggest replacing (or extending) this by allowing it to be a datetime object, so that the edge of each group is simply
base + N * freq
where N is an integer. This is much cleaner and easier to understand.
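
A rough sketch of the proposed semantics, computed by hand with plain timestamp arithmetic rather than through pd.Grouper. The sample offsets, the column name value, and the variables n and left_edges are made up for illustration, and the date is assumed to be 1 December 2019.

```python
import pandas as pd

# Proposed: base is a datetime, and every group edge is base + N * freq.
base = pd.Timestamp("2019-12-01 11:55:23.01938")
freq = pd.Timedelta(minutes=10)

# Hypothetical unevenly sampled data covering the 60-minute experiment.
offsets_s = [0.0, 12.5, 431.0, 600.0, 1234.5, 1900.0, 2400.0, 3000.1, 3599.0]
index = base + pd.to_timedelta(offsets_s, unit="s")
df = pd.DataFrame({"value": range(len(index))}, index=index)

# N for each row: number of whole freq intervals elapsed since base.
n = (df.index - base) // freq

# Left edge of the group each row falls into: base + N * freq.
left_edges = base + n * freq

# Group on the left edges; each label is an exact multiple of freq from base.
counts = df.groupby(left_edges)["value"].count()
print(counts)
```

With these sample points the rows fall into six groups whose first label is exactly the start time, with no conversion to fractional minutes involved.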
