Description
Context
For methods that accept UDFs, I've been looking at a number of issues that would involve changes to long-standing behavior, some of which I feel are quite significant:
- .agg should always reduce (see "ENH: groupby(...).agg should only accept reducers", #35725)
- Series.transform and DataFrame.transform should broadcast reductions, like groupby does (see "Behavior of new df.agg, df.transform and df.apply is very inconsistent", #18103)
- Series.apply and DataFrame.apply use agg for list/dict arguments
- Series.agg and DataFrame.agg use apply if argument is a UDF
- UDF method arguments are inconsistent (see "API: Signature of UDF methods", #40112)
- Make groupby.apply dumb (see "API: Should apply be smart?", #39209)
- groupby.apply includes grouping column(s) in computation unlike groupby.agg/groupby.transform
For some of these, it is difficult to emit a FutureWarning. For example, I'm not sure how we would emit a warning when changing the behavior of agg for UDFs. I've been working toward consolidating the methods for apply/agg/transform et al. in pandas.core.apply, resolving inconsistencies as they are encountered. This is very much a work in progress, and I expect to encounter more.
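To make one of the inconsistencies above concrete, here is a small sketch contrasting groupby's agg and transform with Series.transform when given a reducer. The frame and column names are made up for illustration. The outcome of the Series.transform call depends on the pandas version, so it is only printed, not asserted.

```python
import pandas as pd

# Illustrative data; the column names are arbitrary.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 4]})
grouped = df.groupby("key")["val"]

# groupby.agg reduces: one value per group.
reduced = grouped.agg("sum")
print(reduced.tolist())  # [3, 4]

# groupby.transform broadcasts the reduction back to the input shape.
broadcast = grouped.transform("sum")
print(broadcast.tolist())  # [3, 3, 4]

# Series.transform with the same reducer does not broadcast. Depending on
# the pandas version it may raise instead, so both outcomes are handled.
try:
    result = df["val"].transform("sum")
    print("Series.transform returned:", result)
except ValueError as exc:
    print("Series.transform raised:", exc)
```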
Proposal
Add a new submodule pandas.core.homs, with HOMs standing for "Higher Order Methods". These are methods that take a callable as their primary argument. Also add a new, experimental implementation behind the option use_hom_api. Whenever possible, any changes will be contained within the pandas.core.homs module. Progress can then be made without worrying about deprecation warnings and changing behaviors. When it's ready, we can then proceed as follows:
- Add docs on the option classifying it as experimental.
- Classify the option as stable.
- Change the default of the option to no-default, emitting a FutureWarning when not set that the behavior of many methods will change and that users should use True.
- Change the default of the option to True, keeping the option to allow users to change back if necessary.
- Remove the option and old implementation.
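The "no-default" stage of the rollout above could look roughly like the following. This is a pure-Python sketch with a hypothetical in-memory option store, not pandas' actual option machinery. The helper names (`_NO_DEFAULT`, `_state`, `set_option`, `use_hom_api_enabled`) are assumptions made for illustration.

```python
import warnings

# Sentinel meaning "the user has not set the option" (hypothetical).
_NO_DEFAULT = object()

# Hypothetical stand-in for the option store; option name as in the proposal.
_state = {"use_hom_api": _NO_DEFAULT}


def set_option(name, value):
    """Record an explicit user choice for the option."""
    _state[name] = value


def use_hom_api_enabled():
    """Return whether the new implementation should be used.

    While the option is unset, warn that the default will change and
    fall back to the old behavior.
    """
    value = _state["use_hom_api"]
    if value is _NO_DEFAULT:
        warnings.warn(
            "The behavior of many methods will change in a future version; "
            "set the option to True to opt in now.",
            FutureWarning,
            stacklevel=2,
        )
        return False
    return bool(value)
```

Once a user sets the option explicitly, in either direction, the warning is no longer emitted. That keeps the warning confined to users who have not yet made a choice.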
Goals
- Consistent behaviors between different Higher Order Methods
- Consistent API between Higher Order Methods
- Excisability: Should the project die out, it needs to be easy to excise the experimental code. All changes need to be behind an if get_option('use_homs_api'): check.
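As a rough illustration of the excisability goal, the sketch below gates a single entry point on the option, so that removing the experimental path means deleting one branch and one function. Everything here is hypothetical: `get_option` and `set_option` are local stand-ins rather than the pandas functions, and `agg`, `_agg_legacy` and `_agg_hom` are toy functions operating on plain lists.

```python
# Hypothetical stand-in for the option registry (not the pandas API).
_OPTIONS = {"use_homs_api": False}


def get_option(name):
    return _OPTIONS[name]


def set_option(name, value):
    _OPTIONS[name] = value


def _agg_legacy(values, func):
    # Old behavior in this toy model: return whatever func returns.
    return func(values)


def _agg_hom(values, func):
    # Experimental behavior in this toy model: insist on a reduction.
    result = func(values)
    if isinstance(result, (list, tuple, set, dict)):
        raise ValueError("agg requires a function that reduces its input")
    return result


def agg(values, func):
    # The only place the option is consulted; excising the experiment
    # means removing this branch and _agg_hom.
    if get_option("use_homs_api"):
        return _agg_hom(values, func)
    return _agg_legacy(values, func)
```

With the option off, `agg([1, 2, 3], lambda v: [x * 2 for x in v])` returns a list. With it on, the same call raises a ValueError, while `agg([1, 2, 3], sum)` returns 6 in both modes.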