Behavior of new df.agg, df.transform and df.apply is very inconsistent #18103

Open
@memeplex

This looks really messy: it's not difficult to find inconsistent behavior across all three methods. It seems to me that the DataFrame implementation blurs the distinction between agg, transform and apply too liberally.

In: df
Out: 
   x  y  z
0  1  1  4
1  1  2  5
2  2  3  6
3  2  1  4
4  3  2  5
5  3  3  6

In: gb = df.groupby('x')
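
For anyone who wants to reproduce this, here is a minimal sketch of the setup. The constructor call is my reconstruction from the printed frame above, not something taken from the original session.

```python
import pandas as pd

# Rebuild the frame exactly as printed above (values copied column by column).
df = pd.DataFrame({
    "x": [1, 1, 2, 2, 3, 3],
    "y": [1, 2, 3, 1, 2, 3],
    "z": [4, 5, 6, 4, 5, 6],
})

# Group on 'x'; this is the `gb` object used in the comparisons.
gb = df.groupby("x")

print(df)
```
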
  1. DataFrame.apply accepts lists and dicts while groupby.apply doesn't:
In: df.apply(['sum', 'mean'])
Out: 
         x     y     z
sum   12.0  12.0  30.0
mean   2.0   2.0   5.0

In: gb.apply(['sum', 'mean'])
TypeError    
  2. DataFrame.transform disallows aggregations while groupby.transform broadcasts them to the original shape:
In: df.transform('sum')
ValueError: transforms cannot produce aggregated results

In: gb.transform('sum')
Out: 
   y   z
0  3   9
1  3   9
2  4  10
3  4  10
4  5  11
5  5  11
  3. DataFrame.agg allows non-aggregations while groupby.agg doesn't:
In: df.agg(lambda x: x)
Out: 
   x  y  z
0  1  1  4
1  1  2  5
2  2  3  6
3  2  1  4
4  3  2  5
5  3  3  6

In: gb.agg(lambda x: x)
ValueError: cannot copy sequence with size 3 to array axis with dimension 2
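
The six calls above can be checked in one go with a small probe script. This is only a sketch: the `outcome` helper and the `checks` list are names made up here for illustration, and the script deliberately does not assume which calls succeed or fail, since the exact results and error messages may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1, 1, 2, 2, 3, 3],
    "y": [1, 2, 3, 1, 2, 3],
    "z": [4, 5, 6, 4, 5, 6],
})
gb = df.groupby("x")


def outcome(label, call):
    """Run call() and describe the result without assuming what it is."""
    try:
        result = call()
    except Exception as exc:  # report any failure by exception type only
        return f"{label}: raised {type(exc).__name__}"
    shape = getattr(result, "shape", None)
    return f"{label}: returned {type(result).__name__} with shape {shape}"


# Each pair puts the DataFrame call next to its groupby counterpart.
checks = [
    ("df.apply(['sum', 'mean'])", lambda: df.apply(["sum", "mean"])),
    ("gb.apply(['sum', 'mean'])", lambda: gb.apply(["sum", "mean"])),
    ("df.transform('sum')", lambda: df.transform("sum")),
    ("gb.transform('sum')", lambda: gb.transform("sum")),
    ("df.agg(lambda x: x)", lambda: df.agg(lambda x: x)),
    ("gb.agg(lambda x: x)", lambda: gb.agg(lambda x: x)),
]

report = [outcome(label, call) for label, call in checks]
for line in report:
    print(line)
```

Comparing the two lines of each pair on a given pandas version shows whether the DataFrame and groupby variants still disagree.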


Labels: Apply, Bug, Groupby
