
API: inconsistent return format of groupby apply #13056

@edfall

Description


related issues:

Thank you very much for providing pandas; it is one of the best packages I have used.

Several times when using pandas I have gotten an inconsistent return format, which eventually broke my code.

  • For example:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10,1), columns = ['x'])
df['y'] = 1
df.groupby('y').apply(lambda x: x.x)

#x        0         1         2         3         4         5         6  \
#y                                                                        
#1 -1.12114  0.679616 -1.392863  0.032637 -0.051134  0.594201 -0.238833   

#x        7        8         9  
#y                              
#1  0.95173  1.07469 -0.062198  

df2 = df.copy()
df2['y'] = np.random.randint(1,3,10)
# the same operation as above, but with y drawn at random from {1, 2}
df2.groupby('y').apply(lambda x: x.x)
#y   
#1  3    0.032637
#   8    1.074690
#2  0   -1.121140
#   1    0.679616
#   2   -1.392863
#   4   -0.051134
#   5    0.594201
#   6   -0.238833
#   7    0.951730
#   9   -0.062198
#Name: x, dtype: float64

Even though I know how to avoid this issue, I still believe it would be better to return the same format. An inconsistent return format is really annoying, because it forces me to go back and use awkward workarounds to avoid it.

So, if possible, I suggest that the call above return only one kind of format.
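For reference, one possible way to sidestep the shape change is to skip `apply` and build the (y, row label) index directly. This is only a minimal sketch of a workaround, not a proposed fix for `groupby().apply`; `x_by_group` is a hypothetical helper name and is not part of pandas.

```python
import numpy as np
import pandas as pd


def x_by_group(frame: pd.DataFrame) -> pd.Series:
    """Return column 'x' as a Series indexed by (y, original row label).

    Hypothetical helper: it does not call groupby().apply at all, so the
    result has the same shape whether 'y' holds one distinct value or many.
    """
    return (
        frame.set_index("y", append=True)["x"]  # index levels: (row label, y)
        .swaplevel()                            # index levels: (y, row label)
        .sort_index()                           # order by y, then row label
    )


# Same setup as in the example above: one frame with a single group,
# one frame with y drawn at random from {1, 2}.
df = pd.DataFrame(np.random.randn(10, 1), columns=["x"])
df["y"] = 1
df2 = df.copy()
df2["y"] = np.random.randint(1, 3, 10)

single = x_by_group(df)   # Series with a two-level index
multi = x_by_group(df2)   # Series with a two-level index as well
```

Both calls return a Series named `x` with a two-level index, so code that consumes the result does not need to branch on the number of groups.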

Metadata

Labels: Apply (Apply, Aggregate, Transform, Map), Bug, Groupby, Master Tracker (High level tracker for similar issues)
