BUG: Applying function on column of Groupby object with as_index=False does not select column #5764

Closed
@jorisvandenbossche

Description

>>> df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])
>>> df
   A  B
0  1  2
1  1  4
2  5  6
[3 rows x 2 columns]

Selecting a column of the GroupBy object still returns all columns:

>>> g = df.groupby('A', as_index=False)['B']
>>> g.get_group(1)
   A  B
0  1  2
1  1  4
[2 rows x 2 columns]
>>> g = df.groupby('A', as_index=False)
>>> g.get_group(1)
   A  B
0  1  2
1  1  4
[2 rows x 2 columns]
>>> g.get_group(1)['B']
0    2
1    4
Name: B, dtype: int64

So a function passed to apply is applied to all columns instead of only the selected one:

>>> df.groupby('A', as_index=False)['B'].apply(lambda x: x.cumsum())
   A  B
0  1  2
1  2  6
2  5  6
[3 rows x 2 columns]

With as_index=True it works as expected:

>>> g = df.groupby('A')
>>> g.get_group(1)
   A  B
0  1  2
1  1  4
[2 rows x 2 columns]

>>> g = df.groupby('A')['B']
>>> g.get_group(1)
0    2
1    4
Name: B, dtype: int64

>>> df.groupby('A')['B'].apply(lambda x: x.cumsum())
0    2
1    6
2    6
dtype: int64

A more elaborate example where this turned up:

>>> s="""L1  L2  L3
... X   1   200
... X   2   100
... Z   1   15
... X   3   200
... Z   2   10
... Y   1   1
... Z   3   20
... Y   2   10
... Y   3   100"""
>>> 
>>> df = pd.read_csv(StringIO(s), sep="\s+")
>>> df.groupby("L1")["L3"].apply(lambda x: x.order().cumsum()/x.sum())
L1   
X   1    0.200000
    0    0.600000
    3    1.000000
Y   5    0.009009
    7    0.099099
    8    1.000000
Z   4    0.222222
    2    0.555556
    6    1.000000
dtype: float64

But if I don't want the X, Y, Z in the index:

>>> df.groupby("L1", as_index=False)["L3"].apply(lambda x: x.order().cumsum()/x.sum())

This raises an error because x is a DataFrame rather than a Series.
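Until the column selection is honored, one possible way to get the same per-group values without X, Y, Z in the index is sketched below. It is not part of the original report. It assumes a pandas version that has `Series.sort_values` (the report's `x.order()` is the older spelling) and relies on the `group_keys=False` option of `groupby` instead of `as_index=False`. The helper name `cumulative_share` is made up for illustration.

```python
from io import StringIO

import pandas as pd

s = """L1  L2  L3
X   1   200
X   2   100
Z   1   15
X   3   200
Z   2   10
Y   1   1
Z   3   20
Y   2   10
Y   3   100"""

df = pd.read_csv(StringIO(s), sep=r"\s+")


def cumulative_share(x):
    # Hypothetical helper: sort the group's values ascending (mergesort is
    # stable, so ties keep their original order), then divide the running
    # total by the group total.
    return x.sort_values(kind="mergesort").cumsum() / x.sum()


# group_keys=False asks apply not to prepend the group labels (X, Y, Z)
# to the index of the combined result.
result = df.groupby("L1", group_keys=False)["L3"].apply(cumulative_share)
print(result)
```

The result is a Series keyed by the original row labels only, so it can be assigned back to the frame with `df["share"] = result` if that is the goal.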
