ENH/API: clarify groupby by to handle columns/index names #5677

Closed
@TomAugspurger

Referenced briefly in the OP at #3275

In [11]: idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)])

In [12]: idx.names = ['outer', 'inner']

In [13]: df = pd.DataFrame({"A": np.arange(6), 'B': ['one', 'one', 'two', 'two', 'one', 'one']}, index=idx)

So the idea is to be able to call

df.groupby('B', level='inner')

instead of

In [15]: df.reset_index().groupby(['B', 'inner']).mean()
Out[15]: 
             A
B   inner     
one 1      0.0
    2      2.5
    3      5.0
two 1      3.0
    3      2.0

[5 rows x 1 columns]

Currently the proposed call raises TypeError: 'numpy.ndarray' object is not callable. This is mostly syntactic sugar, but I've been having to do a lot of this lately and all the reset_index calls are getting annoying. Thoughts?
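For comparison, here is a rough sketch of two ways to get the same grouping without reset_index. Neither is the `by`/`level` combination proposed above. The first passes explicit key arrays (the column plus `index.get_level_values`). The second mixes a column name with `pd.Grouper(level=...)` and assumes a pandas release newer than the one this issue was filed against. The names `baseline`, `by_arrays` and `by_grouper` are only illustrative.

```python
import numpy as np
import pandas as pd

# Same frame as in the issue: a two-level index plus a regular column "B".
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("b", 3)],
    names=["outer", "inner"],
)
df = pd.DataFrame(
    {"A": np.arange(6), "B": ["one", "one", "two", "two", "one", "one"]},
    index=idx,
)

# Baseline from the issue: flatten the index, then group by two columns.
# Selecting [["A"]] keeps the non-numeric "outer" column out of the mean.
baseline = df.reset_index().groupby(["B", "inner"])[["A"]].mean()

# Option 1: pass the keys as arrays, one taken from the column and one
# taken from the index level, so no reset_index is needed.
by_arrays = df.groupby(
    [df["B"], df.index.get_level_values("inner")]
)[["A"]].mean()

# Option 2 (assumes a newer pandas): mix a column name with a Grouper
# that points at an index level.
by_grouper = df.groupby(["B", pd.Grouper(level="inner")])[["A"]].mean()

print(by_arrays)
```

Both variants should produce the same (B, inner) groups as the reset_index version, so which one to use is mainly a matter of readability and of the pandas version available.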
