Groupby aggregations could ignore non-numeric columns when axis=1 #3688

Closed
@hayd

Description

Perhaps the following groupby aggregation should operate only on the numeric columns, as the same reduction does when called on the DataFrame directly:

In [1]: df = pd.DataFrame({'bar': {0: 1, 1: 1, 2: 1}, 'foo': {0: 0, 1: 1, 2: 2}, 'foo1': {0: 1, 1: 2, 2: 3}, 'hello': {0: 'a', 1: 'a', 2: 'a'}}); df.columns = ['bar', 'foo', 'foo', 'hello']

In [2]: df
Out[2]:
   bar  foo  foo hello
0    1    0    1     a
1    1    1    2     a
2    1    2    3     a

In [3]: df.mean()  # hello is ignored
Out[3]:
bar    1
foo    1
foo    2
dtype: float64

In [4]: df.groupby(level=0, axis=1).mean()
---------------------------------------------------------------------------
DataError                                 Traceback (most recent call last)
<ipython-input-4-7c2612a8fbda> in <module>()
----> 1 df.groupby(level=0, axis=1).mean()

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/groupby.pyc in mean(self)
    351         """
    352         try:
--> 353             return self._cython_agg_general('mean')
    354         except GroupByError:
    355             raise

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/groupby.pyc in _cython_agg_general(self, how, numeric_only)
   1569
   1570     def _cython_agg_general(self, how, numeric_only=True):
-> 1571         new_blocks = self._cython_agg_blocks(how, numeric_only=numeric_only)
   1572         return self._wrap_agged_blocks(new_blocks)
   1573

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/groupby.pyc in _cython_agg_blocks(self, how, numeric_only)
   1616
   1617         if len(new_blocks) == 0:
-> 1618             raise DataError('No numeric types to aggregate')
   1619
   1620         return new_blocks

DataError: No numeric types to aggregate

This came from a Stack Overflow question, where I gave a very hacky workaround.
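For illustration, here is a minimal sketch of one way to get the intended result today. It is not necessarily the workaround from the SO answer: it assumes a reasonably recent pandas, drops the non-numeric columns explicitly with `select_dtypes`, and then groups the duplicate column labels on the transposed frame rather than using `axis=1`. The frame is built from rows here only so that the duplicate `foo` label is easy to reproduce.

```python
import pandas as pd

# Same data as the frame above, built row-wise so that the
# duplicate 'foo' column label can be passed directly.
df = pd.DataFrame(
    [[1, 0, 1, "a"], [1, 1, 2, "a"], [1, 2, 3, "a"]],
    columns=["bar", "foo", "foo", "hello"],
)

# Drop the non-numeric ('hello') column explicitly.
numeric = df.select_dtypes(include="number")

# Transpose so the column labels become the index, average the rows
# that share a label, then transpose back to the original orientation.
result = numeric.T.groupby(level=0).mean().T

print(result)
```

The two `foo` columns are averaged row by row, and `bar` passes through as floats.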

cc #3683 @jreback, was this the question you were talking about? This one is related, but only in the sense of running into non-unique column problems. Thought I should mention it here anyway.

Labels: Bug, Groupby, Nuisance Columns, Numeric Operations
