BUG: regression on master in groupby agg with ExtensionArray #29141

Closed
@jorisvandenbossche

Example that I could make with DecimalArray:

In [1]: import pandas as pd

In [2]: from pandas.tests.extension.decimal import DecimalArray, make_data

In [3]: df = pd.DataFrame({'id': [0,0,0,1,1], 'decimals': DecimalArray(make_data()[:5])})

In [4]: df.groupby('id')['decimals'].agg(lambda x: x.iloc[0])
Out[4]: 
id
0      0.831922765262135044395108707249164581298828125
1    0.40839445887803604851029604105860926210880279...
dtype: object

On master of a few days ago, the above returned 'decimal' dtype instead of object dtype.

I found this through the geopandas test suite, where the object-dtype result is invalid output that then causes an error in a follow-up operation (https://travis-ci.org/geopandas/geopandas/jobs/600859374).

This seems to be caused by #29088, and specifically the change in agg_series: https://github.com/pandas-dev/pandas/pull/29088/files#diff-8c0985a9fca770c2028bed688dfc043fR653-R666
self._aggregate_series_fast raises "AttributeError: 'DecimalArray' object has no attribute 'flags'" if the series is backed by an EA, and that AttributeError is no longer caught.
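
To illustrate the pattern described above, here is a minimal sketch in plain Python. It is not the pandas implementation: FakeExtensionArray, fast_path, slow_path and both agg_* functions are hypothetical names made up for this example, and the only detail taken from the report is that the fast path touches a `flags` attribute that an extension array does not have. The sketch does not model how pandas ends up with object dtype afterwards.

```python
class FakeExtensionArray:
    """Hypothetical stand-in for an extension array: list-backed, no `flags`."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, i):
        return self._values[i]


def fast_path(values, func):
    # Hypothetical fast path that assumes an ndarray-like input.
    values.flags  # raises AttributeError for objects without `flags`
    return func(values)


def slow_path(values, func):
    # Hypothetical generic fallback that works on any indexable container.
    return func(values)


def agg_with_fallback(values, func):
    # Assumed shape of the earlier behaviour: catch the failure, then fall back.
    try:
        return fast_path(values, func)
    except AttributeError:
        return slow_path(values, func)


def agg_without_fallback(values, func):
    # Assumed shape after the change: the AttributeError escapes this function.
    return fast_path(values, func)
```

With the fallback in place, `agg_with_fallback(FakeExtensionArray([3, 1, 2]), lambda x: x[0])` takes the slow path and returns the first element, while `agg_without_fallback` lets the AttributeError propagate to whatever code called it.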

cc @jbrockmendel

    Labels

    Apply: Apply, Aggregate, Transform, Map
    ExtensionArray: Extending pandas with custom dtypes or arrays.
    Groupby
    Regression: Functionality that used to work in a prior pandas version