ValueError: Must produce aggregated value, if DataFrame contains a DataFrame as an element #39436

Open
@Jogala

This minimal example aggregates column B to the first value of each group of column A:

import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.asarray([[1, 1, 1, 2], [4, 4, 4, 10]]).transpose(), columns=['A', 'B'])
print(df)
# Keep the first value of B within each group of A.
dfagg = df.groupby('A').agg({'B': lambda x: x.iloc[0]})
print(dfagg)

Hence the input

   A   B
0  1   4
1  1   4
2  1   4
3  2  10

becomes:

    B
A    
1   4
2  10
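
For reference, here is a small sketch of the same scalar aggregation written with the built-in 'first' aggregation instead of the lambda. The names via_lambda and via_first are only illustrative, and the frame is built from a dict rather than from the NumPy array used above.

```python
import pandas as pd

# Same data as the minimal example, built from a dict for brevity.
df = pd.DataFrame({'A': [1, 1, 1, 2], 'B': [4, 4, 4, 10]})

# The lambda from the report: take the first value of B in each group.
via_lambda = df.groupby('A').agg({'B': lambda x: x.iloc[0]})

# The built-in 'first' aggregation, which yields the same values here.
via_first = df.groupby('A').agg({'B': 'first'})
```

With plain integer values both forms give 4 for group 1 and 10 for group 2.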

If I now change the values of column B to DataFrame objects:

df1 = pd.DataFrame({'x' : [0,1], 'y' : [-1,-2]})
df2 = pd.DataFrame({'x' : [0,3], 'y' : [11,12]})

df = pd.DataFrame(data=np.asarray([[1,1,1,2],[df1,df1,df1,df2]],dtype=object).transpose(),columns = ['A','B'])
print(df)
dfagg = df.groupby('A').agg({'B' : lambda x : x.iloc[0]})
print(dfagg)

I get the error:

ValueError: Must produce aggregated value

This looks like a bug: the same aggregation works when the values of B are integers, so I see no reason for it to fail when they are DataFrames.
I also believe this did not raise an error about two years ago, when I last used this snippet.
I am currently using the newest version, 1.2.1.
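
A possible workaround, sketched under the assumption that only the first row of each group is needed: select that row with drop_duplicates on the key column, which never passes the DataFrame-valued cells through agg. This sidesteps the error rather than fixing it. The names cells and first_rows are illustrative, and the object array is filled element by element so that NumPy does not try to unpack the nested frames.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'x': [0, 1], 'y': [-1, -2]})
df2 = pd.DataFrame({'x': [0, 3], 'y': [11, 12]})

# Fill an object array cell by cell so each DataFrame is stored as one element.
frames = [df1, df1, df1, df2]
cells = np.empty(len(frames), dtype=object)
for i, frame in enumerate(frames):
    cells[i] = frame

df = pd.DataFrame({'A': [1, 1, 1, 2], 'B': cells})

# Keep the first row for each value of A; only column A is inspected here,
# so the nested DataFrames in B are carried along untouched.
first_rows = df.drop_duplicates(subset='A').set_index('A')
```

Here first_rows.loc[1, 'B'] holds df1 and first_rows.loc[2, 'B'] holds df2.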

Labels: Apply, Bug, Groupby, Nested Data
