Description
This minimal example groups by column A and aggregates column B to the first value of each group:
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.asarray([[1,1,1,2],[4,4,4,10]]).transpose(),columns = ['A','B'])
print(df)
dfagg = df.groupby('A').agg({'B' : lambda x : x.iloc[0]})
print(dfagg)
Hence the input
   A   B
0  1   4
1  1   4
2  1   4
3  2  10
becomes:
    B
A
1   4
2  10
If I now change the values in column B so that each cell holds a DataFrame:
df1 = pd.DataFrame({'x' : [0,1], 'y' : [-1,-2]})
df2 = pd.DataFrame({'x' : [0,3], 'y' : [11,12]})
df = pd.DataFrame(data=np.asarray([[1,1,1,2],[df1,df1,df1,df2]],dtype=object).transpose(),columns = ['A','B'])
print(df)
dfagg = df.groupby('A').agg({'B' : lambda x : x.iloc[0]})
print(dfagg)
I get the error:
ValueError: Must produce aggregated value
This looks like a bug. The same aggregation works when column B holds scalars, so I see no reason why it should fail when the cells hold objects.
I also believe this did not raise an error about two years ago, when I last used this snippet.
I am currently using the newest pandas version, 1.2.1.
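In the meantime, a possible workaround is to skip `agg` and iterate over the groups directly. This is only a sketch under a few assumptions: the names `cells` and `first_per_group` are illustrative, the object column is built element by element so that NumPy does not try to unpack the DataFrames, and the result is a plain dict rather than an aggregated DataFrame.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'x': [0, 1], 'y': [-1, -2]})
df2 = pd.DataFrame({'x': [0, 3], 'y': [11, 12]})

# Fill an object array one cell at a time so each DataFrame is stored as-is.
cells = np.empty(4, dtype=object)
for i, frame in enumerate([df1, df1, df1, df2]):
    cells[i] = frame

df = pd.DataFrame({'A': [1, 1, 1, 2], 'B': cells})

# Iterating over the groupby yields (key, sub-frame) pairs, so the first
# cell of each group can be picked without going through agg().
first_per_group = {key: group['B'].iloc[0] for key, group in df.groupby('A')}
```

`first_per_group[1]` and `first_per_group[2]` then hold the first DataFrame of each group.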