Follow-up on #39260 (comment)
Currently, an "accumulate" ufunc is applied to the full DataFrame at once, so dtypes are not preserved when the columns have mixed numeric dtypes (below, the integer column a comes back as float), e.g.:
In [4]: df = pd.DataFrame({"a": [1, 3, 2, 4], "b": [0.1, 4.0, 3.0, 2.0]})

In [5]: df
Out[5]:
   a    b
0  1  0.1
1  3  4.0
2  2  3.0
3  4  2.0

In [6]: np.maximum.accumulate(df)
Out[6]:
     a    b
0  1.0  0.1
1  3.0  4.0
2  3.0  4.0
3  4.0  4.0
It is certainly possible for the default case (corresponding to .accumulate(axis=0)) to apply this ufunc on each column or block, to preserve the column dtypes. When axis=1 is passed to the ufunc this is not possible, since the accumulation then runs across columns and the values have to share a common dtype.
See the linked PR discussion above for more details on what is involved in implementing this.
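
To illustrate the column-wise idea for the default axis=0 case, here is a minimal user-level sketch. It is not the proposed pandas implementation: `accumulate_columnwise` is a hypothetical helper name, and it assumes unique column names and plain NumPy-backed numeric columns.

```python
import numpy as np
import pandas as pd


def accumulate_columnwise(ufunc, df):
    """Hypothetical helper: run ufunc.accumulate on each column separately.

    Assumes unique column names and NumPy-backed numeric columns.
    """
    # Each column is accumulated on its own 1D array, so no common
    # dtype has to be found across columns.
    result = {name: ufunc.accumulate(col.to_numpy()) for name, col in df.items()}
    return pd.DataFrame(result, index=df.index)


df = pd.DataFrame({"a": [1, 3, 2, 4], "b": [0.1, 4.0, 3.0, 2.0]})
out = accumulate_columnwise(np.maximum, df)
print(out)
print(out.dtypes)
```

With this approach column a keeps its integer dtype and column b stays float. A block-wise version inside pandas would presumably avoid the per-column loop, but that is beyond what this sketch shows.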