Closed
Description
It looks like we cast the entire frame to object dtype and then apply the cumulative function, so the result loses the original dtypes:
In [18]: x = pd.DataFrame({
    ...:     "A": [1, 2, 3],
    ...:     "B": [1, 2, 3.],
    ...:     "C": [True, False, False],
    ...: })
In [19]: x.cumsum()
Out[19]:
   A  B     C
0  1  1  True
1  3  3     1
2  6  6     1
In [20]: x.cumsum().dtypes
Out[20]:
A    object
B    object
C    object
dtype: object
In [21]: x.cummin().dtypes
Out[21]:
A    object
B    object
C    object
dtype: object
I think it'd be better to do these block-wise, i.e. apply the cumulative function to each dtype block separately. The possible downside I see is that
In [24]: x[['A', 'B']].cumsum()
Out[24]:
     A    B
0  1.0  1.0
1  3.0  3.0
2  6.0  6.0
will now be slower (presumably), since we'd have two cumsums to apply instead of one (after upcasting). Still, I think that'd be worth it to preserve the correct dtypes.
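
To make the block-wise idea concrete, here is a rough user-level sketch, not the pandas internals. `blockwise_cumsum` is a hypothetical helper, not a pandas API. It assumes unique column labels and no missing values (NumPy's cumsum propagates NaN rather than skipping it), and it groups columns by dtype so that each group is summed as a homogeneous array with no upcast to object.

```python
import numpy as np
import pandas as pd


def blockwise_cumsum(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: cumulative sum applied one dtype group at a time."""
    # Collect column labels per dtype, preserving their original order.
    groups = {}
    for col, dtype in df.dtypes.items():
        groups.setdefault(dtype, []).append(col)

    pieces = []
    for cols in groups.values():
        # All columns in a group share a dtype, so this array is homogeneous.
        values = df[cols].to_numpy()
        summed = np.cumsum(values, axis=0)
        pieces.append(pd.DataFrame(summed, index=df.index, columns=cols))

    # Reassemble the pieces and restore the original column order.
    return pd.concat(pieces, axis=1)[df.columns]


x = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 2, 3.],
    "C": [True, False, False],
})
out = blockwise_cumsum(x)
print(out.dtypes)
```

With this approach the integer and float columns keep their dtypes, and the boolean column becomes an integer count, because that is what NumPy's cumsum returns for a boolean array. The cost is one cumsum call per dtype group instead of a single call on the upcast frame.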