Cumulative Methods with mixed frames casts to object #19296

Closed
@TomAugspurger

Looks like we cast the entire frame to object dtype, and then perform the cumulative functions:

In [18]: x = pd.DataFrame({
    ...:     "A": [1, 2, 3],
    ...:     "B": [1, 2, 3.],
    ...:     "C": [True, False, False],
    ...: })

In [19]: x.cumsum()
Out[19]:
   A  B     C
0  1  1  True
1  3  3     1
2  6  6     1

In [20]: x.cumsum().dtypes
Out[20]:
A    object
B    object
C    object
dtype: object

In [21]: x.cummin().dtypes
Out[21]:
A    object
B    object
C    object
dtype: object

I think it'd be better to do these block-wise? The possible downside I see is that

In [24]: x[['A', 'B']].cumsum()
Out[24]:
     A    B
0  1.0  1.0
1  3.0  3.0
2  6.0  6.0

will now be slower (presumably), since we'd have two cumsums to apply (one per dtype block) instead of one after upcasting. I think that'd be worth it for preserving the correct dtypes.
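To make the proposal concrete, here is a rough sketch of the idea using only public API. It goes column by column rather than through the internal blocks, so it only approximates what a block-wise implementation would do. The helper name `cumsum_per_column` is made up for illustration and is not part of pandas, and it assumes the frame has at least one column.

```python
import pandas as pd


def cumsum_per_column(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: cumulative sum applied to each column separately.

    Each column is accumulated on its own, so the frame is never upcast
    to a single common dtype first. Assumes at least one column.
    """
    # DataFrame.items() yields (label, Series) pairs, one per column.
    pieces = [col.cumsum() for _, col in frame.items()]
    # Put the per-column results back together side by side.
    return pd.concat(pieces, axis=1)


x = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 2, 3.],
    "C": [True, False, False],
})

result = cumsum_per_column(x)
print(result)
print(result.dtypes)
```

With this approach the integer column stays integer and the float column stays float, at the cost of one cumsum call per column instead of a single call on the upcast values.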

Labels: Algos, Numeric Operations