combine_first not retaining dtypes #7509

Closed
@altaurog

I found several related issues, all closed over a year ago, but `combine_first` still seems to handle dtypes inconsistently:

In [1]: from datetime import datetime
In [2]: import pandas as pd
In [3]: pd.__version__
Out[3]: '0.13.1'
In [4]: dfa = pd.DataFrame([[datetime.now(), 2]], columns=['a','b'])
In [5]: dfb = pd.DataFrame([[4],[5]], columns=['b'])
In [6]: dfa.dtypes
Out[6]: 
a    datetime64[ns]
b             int64
dtype: object
In [7]: dfb.dtypes
Out[7]: 
b    int64
dtype: object
In [8]: # int64 becomes float64 when combining the two frames
In [9]: dfa.combine_first(dfb).dtypes
Out[9]: 
a    datetime64[ns]
b           float64
dtype: object
In [10]: # datetime64[ns] becomes float64 if the first frame is empty
In [11]: dfa.iloc[:0].combine_first(dfb).dtypes
Out[11]: 
a    float64
b      int64
dtype: object
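
A possible workaround, sketched under assumptions rather than taken from pandas itself: call `combine_first`, then cast each column back to the dtype it had in the source frame whenever that cast is lossless. The helper name `combine_first_keep_dtypes` is hypothetical, and the exact dtypes `combine_first` returns may differ between pandas versions, so the helper only casts when the dtype actually changed.

```python
from datetime import datetime

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def combine_first_keep_dtypes(first, second):
    """Hypothetical helper: combine_first, then cast columns back where safe."""
    combined = first.combine_first(second)
    for col in combined.columns:
        # Prefer the dtype from the calling frame, else from the other frame.
        source = first if col in first.columns else second
        wanted = source[col].dtype
        if combined[col].dtype == wanted:
            continue
        if is_datetime64_any_dtype(wanted):
            # Missing values become NaT, so a datetime column can hold them.
            combined[col] = pd.to_datetime(combined[col])
        elif not combined[col].isna().any():
            # Cast back only when alignment introduced no missing values.
            combined[col] = combined[col].astype(wanted)
    return combined


# Same frames as in the report above.
dfa = pd.DataFrame([[datetime.now(), 2]], columns=["a", "b"])
dfb = pd.DataFrame([[4], [5]], columns=["b"])

fixed = combine_first_keep_dtypes(dfa, dfb)
fixed_empty = combine_first_keep_dtypes(dfa.iloc[:0], dfb)
print(fixed.dtypes)
print(fixed_empty.dtypes)
```

Columns that do end up with missing values and have no NaN-capable equivalent (for example `int64`) are left in whatever dtype `combine_first` produced, since casting them back would fail or lose information.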

Labels: Bug, Dtype Conversions, Missing-data
