BUG: fillna('') does not replace NaT #11953

Open
@jreback

pandas generally tries to coerce values to fit the column dtype, or upcasts the dtype to fit.

For a setting operation this is convenient and, I think, what a user expects:

In [35]: df = DataFrame({'A' : Series(dtype='M8[ns]'), 'B' : Series([np.nan],dtype='object'), 'C' : ['foo'], 'D' : [1]})

In [36]: df
Out[36]:
    A    B    C  D
0 NaT  NaN  foo  1

In [37]: df.dtypes
Out[37]:
A    datetime64[ns]
B            object
C            object
D             int64
dtype: object

In [38]: df.loc[0,'D'] = 1.0

In [39]: df.dtypes
Out[39]:
A    datetime64[ns]
B            object
C            object
D           float64
dtype: object

However, for a .fillna (or .replace) operation this might be a bit unexpected. In the example below, A is coerced to object dtype, even though it was datetime64[ns].

In [40]: df.fillna('')
Out[40]:
  A B    C  D
0      foo  1

In [41]: df.fillna('').dtypes
Out[41]:
A     object
B     object
C     object
D    float64
dtype: object
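Until something like this is decided, one way to avoid the upcast is to fill only the columns that can already hold the fill value. This is a minimal sketch, not a proposed fix; it relies on `DataFrame.fillna` accepting a dict of per-column values, and the frame is rebuilt here with explicit dtypes so the snippet stands on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": pd.Series([pd.NaT], dtype="datetime64[ns]"),
    "B": pd.Series([np.nan], dtype="object"),
    "C": pd.Series(["foo"], dtype="object"),
    "D": [1.0],
})

# Restrict the fill to object-dtype columns, which can hold '' without a cast.
object_cols = df.select_dtypes(include=["object"]).columns

# A dict passed to fillna only touches the listed columns,
# so A keeps datetime64[ns] and its NaT.
filled = df.fillna({col: "" for col in object_cols})

print(filled.dtypes)
```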

So a possibility is to add a keyword errors='raise'|'coerce'|'ignore'. The behavior shown above (upcasting to object) would be the equivalent of errors='ignore', while skipping this column would be done with errors='coerce' (and of course 'raise' would raise).

Ideally the default would be 'coerce', I think (i.e. skip columns that are not compatible with the fill value). Any thoughts on this?
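To make the three modes concrete, here is a rough sketch written as a standalone function. Everything in it is an assumption for illustration: `fillna_with_errors` and `_is_compatible` are hypothetical names, not pandas API, the compatibility rules are deliberately simplistic, and the mapping of modes follows one reading of the proposal ('coerce' skips incompatible columns, 'ignore' upcasts them to object, 'raise' raises).

```python
import numbers
from datetime import datetime

import numpy as np
import pandas as pd
from pandas.api.types import (
    is_datetime64_any_dtype,
    is_numeric_dtype,
    is_object_dtype,
    is_string_dtype,
)


def _is_compatible(col, value):
    """Hypothetical check: can `value` be stored in `col` without a dtype change?"""
    if is_object_dtype(col.dtype):
        return True
    if is_datetime64_any_dtype(col.dtype):
        return isinstance(value, (datetime, np.datetime64))
    if is_numeric_dtype(col.dtype):
        return isinstance(value, numbers.Number)
    if is_string_dtype(col.dtype):
        return isinstance(value, str)
    return False


def fillna_with_errors(df, value, errors="coerce"):
    """Hypothetical helper emulating the proposed `errors` keyword, column by column."""
    if errors not in ("raise", "coerce", "ignore"):
        raise ValueError("errors must be 'raise', 'coerce' or 'ignore'")
    result = df.copy()
    for name in df.columns:
        col = df[name]
        if not col.isna().any():
            continue  # nothing to fill in this column
        if _is_compatible(col, value):
            result[name] = col.fillna(value)
        elif errors == "raise":
            raise TypeError(f"cannot fill column {name!r} ({col.dtype}) with {value!r}")
        elif errors == "ignore":
            # Upcast to object first, then fill (the behavior shown in the issue).
            result[name] = col.astype(object).fillna(value)
        # errors == "coerce": leave the incompatible column untouched
    return result


df = pd.DataFrame({
    "A": pd.Series([pd.NaT], dtype="datetime64[ns]"),
    "B": pd.Series([np.nan], dtype="object"),
    "C": pd.Series(["foo"], dtype="object"),
    "D": [1.0],
})

print(fillna_with_errors(df, "", errors="coerce").dtypes)
print(fillna_with_errors(df, "", errors="ignore").dtypes)
```

With these assumptions, 'coerce' leaves A as datetime64[ns] with its NaT in place, while 'ignore' turns A into an object column holding ''. A real implementation would live inside `fillna` itself and reuse pandas' internal casting rules rather than a hand-written type check.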

Labels: Bug, Datetime (Datetime data dtype), Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)