ENH: infer_dtype should infer integer-na #27283

Closed
@jreback

xref #26272
xref #27267

As a first step towards making the integer-na dtypes the primary integer type, we need to teach infer_dtype that integer-na is a valid inferred type. Right now:

In [1]: from pandas.api.types import infer_dtype

In [2]: import numpy as np

In [3]: infer_dtype([2, 3, 4], skipna=False)
Out[3]: 'integer'

In [4]: infer_dtype([2, 3, 4, np.nan], skipna=False)
Out[4]: 'mixed-integer-float'

In [5]: infer_dtype([2, 3, 4.2, np.nan], skipna=False)
Out[5]: 'mixed-integer-float'

[4] could instead return 'integer-na', indicating that we might want to infer the Int64 dtype. That is distinct from the inferred type of [5], which must become float64.
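To make the proposed distinction concrete, here is a rough pure-Python sketch of the classification rule. It is not the pandas implementation: `sketch_infer_dtype` and `is_missing` are hypothetical names, only None and float NaN count as missing, and anything outside ints, floats and missing values is lumped into 'mixed'.

```python
import math


def is_missing(value):
    # Assumption of this sketch: only None and float NaN count as missing.
    return value is None or (isinstance(value, float) and math.isnan(value))


def sketch_infer_dtype(values):
    """Hypothetical, simplified stand-in for infer_dtype(values, skipna=False)."""
    saw_int = saw_float = saw_missing = False
    for value in values:
        if is_missing(value):
            saw_missing = True
        elif isinstance(value, bool):
            # bool is a subclass of int; keep it out of the integer bucket.
            return "mixed"
        elif isinstance(value, int):
            saw_int = True
        elif isinstance(value, float):
            saw_float = True
        else:
            return "mixed"
    if saw_int and saw_float:
        # A genuine float is present, so the data must become float64.
        return "mixed-integer-float"
    if saw_int:
        # Integers plus missing values only: a candidate for Int64.
        return "integer-na" if saw_missing else "integer"
    # Simplification: float-only and all-missing input both land here.
    return "floating"


print(sketch_infer_dtype([2, 3, 4]))
print(sketch_infer_dtype([2, 3, 4, float("nan")]))
print(sketch_infer_dtype([2, 3, 4.2, float("nan")]))
```

The point is only that [4] and [5] end up in different buckets once a missing value is no longer treated as evidence of a float.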

This would then let us convert integer columns to Int64 when nulls are added, rather than coercing them to float64; this is pretty common in indexing and setting operations.
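For illustration, a small sketch of the coercion in question. It uses reindex to introduce a missing label, as one convenient stand-in for the indexing and setting operations mentioned above, and assumes a pandas version that ships the Int64 extension dtype.

```python
import pandas as pd

ints = pd.Series([2, 3, 4])  # plain NumPy-backed integer Series

# Label 3 does not exist, so reindex has to insert a missing value.
# With a NumPy integer dtype the result is coerced to float64.
widened = ints.reindex([0, 1, 2, 3])
print(widened.dtype)

# Starting from the nullable Int64 dtype, the same operation keeps
# integer values and marks the new slot as missing instead.
nullable = ints.astype("Int64").reindex([0, 1, 2, 3])
print(nullable.dtype)
```

The second result is what the issue would like integer columns to become by default once nulls appear.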

Secondly, we can then enable pd.to_numeric to infer integer-na (or unsigned-na) and the corresponding dtypes (#26272).

Finally, we could support coercing object-dtype data that holds only integers and nulls to Int64 (#27267, for .explode() and .infer_objects()).
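As a point of comparison, the kind of object-to-Int64 coercion described here can be sketched with Series.convert_dtypes, which was added in later pandas releases (1.0 and up). That is an assumption about the reader's pandas version, and it is not the .explode() / .infer_objects() change this issue discusses.

```python
import pandas as pd

# Object-dtype data holding only integers and a null.
raw = pd.Series([2, 3, 4, None], dtype=object)
print(raw.dtype)

# convert_dtypes picks the best nullable extension dtype it can;
# for integers plus nulls that is Int64 rather than float64.
converted = raw.convert_dtypes()
print(converted.dtype)
```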

This issue by itself is only a very minor user-facing change (to infer_dtype itself).

Labels: Dtype Conversions, ExtensionArray, Missing-data
