Description
As the first step of moving towards integer-na dtypes as the primary integer type, we need to teach infer_dtype that integer-na is a valid inferred type. Right now:
In [1]: from pandas.api.types import infer_dtype
In [2]: import numpy as np
In [3]: infer_dtype([2, 3, 4], skipna=False)
Out[3]: 'integer'
In [4]: infer_dtype([2, 3, 4, np.nan], skipna=False)
Out[4]: 'mixed-integer-float'
In [5]: infer_dtype([2, 3, 4.2, np.nan], skipna=False)
Out[5]: 'mixed-integer-float'
[4] could return 'integer-na' to indicate that we might want to infer Int64 dtype, and is distinct from the inferred type of [5], which must become float64.
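To make the distinction between [4] and [5] concrete, here is a minimal pure-Python sketch of the proposed classification. The function name sketch_infer_dtype and the fallback label "other" are hypothetical and only for illustration; this is not how infer_dtype is implemented, just the rule it would need to follow for these inputs.

```python
import math


def sketch_infer_dtype(values):
    """Hypothetical stand-in for the proposed rule; NOT pandas' implementation."""
    seen_int = seen_float = seen_null = False
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            seen_null = True          # None / NaN count as nulls, not as floats
        elif isinstance(v, bool):
            return "other"            # bools are out of scope for this sketch
        elif isinstance(v, int):
            seen_int = True
        elif isinstance(v, float):
            seen_float = True
        else:
            return "other"            # anything non-numeric is out of scope
    if seen_int and seen_float:
        return "mixed-integer-float"  # a real float is present: must become float64
    if seen_float:
        return "floating"
    if seen_int:
        # Integers plus nulls only: a candidate for Int64 rather than float64.
        return "integer-na" if seen_null else "integer"
    return "other"


print(sketch_infer_dtype([2, 3, 4]))
print(sketch_infer_dtype([2, 3, 4, float("nan")]))
print(sketch_infer_dtype([2, 3, 4.2, float("nan")]))
```

The key point is that a null alone does not force the float path; only an actual non-null float does.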
This will then allow us to convert integer columns to Int64, rather than coercing them to float64, when nulls are added to them; this is pretty common in indexing and setting operations.
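For illustration, the coercion in question can be reproduced with reindex, which introduces a null for a missing label. This is only a sketch of the existing behaviour next to the nullable dtype; reindex is my choice of example and the variable names are made up.

```python
import pandas as pd

s_numpy = pd.Series([1, 2, 3])                    # default NumPy integer dtype
s_nullable = pd.Series([1, 2, 3], dtype="Int64")  # nullable integer dtype

# Label 3 does not exist, so both results gain one missing value.
grown_numpy = s_numpy.reindex([0, 1, 2, 3])
grown_nullable = s_nullable.reindex([0, 1, 2, 3])

print(grown_numpy.dtype)     # float64: the integers were coerced to hold NaN
print(grown_nullable.dtype)  # Int64: the integers are kept, the gap is <NA>
```

The longer-term goal described here would make the first case behave more like the second.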
Secondly, we can then enable .to_numeric to infer to integer-na (or unsigned-na) and the corresponding dtypes (#26272).
Finally, we could support coercing object dtypes that hold only integers and nulls to Int64 (#27267 for .explode() and .infer_objects()).
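As a point of reference, the target of that coercion can already be reached with an explicit cast. The sketch below uses astype("Int64") as a stand-in for what the inference would do automatically; it does not show .explode() or .infer_objects() behaviour, and the variable names are illustrative.

```python
import pandas as pd

# An object-dtype Series holding only integers and a null.
obj = pd.Series([1, 2, None], dtype=object)

# Explicit cast to the nullable integer dtype; the proposal is about
# inferring this result instead of requiring the cast.
converted = obj.astype("Int64")

print(obj.dtype)
print(converted.dtype)
print(converted.isna().tolist())
```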
This issue itself is only a very minor user-facing change (to infer_dtype itself).