COMPAT: clarify Index integer conversions when dtype is specified in construction #15187

Closed
@jreback

xref #15162

[8] and [9] show our current model: we make an effort to convert to the specified dtype, but coerce to an available type if the data is not valid for that dtype.

Ideally we would also be consistent with #15145, i.e. Series construction with a specified dtype. Note that we upcast to the available Index types but don't do this for Series, e.g. [18].

So we should be consistent on this.

In [8]: Index([np.nan],dtype='int64')
Out[8]: Float64Index([nan], dtype='float64')

In [9]: Index([np.nan],dtype='uint64')
Out[9]: Float64Index([nan], dtype='float64')

In [10]: Index([np.iinfo(np.int64).max-1],dtype='int64')
Out[10]: Int64Index([9223372036854775806], dtype='int64')

In [11]: Index([np.iinfo(np.int64).max-1],dtype='uint64')
Out[11]: UInt64Index([9223372036854775806], dtype='uint64')

# I guess this should convert to Float64Index?
In [12]: Index([np.iinfo(np.uint64).max-1],dtype='int64')
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-12-f2ec7d3c38a4> in <module>()
----> 1 Index([np.iinfo(np.uint64).max-1],dtype='int64')

/Users/jreback/pandas/pandas/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    318             # other iterable of some kind
    319             subarr = _asarray_tuplesafe(data, dtype=object)
--> 320             return Index(subarr, dtype=dtype, copy=copy, name=name, **kwargs)
    321 
    322     """

/Users/jreback/pandas/pandas/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    199                         inferred = lib.infer_dtype(data)
    200                         if inferred == 'integer':
--> 201                             data = np.array(data, copy=copy, dtype=dtype)
    202                         elif inferred in ['floating', 'mixed-integer-float']:
    203 

OverflowError: Python int too large to convert to C long

In [13]: Index([np.iinfo(np.uint64).max-1],dtype='uint64')
Out[13]: UInt64Index([18446744073709551614], dtype='uint64')

In [14]: Index([-1], dtype='int64')
Out[14]: Int64Index([-1], dtype='int64')

# this looks wrong
In [15]: Index([-1], dtype='uint64')
Out[15]: UInt64Index([18446744073709551615], dtype='uint64')

# we already do this kind of same-dtype upcasting (this is correct and a good thing)
In [18]: Index([np.iinfo(np.int32).max+1], dtype='int64')
Out[18]: Int64Index([2147483648], dtype='int64')
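
The two questionable cases, [12] and [15], both come down to a value falling outside the range of the requested integer dtype. A minimal sketch of a range check that could run before the cast, assuming plain NumPy only; `fits_integer_dtype` is a hypothetical helper for illustration, not an existing pandas function, and what to do when it returns False (raise, or fall back to another Index type) is left open:

```python
import numpy as np


def fits_integer_dtype(values, dtype):
    """Hypothetical helper: True if every value is a non-NaN number
    inside the representable range of the integer ``dtype``."""
    info = np.iinfo(np.dtype(dtype))
    for v in values:
        # NaN cannot be stored in any integer dtype (cases [8] and [9]).
        if isinstance(v, float) and np.isnan(v):
            return False
        # Out-of-range values would overflow or wrap (cases [12] and [15]).
        if not (info.min <= v <= info.max):
            return False
    return True


big = np.iinfo(np.uint64).max - 1  # the value used in [12] and [13]

print(fits_integer_dtype([big], "int64"))      # False, [12] overflows
print(fits_integer_dtype([big], "uint64"))     # True,  [13] is fine
print(fits_integer_dtype([-1], "int64"))       # True,  [14] is fine
print(fits_integer_dtype([-1], "uint64"))      # False, [15] would wrap
print(fits_integer_dtype([np.nan], "int64"))   # False, [8] needs float64

# The wrapped value in Out[15] is what an unchecked int64 -> uint64 cast gives.
wrapped = np.array([-1], dtype="int64").astype("uint64")
print(int(wrapped[0]))  # 18446744073709551615
```

With a check like this, [15] would be rejected or coerced in the same way as [12] instead of silently wrapping.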

Labels: Compat (pandas objects compatibility with NumPy or Python functions), Dtype Conversions (unexpected or buggy dtype conversions)