COMPAT: create UInt64Block #15145

Closed
@jreback

xref #14937 (comment)

A number of indexing / conversion issues arise because we treat uint as a plain int rather than as a sub-class. If we make UIntBlock a sub-class of IntBlock, I think we can easily handle this with a few small overrides, for instance to check for negative values when indexing.
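To illustrate the sub-class idea, here is a minimal sketch. The `IntBlock` and `UIntBlock` classes below are toy stand-ins written for this example, not the actual pandas internals, and the method names `_can_hold_element` and `setitem` are assumptions about how such an override might be shaped. Raising on a negative value is one possible behaviour, chosen only to keep the example short.

```python
import numpy as np


class IntBlock:
    """Toy stand-in for a block of int64 values (hypothetical, not pandas)."""

    dtype = np.dtype("int64")

    def __init__(self, values):
        self.values = np.asarray(values, dtype=self.dtype)

    def _can_hold_element(self, element):
        # Accept Python and NumPy integers, but not booleans.
        return isinstance(element, (int, np.integer)) and not isinstance(
            element, (bool, np.bool_)
        )

    def setitem(self, indexer, value):
        if not self._can_hold_element(value):
            raise ValueError(f"cannot store {value!r} in a {self.dtype} block")
        self.values[indexer] = value


class UIntBlock(IntBlock):
    """Unsigned variant: same behaviour plus one small override."""

    dtype = np.dtype("uint64")

    def _can_hold_element(self, element):
        # The only difference from IntBlock: negative values are rejected
        # instead of silently wrapping around to 18446744073709551615.
        return super()._can_hold_element(element) and element >= 0


signed = IntBlock([0, 1, 2])
signed.setitem(1, -1)          # fine for a signed block

unsigned = UIntBlock([1, 2, 3])
try:
    unsigned.setitem(1, -1)    # rejected by the override
except ValueError as exc:
    print(exc)
```

The point of the design is that everything else stays inherited from the signed block, so only the checks that differ for unsigned data need to be written.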

In [1]: df = pd.DataFrame({'A' : np.array([1,2,3],dtype='uint64'), 'B': range(3)})

In [2]: df
Out[2]: 
   A  B
0  1  0
1  2  1
2  3  2

In [4]: df.dtypes
Out[4]: 
A    uint64
B     int64
dtype: object

Buggy

In [5]: df.iloc[1] = -1

In [6]: df
Out[6]: 
                      A  B
0                     1  0
1  18446744073709551615 -1
2                     3  2

In [7]: df.iloc[1] = np.nan

In [8]: df
Out[8]: 
     A    B
0  1.0  0.0
1  NaN  NaN
2  3.0  2.0

This is correct

In [9]: df.A.astype('uint64')
---------------------------------------------------------------------------
ValueError: Cannot convert non-finite values (NA or inf) to integer

However, this is not

In [10]: df.iloc[1] = -1

In [11]: df
Out[11]: 
     A    B
0  1.0  0.0
1 -1.0 -1.0
2  3.0  2.0

In [12]: df.dtypes
Out[12]: 
A    float64
B    float64
dtype: object

In [13]: df.A.astype('uint64')
Out[13]: 
0                       1
1    18446744073709551615
2                       3
Name: A, dtype: uint64

Construction with invalid values

In [1]: Series([-1], dtype='uint64')
Out[1]: 
0    18446744073709551615
dtype: uint64
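The wrap-around cases above (assignment, `astype`, and construction) all come down to casting a negative value to `uint64` without a check. As a rough sketch of the kind of validation that could run before such a cast, the helper below uses only NumPy. The name `to_uint64_checked` and its error messages are made up for this example and are not an existing pandas or NumPy function.

```python
import numpy as np


def to_uint64_checked(values):
    """Cast to uint64, raising instead of wrapping around (hypothetical helper)."""
    arr = np.asarray(values)

    if arr.dtype.kind == "f":
        # Floats must be finite, whole, and below 2**64 to be representable.
        if not np.isfinite(arr).all():
            raise ValueError("Cannot convert non-finite values (NA or inf) to integer")
        if (arr != np.floor(arr)).any():
            raise ValueError("cannot convert non-integer floats to uint64")
        if (arr >= 2.0**64).any():
            raise ValueError("values too large for uint64")
    elif arr.dtype.kind not in "iu":
        raise TypeError(f"cannot convert dtype {arr.dtype} to uint64")

    # Signed integers and floats may contain negatives; unsigned cannot.
    if arr.dtype.kind != "u" and (arr < 0).any():
        raise ValueError("negative values cannot be represented as uint64")

    return arr.astype("uint64")


print(to_uint64_checked([1, 2, 3]).dtype)

for bad in ([-1], [1.0, -1.0, 3.0], [1.0, np.nan, 3.0]):
    try:
        to_uint64_checked(bad)
    except ValueError as exc:
        print(bad, "->", exc)
```

With a check like this, the `[-1]` and `[1.0, -1.0, 3.0]` inputs raise in the same way the NaN case already does, rather than producing `18446744073709551615`.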

Labels: Bug; Compat (pandas objects compatibility with NumPy or Python functions); Dtype Conversions (unexpected or buggy dtype conversions)
