
Index([1,2,np.nan]).get_indexer([np.nan]) returns wrong value? #7820

Closed
@jankatins


Note: once a fix for this is merged, the corresponding change needs to be made in Categorical as well!

When an Index contains np.nan, get_indexer does not return the position of the NaN but -1:

In[5]: from pandas.core.index import _ensure_index
In[6]: import numpy as np
In[7]: idx = _ensure_index([1,2,3,4, np.nan])
In[8]: np.nan in idx
Out[8]: True
In[9]: idx.get_indexer([np.nan])
Out[9]: array([-1])
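
For illustration only, here is a minimal pure-Python sketch of how a position lookup can miss NaN. It assumes the lookup compares with ==; since NaN never compares equal to itself, such a lookup reports -1 unless NaN is matched explicitly. The helper names (is_nan, naive_get_indexer, nan_aware_get_indexer) are hypothetical and are not pandas APIs, and this is not a claim about how pandas implements get_indexer.

```python
import math


def is_nan(x):
    """True only for float NaN; non-float values are never treated as NaN."""
    return isinstance(x, float) and math.isnan(x)


def naive_get_indexer(values, targets):
    """Position of each target found via ==, or -1 when nothing compares equal."""
    result = []
    for target in targets:
        position = -1
        for i, value in enumerate(values):
            if value == target:
                position = i
                break
        result.append(position)
    return result


def nan_aware_get_indexer(values, targets):
    """Same as above, but a NaN target also matches a NaN value."""
    result = []
    for target in targets:
        position = -1
        for i, value in enumerate(values):
            if value == target or (is_nan(value) and is_nan(target)):
                position = i
                break
        result.append(position)
    return result


values = [1, 2, 3, 4, math.nan]
print(naive_get_indexer(values, [math.nan]))      # [-1]
print(nan_aware_get_indexer(values, [math.nan]))  # [4]
```

The second result, position 4, is what the report above would expect from idx.get_indexer([np.nan]) if NaN is meant to be a findable label.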

I'm not sure whether that's a bug or intended. As a consequence, this (new) test for Categoricals fails:

        # if nan in levels, the proper code should be set!
        cat = pd.Categorical([1,2,3, np.nan], levels=[1,2,3])
        cat.levels = [1,2,3, np.nan]
        cat[1] = np.nan
        exp = np.array([0,3,2,-1])
        self.assert_numpy_array_equal(cat.codes, exp)

Traceback (most recent call last):
  File "C:\data\external\pandas\pandas\tests\test_categorical.py", line 555, in test_set_item_nan
    self.assert_numpy_array_equal(cat.codes, exp)
  File "C:\data\external\pandas\pandas\util\testing.py", line 99, in assert_numpy_array_equal
    raise AssertionError('{0} is not equal to {1}.'.format(np_array, assert_equal))
AssertionError: [ 0 -1  2 -1] is not equal to [ 0  3  2 -1].
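
The two arrays in the assertion error can be reproduced with a toy model. The sketch below assumes that assigning a value looks up its position in the levels and stores that position as the code; whether Categorical actually does this through get_indexer is an assumption here. The helpers code_for and set_item are hypothetical names, not pandas functions.

```python
import math


def _is_nan(x):
    """True only for float NaN."""
    return isinstance(x, float) and math.isnan(x)


def code_for(levels, value, nan_aware):
    """Position of value in levels, or -1 if not found.

    With nan_aware=False only == is used, so NaN is never found.
    """
    for i, level in enumerate(levels):
        if level == value:
            return i
        if nan_aware and _is_nan(level) and _is_nan(value):
            return i
    return -1


def set_item(codes, levels, position, value, nan_aware):
    """Return a copy of codes with codes[position] set to the code of value."""
    new_codes = list(codes)
    new_codes[position] = code_for(levels, value, nan_aware)
    return new_codes


# Codes of [1, 2, 3, NaN] against the original levels [1, 2, 3].
codes = [0, 1, 2, -1]
# Levels after NaN has been appended as a level.
levels = [1, 2, 3, math.nan]

print(set_item(codes, levels, 1, math.nan, nan_aware=False))  # [0, -1, 2, -1]
print(set_item(codes, levels, 1, math.nan, nan_aware=True))   # [0, 3, 2, -1]
```

Under this model, the failing result corresponds to a lookup that cannot find NaN among the levels, and the expected result corresponds to a NaN-aware lookup.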
