
Series.is_unique has errors on objects with __ne__ defined #20661

Closed
@Dr-Irv

Code Sample, a copy-pastable example if possible

In [2]: import pandas as pd
   ...:
   ...: class Foo(object):
   ...:     def __init__(self, val):
   ...:         self._value = val
   ...:
   ...:     def __ne__(self, other):
   ...:         raise Exception("NEQ not supported")
   ...:

In [3]: li = [Foo(i) for i in range(5)]

In [4]: s = pd.Series(li, index=[i for i in range(5)])

In [5]: s.is_unique
Exception ignored in: 'util._checknan'
Traceback (most recent call last):
  File "<ipython-input-2-91ad489145cd>", line 8, in __ne__
Exception: NEQ not supported
Exception ignored in: 'util._checknan'
Traceback (most recent call last):
  File "<ipython-input-2-91ad489145cd>", line 8, in __ne__
Exception: NEQ not supported
Exception ignored in: 'util._checknan'
Traceback (most recent call last):
  File "<ipython-input-2-91ad489145cd>", line 8, in __ne__
Exception: NEQ not supported
Exception ignored in: 'util._checknan'
Traceback (most recent call last):
  File "<ipython-input-2-91ad489145cd>", line 8, in __ne__
Exception: NEQ not supported
Exception ignored in: 'util._checknan'
Traceback (most recent call last):
  File "<ipython-input-2-91ad489145cd>", line 8, in __ne__
Exception: NEQ not supported
Out[5]: True

Problem description

I'm working with a third-party library that has a class that, for good reasons, raises an Exception when its __ne__ method is called. I want to put instances of that class into a pandas Series, and eventually I hope to do this with the new ExtensionArray feature. I uncovered this issue in my debugger (pydev), which threw an exception when I inspected the corresponding Series.

So if a Series holds objects whose __ne__ raises, then is_unique reports "Exception ignored" errors (as shown above), even though it still returns a result. I can't figure out where util._checknan is being called, but I think it is deep in the Cython code.
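One plausible explanation, sketched below in plain Python, is that a NaN check built on the `val != val` idiom calls the object's __ne__. This is an assumption about what util._checknan does, not a confirmed reading of the pandas source; the two helper functions are hypothetical names made up for illustration.

```python
class Foo:
    """Stand-in for the third-party class: any use of != raises."""

    def __init__(self, val):
        self._value = val

    def __ne__(self, other):
        raise Exception("NEQ not supported")


def is_nan_by_self_comparison(val):
    # Hypothetical NaN check (assumption, not the pandas implementation):
    # NaN is the only float that is not equal to itself, so `val != val`
    # detects it, but it also invokes __ne__ on arbitrary objects.
    return val != val


def is_nan_type_checked(val):
    # Hypothetical safer variant: only compare when val is a float,
    # so __ne__ of arbitrary objects is never called.
    return isinstance(val, float) and val != val


foo = Foo(1)

try:
    is_nan_by_self_comparison(foo)
    raised = False
except Exception:
    # In pure Python the exception propagates to the caller.
    raised = True

safe_result = is_nan_type_checked(foo)
nan_result = is_nan_type_checked(float("nan"))
```

In pure Python the exception propagates to the caller. The "Exception ignored in" lines in the session above are consistent with the same exception being raised inside a Cython function that cannot propagate it, so it is printed and swallowed while the caller continues.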

I think this will need to work correctly if I want to use ExtensionArray to hold these kinds of objects.

Expected Output

Out[5]: True

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 402ad45
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.0.dev0+683.g402ad45da.dirty
pytest: 3.4.0
pip: 9.0.1
setuptools: 38.5.1
Cython: 0.25.1
numpy: 1.14.1
scipy: 1.0.0
pyarrow: 0.8.0
xarray: None
IPython: 6.2.1
sphinx: 1.7.1
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: 1.5.1
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.2.0
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.5
pymysql: 0.8.0
psycopg2: None
jinja2: 2.10
s3fs: 0.1.3
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels: Compat (pandas objects compatibility with Numpy or Python functions); Indexing (related to indexing on series/frames, not to indexes themselves)
