BUG: maximum of pd.Series([np.nan],dtype=ordered_category) raises #33450

Closed
@mizuy

Description

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

In [1]: pd.__version__
Out[1]: '1.0.3'

In [2]: pd.Series([np.nan],dtype=pd.CategoricalDtype([0,1],ordered=True)).max()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-5a47d189b696> in <module>
----> 1 pd.Series([np.nan],dtype=pd.CategoricalDtype([0,1],ordered=True)).max()

~/.pyenv/versions/3.7.4/lib/python3.7/site-packages/pandas/core/generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
  11213             return self._agg_by_level(name, axis=axis, level=level, skipna=skipna)
  11214         return self._reduce(
> 11215             f, name, axis=axis, skipna=skipna, numeric_only=numeric_only
  11216         )
  11217

~/.pyenv/versions/3.7.4/lib/python3.7/site-packages/pandas/core/series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   3870
   3871         if isinstance(delegate, Categorical):
-> 3872             return delegate._reduce(name, skipna=skipna, **kwds)
   3873         elif isinstance(delegate, ExtensionArray):
   3874             # dispatch to ExtensionArray interface

~/.pyenv/versions/3.7.4/lib/python3.7/site-packages/pandas/core/arrays/categorical.py in _reduce(self, name, axis, **kwargs)
   2123         if func is None:
   2124             raise TypeError(f"Categorical cannot perform the operation {name}")
-> 2125         return func(**kwargs)
   2126
   2127     @deprecate_kwarg(old_arg_name="numeric_only", new_arg_name="skipna")

~/.pyenv/versions/3.7.4/lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    212                 else:
    213                     kwargs[new_arg_name] = new_arg_value
--> 214             return func(*args, **kwargs)
    215
    216         return cast(F, wrapper)

~/.pyenv/versions/3.7.4/lib/python3.7/site-packages/pandas/core/arrays/categorical.py in max(self, skipna)
   2188         if not good.all():
   2189             if skipna:
-> 2190                 pointer = self._codes[good].max()
   2191             else:
   2192                 return np.nan

~/.pyenv/versions/3.7.4/lib/python3.7/site-packages/numpy/core/_methods.py in _amax(a, axis, out, keepdims, initial, where)
     28 def _amax(a, axis=None, out=None, keepdims=False,
     29           initial=_NoValue, where=True):
---> 30     return umr_maximum(a, axis, None, out, keepdims, initial, where)
     31
     32 def _amin(a, axis=None, out=None, keepdims=False,

ValueError: zero-size array to reduction operation maximum which has no identity

In an older version (0.25.3), the same code returned nan instead of raising an error.

In [10]: pd.__version__
Out[10]: '0.25.3'

In [11]: pd.Series([np.nan],dtype=pd.CategoricalDtype([0,1],ordered=True)).max()
    ...:
Out[11]: nan

Problem description

Because of this behavior, df.groupby().max() fails for me on ordered categorical columns.
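
The report does not include the groupby call itself. As a rough sketch of the situation and one possible workaround, assume a hypothetical frame with columns `key` and `grade`, where one group contains only missing values. The `max_or_nan` helper is illustrative only and has not been verified against 1.0.3:

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" holds only missing values in the ordered column.
grade_dtype = pd.CategoricalDtype(["low", "high"], ordered=True)
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b"],
        "grade": pd.Series([np.nan, np.nan, "low", "high"], dtype=grade_dtype),
    }
)


def max_or_nan(group: pd.Series):
    """Return the group's maximum, or NaN when every value is missing."""
    valid = group.dropna()
    if valid.empty:
        # Nothing left to reduce, so skip the reduction entirely.
        return np.nan
    return valid.max()


# Aggregate with the guarded helper instead of calling .max() directly.
result = df.groupby("key")["grade"].agg(max_or_nan)
```

The helper never asks for the maximum of an empty selection, which is the step that raises in the traceback above.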

Expected Output

The expected output is np.nan, as in 0.25.3.
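
To make the expected behavior concrete, here is a minimal sketch of a guard built only on the public `codes` and `categories` attributes. `safe_categorical_max` is a hypothetical helper name, not part of pandas, and this is not necessarily how the library should fix it:

```python
import numpy as np
import pandas as pd


def safe_categorical_max(values: pd.Categorical, skipna: bool = True):
    """Hypothetical helper: max of an ordered Categorical, NaN if nothing is valid."""
    if not values.ordered:
        raise TypeError("max is only defined for ordered categoricals")
    codes = np.asarray(values.codes)
    good = codes != -1  # code -1 marks a missing value
    if not good.all():
        if not skipna:
            return np.nan
        codes = codes[good]
    if codes.size == 0:
        # Empty or all-missing input: avoid reducing a zero-size array.
        return np.nan
    return values.categories[codes.max()]


dtype = pd.CategoricalDtype([0, 1], ordered=True)
all_missing = pd.Series([np.nan], dtype=dtype)
mixed = pd.Series([0, np.nan, 1], dtype=dtype)

result_missing = safe_categorical_max(all_missing.array)  # NaN instead of ValueError
result_mixed = safe_categorical_max(mixed.array)  # largest valid category
```

The only difference from the code path in the traceback is the size check before `codes.max()`, which is where NumPy raises on a zero-size array.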

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Darwin
OS-release : 18.7.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_US.UTF-8

pandas : 1.0.3
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.0
pip : 20.0.2
setuptools : 40.8.0
Cython : None
pytest : 5.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.2.2
lxml.etree : 4.4.2
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : None
odfpy : None
openpyxl : 3.0.0
pandas_gbq : None
pyarrow : 0.15.0
pytables : None
pytest : 5.2.1
pyxlsb : None
s3fs : 0.2.2
scipy : 1.3.1
sqlalchemy : None
tables : None
tabulate : 0.8.5
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : 1.2.2
numba : None


Labels: Categorical (Categorical Data Type), Regression (Functionality that used to work in a prior pandas version)
