
BUG: Slicing operator has a regression on Python 3.12 for a dataframe with categorical MultiIndex  #58604

Closed
@tvalentyn

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'), ('two', 'a'),
                                   ('two', 'b')])
index = index.set_levels(
    index.levels[0].astype(pd.CategoricalDtype(['one', 'two'])), level=0)
index = index.set_levels(
    index.levels[1].astype(pd.CategoricalDtype(['a', 'b'])), level=1)
s = pd.Series(np.arange(1.0, 5.0), index=index)
df = s.unstack(level=0)

print(df[0:2])

Issue Description

On Python 3.12 or later with pandas 2.1.0 or later, the snippet above raises a KeyError:

Traceback (most recent call last):
  File "/home/valentyn/.pyenv/versions/py312a/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3790, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 152, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 181, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: slice(0, 2, None)
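One Python 3.12 change that may be relevant here, though this report does not confirm it as the cause, is that slice objects became hashable in that release. Code that branches on whether a key is hashable could therefore treat `slice(0, 2, None)` differently on 3.12 than on 3.11. A minimal check using only the standard library (the helper name `slice_is_hashable` is made up for illustration):

```python
import sys


def slice_is_hashable() -> bool:
    """Return True if this interpreter can hash a slice object."""
    try:
        hash(slice(0, 2, None))
    except TypeError:
        # Python 3.11 and earlier: slice objects are unhashable.
        return False
    # Python 3.12 and later: slice objects are hashable.
    return True


if __name__ == "__main__":
    print(sys.version_info[:2], slice_is_hashable())
```

Running this on both interpreters shows whether the key in the traceback is hashable in each environment.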

Expected Behavior

On Python 3.11, or with pandas 2.0.x, the same snippet prints:

   one  two
a  1.0  3.0
b  2.0  4.0
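As a possible workaround, not confirmed in this report, the same rows can be requested with explicit positional indexing via `DataFrame.iloc`, which does not go through the `df[...]` key lookup. The sketch below rebuilds the frame from the reproducible example and assumes `iloc` is unaffected on the versions in question:

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the reproducible example.
index = pd.MultiIndex.from_tuples(
    [("one", "a"), ("one", "b"), ("two", "a"), ("two", "b")]
)
index = index.set_levels(
    index.levels[0].astype(pd.CategoricalDtype(["one", "two"])), level=0
)
index = index.set_levels(
    index.levels[1].astype(pd.CategoricalDtype(["a", "b"])), level=1
)
s = pd.Series(np.arange(1.0, 5.0), index=index)
df = s.unstack(level=0)

# Select the first two rows by position instead of using df[0:2].
result = df.iloc[0:2]
print(result)
```

This is only a way to get the expected output while the regression is present; `df[0:2]` should still work.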

Installed Versions

INSTALLED VERSIONS
------------------
commit : d9cdd2e
python : 3.12.3.final.0
python-bits : 64
OS : Linux
OS-release : 6.6.15-2rodete2-amd64
Version : #1 SMP PREEMPT_DYNAMIC Debian 6.6.15-2rodete2 (2024-03-19)
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : C.UTF-8

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 69.5.1
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

Labels: Bug, Needs Triage
