BUG: MultiIndex Slice returns incorrect slice (compared to .xs()) #27591

Closed
@groutr

Code Sample, a copy-pastable example if possible

import pandas as pd
l1 = list('abc')
l2 = [(0, 1), (1, 0)]
l3 = [0, 1]
cols = pd.MultiIndex.from_product([l1, l2, l3], names=['x', 'y', 'z'])
df = pd.DataFrame(index=range(5), columns=cols)

# This returns the expected slice
expected = df.xs((l1[0], l2[0], l3[0]), level=[0, 1, 2], axis=1)
print(expected)

# This returns an empty Series (?!)
empty1 = df.loc[:, (l1[0], l2[0], l3[0])]
# This returns an empty Series also
empty2 = df.loc[:, pd.IndexSlice[l1[0], l2[0], l3[0]]]
print(empty1)
print(empty2)

# if we omit the l2 dim on slice, we get something back
try3 = df.loc[:, pd.IndexSlice[l1[0], :, l3[0]]]
print(try3)

# Oddly enough, we can still set values on the dataframe and it works as expected.
df.loc[:, (l1[0], l2[0], l3[0])] = list(range(10, 15))
print(df)

Problem description

As far as I can understand, all three indexing operations should return the same result. Slicing seems to be the approach favored in the documentation, yet it doesn't return the expected result in this case.
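As a cross-check that does not depend on how .loc interprets a tuple-valued label, the column can also be located by comparing each column label to the full key and then selecting by position. This is only a sketch of a sanity check, not a proposed fix; the names key, positions and selected are introduced here for illustration.

```python
import pandas as pd

l1 = list('abc')
l2 = [(0, 1), (1, 0)]
l3 = [0, 1]
cols = pd.MultiIndex.from_product([l1, l2, l3], names=['x', 'y', 'z'])
df = pd.DataFrame(index=range(5), columns=cols)

# Full column key, including the tuple-valued label on level 'y'.
key = (l1[0], l2[0], l3[0])

# Compare each column label to the key with plain Python tuple equality,
# then select by integer position so .loc never has to interpret the key.
positions = [i for i, col in enumerate(df.columns) if col == key]
selected = df.iloc[:, positions]

print(positions)
print(selected.shape)
```

If exactly one position is found, the column exists under that key and the empty result from .loc comes from how the key is interpreted, not from the data.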

Wrapping the tuple in a Python object doesn't seem to work either; it raises a KeyError.

class MyTuple(object):
    def __init__(self, data):
        self.data = data
    def __eq__(self, other):
        return self.data == other.data
    def __hash__(self):
        return hash(self.data)
    def __repr__(self):
        return repr(self.data)
    def __str__(self):
        return repr(self)

l22 = [MyTuple((0,1)), MyTuple((1, 0))]
df2 = pd.DataFrame(index=range(5), columns=pd.MultiIndex.from_product([l1, l22, l3]))
df2.loc[:, pd.IndexSlice[l1[0], l22[0], l3[0]]]
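One caveat about the wrapper itself: its __eq__ reads other.data unconditionally, so comparing a wrapped tuple with anything that is not a wrapper (a plain tuple, a string label) raises AttributeError instead of returning False. Whether that plays any part in the KeyError is not established here. A more defensive variant, under the hypothetical name SafeTuple, might look like this:

```python
class SafeTuple(object):
    """Hypothetical variant of MyTuple with a defensive __eq__."""
    def __init__(self, data):
        self.data = data
    def __eq__(self, other):
        # Defer to Python's default handling for foreign types
        # instead of raising AttributeError on other.data.
        if not isinstance(other, SafeTuple):
            return NotImplemented
        return self.data == other.data
    def __hash__(self):
        return hash(self.data)
    def __repr__(self):
        return repr(self.data)

same = SafeTuple((0, 1)) == SafeTuple((0, 1))
different_type = SafeTuple((0, 1)) == (0, 1)
print(same, different_type)
```

Returning NotImplemented lets Python fall back to its default comparison, so a mismatch in type evaluates to False without raising.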

Expected Output

The expected result is a single series with 5 NaN values:

       a
  (0, 1)
       0
0    NaN
1    NaN
2    NaN
3    NaN
4    NaN

This is the value of expected in the example above.

Output of pd.show_versions()

Note that pandas 0.24.2 is being used because compatibility with Python 2 is needed.

INSTALLED VERSIONS

commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.10.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 4.3.1
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.2
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: 1.7.0
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: 1.3.1
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves), MultiIndex