Description
Code Sample, a copy-pastable example if possible
import pandas as pd
l1 = list('abc')
l2 = [(0, 1), (1, 0)]
l3 = [0, 1]
cols = pd.MultiIndex.from_product([l1, l2, l3], names=['x', 'y', 'z'])
df = pd.DataFrame(index=range(5), columns=cols)
# This returns the expected slice
expected = df.xs((l1[0], l2[0], l3[0]), level=[0, 1, 2], axis=1)
print(expected)
# This returns an empty Series (?!)
empty1 = df.loc[:, (l1[0], l2[0], l3[0])]
# This returns an empty Series also
empty2 = df.loc[:, pd.IndexSlice[l1[0], l2[0], l3[0]]]
print(empty1)
print(empty2)
# If we replace the l2 level with a full slice, we do get a result back
try3 = df.loc[:, pd.IndexSlice[l1[0], :, l3[0]]]
print(try3)
# Oddly enough, we can still set values on the dataframe and it works as expected.
df.loc[:, (l1[0], l2[0], l3[0])] = list(range(10, 15))
print(df)
Problem description
As far as I can understand, all three indexing operations should return the same result. Slicing seems to be the favored approach in the documentation, yet it doesn't return the expected result in this case.
Wrapping the tuple in a Python object doesn't seem to work either: it raises a KeyError.
class MyTuple(object):
    def __init__(self, data):
        self.data = data

    def __eq__(self, other):
        return self.data == other.data

    def __hash__(self):
        return hash(self.data)

    def __repr__(self):
        return repr(self.data)

    def __str__(self):
        return repr(self)

l22 = [MyTuple((0,1)), MyTuple((1, 0))]
df2 = pd.DataFrame(index=range(5), columns=pd.MultiIndex.from_product([l1, l22, l3]))
df.loc[:, pd.IndexSlice[l1[0], l22[0], l3[0]]]
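A possible way to select the column without going through label resolution in `.loc` is a boolean mask built by comparing the column labels in plain Python. This is only a sketch: it assumes that iterating `df.columns` yields one tuple per column, and the names `key`, `mask`, and `selected` are introduced here for illustration.

```python
import pandas as pd

l1 = list("abc")
l2 = [(0, 1), (1, 0)]
l3 = [0, 1]
cols = pd.MultiIndex.from_product([l1, l2, l3], names=["x", "y", "z"])
df = pd.DataFrame(index=range(5), columns=cols)

key = (l1[0], l2[0], l3[0])

# Compare each column label to the key with ordinary tuple equality,
# so the tuple-valued level entry is never interpreted as a nested key.
mask = [col == key for col in df.columns]

# A boolean list is a valid column selector for .loc.
selected = df.loc[:, mask]
print(selected.shape)
```

Note that this returns a one-column DataFrame rather than a Series, so it is not a drop-in replacement for the lookups above.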
Expected Output
The expected result is a single series with 5 NaN values:
a
(0, 1)
0
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
This is the value of expected in the example above.
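The properties of that result can also be written down as checks. A minimal sketch, assuming the column is fetched by position so that the check does not itself depend on the label lookup in question; `column_by_position` is a hypothetical helper, not a pandas function:

```python
import pandas as pd

l1 = list("abc")
l2 = [(0, 1), (1, 0)]
l3 = [0, 1]
cols = pd.MultiIndex.from_product([l1, l2, l3], names=["x", "y", "z"])
df = pd.DataFrame(index=range(5), columns=cols)

key = (l1[0], l2[0], l3[0])


def column_by_position(frame, column_key):
    """Hypothetical helper: find the column's position, then use iloc."""
    position = list(frame.columns).index(column_key)
    return frame.iloc[:, position]


result = column_by_position(df, key)

# The result should be a single Series of 5 NaN values named after the key.
assert len(result) == 5
assert result.isna().all()
assert result.name == key
```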
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.10.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.2
pytest: 4.3.1
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.2
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: 1.7.0
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: 1.3.1
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None