BUG: __getitem__ on MultiIndex with empty slice raises #15454

Closed
@TomAugspurger

Description

Code Sample, a copy-pastable example if possible

In [26]: df = pd.DataFrame(0, index=range(2), columns=pd.MultiIndex.from_product([[1], [2]]))

In [27]: df
Out[27]:
   1
   2
0  0
1  0

In [28]: df[[]]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-28-13353c6d99d3> in <module>()
----> 1 df[[]]

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/frame.py in __getitem__(self, key)
   2014         if isinstance(key, (Series, np.ndarray, Index, list)):
   2015             # either boolean or fancy integer index
-> 2016             return self._getitem_array(key)
   2017         elif isinstance(key, DataFrame):
   2018             return self._getitem_frame(key)

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/frame.py in _getitem_array(self, key)
   2058             return self.take(indexer, axis=0, convert=False)
   2059         else:
-> 2060             indexer = self.loc._convert_to_indexer(key, axis=1)
   2061             return self.take(indexer, axis=1, convert=True)
   2062

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1216                 # this is not the most robust, but...
   1217                 if (isinstance(labels, MultiIndex) and
-> 1218                         not isinstance(objarr[0], tuple)):
   1219                     level = 0
   1220                     _, indexer = labels.reindex(objarr, level=level)

IndexError: index 0 is out of bounds for axis 0 with size 0

Problem description

This is inconsistent with our other indexers, which all return an empty DataFrame or Series when given an empty list:

  1. Regular index in the columns or index
In [30]: df.T[[]]
Out[30]:
Empty DataFrame
Columns: []
Index: [(1, 2)]
In [32]: df.loc[[]]
Out[32]:
Empty DataFrame
Columns: [(1, 2)]
Index: []
  2. MultiIndex in the index with .loc
In [34]: df.T.loc[[]]
Out[34]:
Empty DataFrame
Columns: [0, 1]
Index: []
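
To make the comparison easier to rerun, here is a minimal sketch that records the shape returned by each empty-list indexer above, plus the failing `df[[]]` case. The helper `empty_selection_shape` and the `shapes` dict are hypothetical names used only for illustration. The `df[[]]` entry is not given an expected value, because its result depends on whether the installed pandas still has this bug.

```python
import pandas as pd

# Same frame as in the report: MultiIndex in the columns, plain index on the rows.
df = pd.DataFrame(0, index=range(2), columns=pd.MultiIndex.from_product([[1], [2]]))


def empty_selection_shape(select):
    """Hypothetical helper: return the result's shape, or the exception name if it raises."""
    try:
        return select().shape
    except Exception as exc:  # deliberately broad: we only want to report what happened
        return type(exc).__name__


shapes = {
    # Regular index in the columns (after transposing) and in the rows.
    "df.T[[]]": empty_selection_shape(lambda: df.T[[]]),
    "df.loc[[]]": empty_selection_shape(lambda: df.loc[[]]),
    # MultiIndex in the index with .loc.
    "df.T.loc[[]]": empty_selection_shape(lambda: df.T.loc[[]]),
    # The case from this report: MultiIndex in the columns with __getitem__.
    "df[[]]": empty_selection_shape(lambda: df[[]]),
}

for label, result in shapes.items():
    print(label, "->", result)
```

The first three entries correspond to the `Empty DataFrame` outputs shown above; on an affected build the last entry would report `IndexError` instead of a shape.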

A probably related bug: while df.loc[:, []] works as expected, the corresponding __setitem__ raises the same error:

In [65]: df.loc[:, []]  # OK
Out[65]:
Empty DataFrame
Columns: []
Index: [0, 1]
In [66]: df.loc[:, []] = 10  # should be a no-op really
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-66-6cd48eddd6a3> in <module>()
----> 1 df.loc[:, []] = 10

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in __setitem__(self, key, value)
    146         else:
    147             key = com._apply_if_callable(key, self.obj)
--> 148         indexer = self._get_setitem_indexer(key)
    149         self._setitem_with_indexer(indexer, value)
    150

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _get_setitem_indexer(self, key)
    125         if isinstance(key, tuple):
    126             try:
--> 127                 return self._convert_tuple(key, is_setter=True)
    128             except IndexingError:
    129                 pass

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _convert_tuple(self, key, is_setter)
    192                 if i >= self.obj.ndim:
    193                     raise IndexingError('Too many indexers')
--> 194                 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
    195                 keyidx.append(idx)
    196         return tuple(keyidx)

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1216                 # this is not the most robust, but...
   1217                 if (isinstance(labels, MultiIndex) and
-> 1218                         not isinstance(objarr[0], tuple)):
   1219                     level = 0
   1220                     _, indexer = labels.reindex(objarr, level=level)

IndexError: index 0 is out of bounds for axis 0 with size 0
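
Both tracebacks end at the same check, `not isinstance(objarr[0], tuple)`, which indexes into the key before confirming it has any elements. The sketch below isolates that pattern in plain Python and shows one possible guard. The function names `unguarded_check` and `guarded_check` are hypothetical and this is not the actual pandas fix; it only illustrates why an empty key fails and how a length test avoids it.

```python
def unguarded_check(objarr):
    """Mirror of the failing condition: assumes objarr has at least one element."""
    return not isinstance(objarr[0], tuple)


def guarded_check(objarr):
    """Hypothetical guard: test the length first so an empty key never hits objarr[0]."""
    return len(objarr) > 0 and not isinstance(objarr[0], tuple)


# An empty key trips the unguarded version. With a plain list the message differs
# from the NumPy one in the traceback, but the exception type is the same.
try:
    unguarded_check([])
    unguarded_raised = False
except IndexError:
    unguarded_raised = True

print("unguarded raised IndexError:", unguarded_raised)
print("guarded, empty key:", guarded_check([]))
print("guarded, scalar labels:", guarded_check([1]))
print("guarded, tuple labels:", guarded_check([(1, 2)]))
```

With a guard like this, an empty key skips the level-0 special case and can fall through to whatever generic list-like handling follows, which is the path the other indexers appear to take.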

Expected Output

In [47]: pd.DataFrame([], index=[0, 1])
Out[47]:
Empty DataFrame
Columns: []
Index: [0, 1]

I think that's right: all the columns should be dropped.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 0ceb40f
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.0+466.g0ceb40fde.dirty
pytest: 3.0.5
pip: 9.0.1
setuptools: 32.3.0
Cython: 0.25.2
numpy: 1.12.0
scipy: 0.18.1
xarray: None
IPython: 5.2.2
sphinx: 1.5.2
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.1
feather: None
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: 0.0.8
pandas_datareader: None
