Description
Code Sample, a copy-pastable example if possible
In [2]: df = pd.DataFrame(np.arange(16).reshape(4, 4), index=pd.MultiIndex.from_product([[1, 2], ['a', 'b']]), columns=['a', 'b', 'c', 'd'])
In [3]: df.loc[2, 'a'] # select a row: good
Out[3]:
a 8
b 9
c 10
d 11
Name: (2, a), dtype: int64
In [4]: df.loc[2, 'c'] # select a (part of) col: guessing game, but I understand it is a feature
Out[4]:
a 10
b 14
Name: c, dtype: int64
In [5]: df.loc[2, 'e'] = -1 # now there is no column: add a row?
In [6]: df # ... nope, still adds a column
Out[6]:
      a   b   c   d    e
1 a   0   1   2   3  NaN
  b   4   5   6   7  NaN
2 a   8   9  10  11 -1.0
  b  12  13  14  15 -1.0
In [7]: df.loc[3, 'f'] = -2 # what if the row label is entirely missing?
In [8]: df # still adds a row _and_ a col
Out[8]:
        a     b     c     d    e    f
1 a   0.0   1.0   2.0   3.0  NaN  NaN
  b   4.0   5.0   6.0   7.0  NaN  NaN
2 a   8.0   9.0  10.0  11.0 -1.0  NaN
  b  12.0  13.0  14.0  15.0 -1.0  NaN
3     NaN   NaN   NaN   NaN  NaN -2.0
Problem description
In general, if `df.index` is a `MultiIndex`, pandas interprets the syntax `df.loc[a, b]` as `df.loc[(a,b),:]`.
`Out[4]` is (debatable, but) understandable: in the absence of the desired row, and in the presence of a column with that name, pandas interprets the key as `df.loc[(a,), b]`.
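The two readings described above can be checked side by side. This is a minimal sketch that rebuilds the frame from `In [2]`; the variable names are only illustrative, and `df.xs(2, level=0)["c"]` is used here as a close stand-in for the column reading rather than as the exact call pandas makes internally.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=pd.MultiIndex.from_product([[1, 2], ["a", "b"]]),
    columns=["a", "b", "c", "d"],
)

# (2, 'a') is a complete row key, so the short form selects that row,
# exactly like the fully spelled-out tuple with a column slice.
row_short = df.loc[2, "a"]
row_explicit = df.loc[(2, "a"), :]

# (2, 'c') is not a row key, but 'c' is a column, so the short form
# falls back to "rows under 2, column 'c'" (compare Out[4]).
col_short = df.loc[2, "c"]
col_explicit = df.xs(2, level=0)["c"]  # assumed close equivalent
```

The same two-element key therefore means "one row" or "part of one column" depending on which labels happen to exist.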
However, there is no reason why `In [5]` and `In [7]` should add a column: since the index takes priority when the labels are present, the same should happen when they are absent.
Somewhat related to #17024.
Expected Output
In [8]: df
Out[8]:
        a     b     c     d
1 a   0.0   1.0   2.0   3.0
  b   4.0   5.0   6.0   7.0
2 a   8.0   9.0  10.0  11.0
  b  12.0  13.0  14.0  15.0
  e  -1.0  -1.0  -1.0  -1.0
3 f  -2.0  -2.0  -2.0  -2.0
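For comparison, the rows in the expected output can already be obtained by writing the key as a full tuple with an explicit column slice, which leaves no room for the column reading. This is a sketch of a workaround, not of the proposed fix, and the resulting dtypes may differ from the floats shown above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=pd.MultiIndex.from_product([[1, 2], ["a", "b"]]),
    columns=["a", "b", "c", "d"],
)

# A complete (level 0, level 1) tuple plus ':' can only refer to a row,
# so each assignment enlarges the index instead of adding a column.
df.loc[(2, "e"), :] = -1
df.loc[(3, "f"), :] = -2

print(df)
```

New rows are appended at the end of the frame, which here happens to give the same row order as the expected output.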
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8
pandas: 0.23.0.dev0+42.g93033151a
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.7.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1