Closed
Description
Dropping rows from a MultiIndex DataFrame removes the rows but leaves the dropped key values in the MultiIndex levels. I don't think this is intended behavior. This is pandas 0.15.1.
In [33]: s = pd.Series(np.random.randn(8), index=arrays)
In [34]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
In [35]: df
Out[35]:
                0         1         2         3
bar one -0.236317  0.105585 -0.611251  0.016047
    two  0.296703 -2.415282 -0.595308  0.388648
baz one  1.388280  0.261497  0.717756  0.407535
    two  0.379969 -0.592787  1.093628  0.082563
foo one  0.100581 -0.186784 -0.150018 -0.032548
    two  2.009603 -0.516111  0.801641  0.693063
qux one -0.194570  0.952694  0.913118  0.299275
    two  1.716092  0.465544  1.453519 -0.679434
In [36]: df.index
Out[36]:
MultiIndex(levels=[[u'bar', u'baz', u'foo', u'qux'], [u'one', u'two']],
labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1]])
In [37]: df.drop('two', level=1, axis=0,inplace=True)
In [38]: df
Out[38]:
                0         1         2         3
bar one -0.236317  0.105585 -0.611251  0.016047
baz one  1.388280  0.261497  0.717756  0.407535
foo one  0.100581 -0.186784 -0.150018 -0.032548
qux one -0.194570  0.952694  0.913118  0.299275
In [39]: df.index
Out[39]:
MultiIndex(levels=[[u'bar', u'baz', u'foo', u'qux'], [u'one', u'two']],
labels=[[0, 1, 2, 3], [0, 0, 0, 0]])
I noticed this because when I then create a panel from the DataFrame, the minor axis still includes 'two':
In [40]: panel = df.to_panel()
In [41]: panel
Out[41]:
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 4 (major_axis) x 2 (minor_axis)
Items axis: 0 to 3
Major_axis axis: bar to qux
Minor_axis axis: one to two
I also noticed the same behavior when I used query on a MultiIndex DataFrame and then created a panel from the result. Is there a way to avoid having to drop the extra slice from the panel afterwards?
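
For anyone hitting the same thing, here is a minimal sketch of two possible workarounds. It is not a statement about what pandas intends. Assumptions: `arrays` is not shown above, so it is rebuilt here to match the printed index, and the names `dropped`, `cleaned` and `rebuilt` are just illustrative. Option 1 relies on `MultiIndex.remove_unused_levels()`, which as far as I know only exists in pandas releases newer than the one in this report. Option 2 rebuilds the index from its tuples and should not depend on that method.

```python
import numpy as np
import pandas as pd

# Assumption: `arrays` is not shown in the report; this mirrors the printed index.
arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
index = pd.MultiIndex.from_arrays(arrays)
df = pd.DataFrame(np.random.randn(8, 4), index=index)

dropped = df.drop("two", level=1, axis=0)

# The rows are gone, but the stored level still lists the unused key 'two'.
stale_levels = list(dropped.index.levels[1])
# The values actually used by the remaining rows.
used_values = list(dropped.index.get_level_values(1).unique())

# Option 1 (newer pandas only): prune level values that no row refers to.
cleaned = dropped.copy()
cleaned.index = dropped.index.remove_unused_levels()

# Option 2: rebuild the index from its tuples, so levels are re-derived
# from the rows that are actually present.
rebuilt = dropped.copy()
rebuilt.index = pd.MultiIndex.from_tuples(
    dropped.index.tolist(), names=dropped.index.names
)

print(stale_levels, used_values)
print(list(cleaned.index.levels[1]), list(rebuilt.index.levels[1]))
```

Either way the data is untouched and only the index metadata changes, so anything that sizes itself from the stored levels should then see only the keys that are still in use.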