BUG: sort_index/sortlevel fails MultiIndex after columns are added. #8017

Closed
@8one6

I have a DataFrame with a MultiIndex on the columns. The first level of the MultiIndex contains strings and the second contains floats (the problem persists if the second level is ints). I add a column whose key should not come last when the columns are sorted, and then I sort the DataFrame by its columns. The result is not sorted: the new column stays at the end. The behavior is fine if the columns are a plain Index, even after adding columns. The sort also works in the MultiIndex case as long as no columns have been added since the DataFrame was created.

MWE:

import pandas as pd
import numpy as np

np.random.seed(0)
data = np.random.randn(3,4)

df_multi_float = pd.DataFrame(data, index=list('def'), columns=pd.MultiIndex.from_tuples([('red', i) for i in [1., 3., 2., 5.]]))

print(df_multi_float)

#OUTPUT
        red                              
          1         3         2         5
d  1.764052  0.400157  0.978738  2.240893
e  1.867558 -0.977278  0.950088 -0.151357
f -0.103219  0.410599  0.144044  1.454274

This sorts just fine as it is now:

print(df_multi_float.sort_index(axis=1))

#OUTPUT
        red                              
          1         2         3         5
d  1.764052  0.978738  0.400157  2.240893
e  1.867558  0.950088 -0.977278 -0.151357
f -0.103219  0.144044  0.410599  1.454274

But if I add a column to this DataFrame and then show it sorted, I get what looks to be a wrong result (the new column remains last, rather than being placed second-to-last as it should be):

df_multi_float[('red', 4.0)] = 'world'

print(df_multi_float.sort_index(axis=1))

#OUTPUT
        red                                  red
          1         2         3         5      4
d  1.764052  0.978738  0.400157  2.240893  world
e  1.867558  0.950088 -0.977278 -0.151357  world
f -0.103219  0.144044  0.410599  1.454274  world
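
One possible workaround, offered as a sketch rather than a confirmed fix: sort the column tuples in plain Python and reindex on the result, which avoids `sort_index` altogether. The helper name `sort_columns_by_value` is hypothetical and not part of pandas. The `print` call uses the function form, so it runs on Python 3 as well.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
data = np.random.randn(3, 4)
cols = pd.MultiIndex.from_tuples([("red", i) for i in [1.0, 3.0, 2.0, 5.0]])
df = pd.DataFrame(data, index=list("def"), columns=cols)
df[("red", 4.0)] = "world"  # column added after construction, as in the report


def sort_columns_by_value(frame):
    """Hypothetical helper: order columns by their tuple values, not by sort_index."""
    ordered = pd.MultiIndex.from_tuples(
        sorted(frame.columns.tolist()), names=frame.columns.names
    )
    return frame.reindex(columns=ordered)


sorted_df = sort_columns_by_value(df)
print(sorted_df)
```

This assumes the column keys are mutually comparable (here, a string followed by a float in every tuple); mixed key types within a level would make `sorted` raise.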

I'm able to produce this behavior on two systems. The first runs Pandas 0.14.0 and Numpy 1.8.1 and the second runs Pandas 0.14.1 and Numpy 1.8.2. This issue is described here: http://stackoverflow.com/questions/25287130/pandas-sort-index-fails-with-multiindex-containing-floats-as-one-level-when-col?noredirect=1#comment39408150_25287130
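
To check whether a given installation is affected, a small self-contained script along these lines may help. It prints the library versions and compares the column order returned by `sort_index(axis=1)` with the order obtained by sorting the column tuples directly. The variable names are illustrative, and zeros stand in for the random data because only the column order matters here.

```python
import numpy as np
import pandas as pd

print("pandas", pd.__version__, "numpy", np.__version__)

cols = pd.MultiIndex.from_tuples([("red", i) for i in [1.0, 3.0, 2.0, 5.0]])
df = pd.DataFrame(np.zeros((3, 4)), index=list("def"), columns=cols)
df[("red", 4.0)] = "world"  # add a column after the frame was created

# Column order produced by pandas versus a plain sort of the column tuples.
result = list(df.sort_index(axis=1).columns)
expected = sorted(df.columns.tolist())

is_sorted = result == expected
print("sort_index(axis=1) returns sorted columns:", is_sorted)
```

If the script reports `False`, the installation shows the behavior described in this issue.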

Labels: Bug, Indexing (Related to indexing on series/frames, not to indexes themselves), MultiIndex
