DOC: MultiIndex sort docs #13108

Closed
@max-sixty

I found this confusing, despite being a moderately competent pandas user:

from http://pandas.pydata.org/pandas-docs/stable/advanced.html#the-need-for-sortedness-with-multiindex

Caveat emptor: the present implementation of MultiIndex requires that the labels be sorted for some of the slicing / indexing routines to work correctly. You can think about breaking the axis into unique groups, where at the hierarchical level of interest, each distinct group shares a label, but no two have the same label. However, the MultiIndex does not enforce this: you are responsible for ensuring that things are properly sorted. There is an important new method sort_index to sort an axis within a MultiIndex so that its labels are grouped and sorted by the original ordering of the associated factor at that level. Note that this does not necessarily mean the labels will be sorted lexicographically!

Is it right that calling sort_index doesn't guarantee lexicographic sortedness? If so, how can it be guaranteed?
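To make the question concrete, here is a minimal sketch of how sortedness can be checked before and after `sort_index`. It assumes a recent pandas release and plain string/integer labels; the frame and the names `outer`, `inner`, and `value` are made up for illustration. With categorical levels the sort follows the category order, which is presumably what the quoted "original ordering of the associated factor" refers to.

```python
import pandas as pd

# Hypothetical two-level index, deliberately built out of order.
index = pd.MultiIndex.from_tuples(
    [("b", 2), ("a", 1), ("b", 1), ("a", 2)], names=["outer", "inner"]
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

# The index as constructed is not sorted.
unsorted_is_monotonic = df.index.is_monotonic_increasing

# sort_index() with no arguments sorts on all levels, outermost first.
sorted_df = df.sort_index()
sorted_is_monotonic = sorted_df.index.is_monotonic_increasing

print(unsorted_is_monotonic, sorted_is_monotonic)
print(list(sorted_df.index))
```

For plain labels like these, checking `is_monotonic_increasing` after sorting is one way to confirm the index is in the state the slicing routines expect.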

And this:

Some indexing will work even if the data are not sorted, but will be rather inefficient and will also return a copy of the data rather than a view:

...seems to contradict this:

Thus, if you try to index at a depth at which the index is not sorted, it will raise an exception.

...neither of which seems tightly consistent with the passage above.

Am I misunderstanding something?
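One possible reading that reconciles the two quoted statements, sketched under the assumption of a recent pandas release (behavior in the version the docs describe may differ): a plain label lookup on an unsorted MultiIndex succeeds, while a tuple slice across it raises a `KeyError` subclass until the index is sorted. The frame below is hypothetical.

```python
import pandas as pd

# Hypothetical two-level index, deliberately built out of order.
index = pd.MultiIndex.from_tuples(
    [("b", 2), ("a", 1), ("b", 1), ("a", 2)], names=["outer", "inner"]
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

# Selecting a single first-level label works even though the index is unsorted.
rows_a = df.loc["a"]

# Slicing between two full tuples needs a sorted index; on recent pandas this
# is expected to raise UnsortedIndexError, which subclasses KeyError.
try:
    df.loc[("a", 1):("b", 1)]
    slice_failed = False
except KeyError as exc:
    slice_failed = True
    print(type(exc).__name__, exc)

# After sorting, the same slice succeeds.
sliced = df.sort_index().loc[("a", 1):("b", 1)]
print(sliced)
```

If this reading is right, "some indexing will work" would cover the label lookup and "it will raise an exception" would cover the slice, though the docs do not say so explicitly.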
