Description
Hello!
I apologize if this is expected behavior. This is fairly similar to this StackOverflow question.
Code Sample, a copy-pastable example if possible
import pandas as pd
x = pd.Categorical(['apples', 'dairy', 'chicken', 'beef', 'apples', 'dairy', 'chicken'], categories=['apples', 'dairy', 'beef', 'chicken'])
y = pd.Series([1, 2, 1, 2, 1, 2, 1])
z = pd.Series([3, 4, 2, 1, 3, 2, 1])
df = pd.DataFrame({'z': z, 'x': x, 'y': y})
df.set_index(['x', 'y']).sort_index()
df.sort_values('x')
Problem description
I would like to sort and group by a column in a custom order. In the example above, I've listed the categories of a categorical (the underlying values could just as well be strings) in an order that makes intuitive sense: fruits first, followed by dairy, followed by meats.
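To make the intended order concrete, here is a minimal sketch that compares a category-order sort of the column with a plain alphabetical sort of the same values. The names `order`, `by_category` and `alphabetical` are only illustrative and are not part of the original sample.

```python
import pandas as pd

# Desired order: fruits first, then dairy, then meats.
order = ['apples', 'dairy', 'beef', 'chicken']
x = pd.Categorical(
    ['apples', 'dairy', 'chicken', 'beef', 'apples', 'dairy', 'chicken'],
    categories=order,
)
df = pd.DataFrame({'z': [3, 4, 2, 1, 3, 2, 1], 'x': x, 'y': [1, 2, 1, 2, 1, 2, 1]})

# Sorting the column itself follows the category order.
by_category = df.sort_values('x')['x'].tolist()

# Sorting the raw labels as strings gives alphabetical order instead.
alphabetical = sorted(df['x'].tolist())

print(by_category)
print(alphabetical)
```

The first list is the order I would also like to see on the index; the second is what a plain string sort produces.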
Expected Output
When the categorical is in a MultiIndex, set_index seems to coerce the categorical to a string before adding it to the index. It would be nicer if pandas kept the categorical ordering for the index.
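One possible workaround, sketched here under assumptions and not verified against 0.18.1, is to sort on the integer category codes before building the index. That way the row order does not depend on how sort_index treats the categorical level. The helper column `_order` and the names `keyed`, `workaround` and `first_level` are hypothetical.

```python
import pandas as pd

x = pd.Categorical(
    ['apples', 'dairy', 'chicken', 'beef', 'apples', 'dairy', 'chicken'],
    categories=['apples', 'dairy', 'beef', 'chicken'],
)
df = pd.DataFrame({'z': [3, 4, 2, 1, 3, 2, 1], 'x': x, 'y': [1, 2, 1, 2, 1, 2, 1]})

# '_order' is a hypothetical helper column holding each row's category code
# (0 for 'apples', 1 for 'dairy', 2 for 'beef', 3 for 'chicken').
keyed = df.assign(_order=df['x'].cat.codes)

# Sort on the codes, then on y, drop the helper column, and only then set the
# index. set_index keeps the existing row order, so no sort_index call is needed.
workaround = (
    keyed.sort_values(['_order', 'y'])
    .drop('_order', axis=1)
    .set_index(['x', 'y'])
)

first_level = list(workaround.index.get_level_values('x'))
print(first_level)
```

This only fixes the row order of the result. It does not change what dtype the index level ends up with, so a later sort_index call could still reorder the rows.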
Output of pd.show_versions()
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.1
statsmodels: 0.8.0.dev0+7e6b94b
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None