Description
Code Sample, a copy-pastable example if possible
import pandas as pd

df = pd.DataFrame({'Date': pd.date_range(start='2016/1/1', periods=3, freq='W'),
                   'Station': ['Kings Cross', 'Sydney', 'Newtown'],
                   'Hours': ["1PM", "2PM", "3PM"],
                   'Exit': [10, 30, 50], 'Entry': [0, 60, 20]})

# Both grouping columns start out as categoricals.
df.Station = df.Station.astype('category')
df.Hours = df.Hours.astype('category')

# Round trip: move Hours and Station into the columns, then back into the index.
df1 = (df.set_index(['Date', 'Hours', 'Station'])[['Entry', 'Exit']]
         .unstack(['Hours', 'Station'])
         .stack(['Hours', 'Station'], dropna=False)
         .reset_index())

print(df1.info())

# The second assertion fails: Station comes back as object.
assert df.Hours.dtype == df1.Hours.dtype, "Hours dtype has changed"
assert df.Station.dtype == df1.Station.dtype, "Station dtype has changed"
Problem description
Hours and Station are both categorical columns. After the unstack/stack round trip, Hours is still categorical but Station has become object.
If I swap the order of Station and Hours in the unstack call, Station stays categorical and Hours becomes object instead.
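As a possible stopgap while the dtype is lost, the affected columns can be cast back to the dtype they had in the original frame. This is only a sketch: `restore_categories` is a hypothetical helper (not part of pandas), the symptom is simulated with `astype(object)` rather than reproduced through stack/unstack, and it assumes a pandas version that exposes `pd.CategoricalDtype`, which is newer than the 0.19.2 reported here.

```python
import pandas as pd


def restore_categories(original, reshaped):
    """Cast columns of `reshaped` back to the categorical dtype they had in `original`.

    Hypothetical helper for illustration; columns that are missing from
    `original` or were not categorical there are left untouched.
    """
    result = reshaped.copy()
    for col in result.columns:
        if col not in original.columns:
            continue
        dtype = original[col].dtype
        if isinstance(dtype, pd.CategoricalDtype):
            # Re-using the original dtype keeps the same categories and ordering.
            result[col] = result[col].astype(dtype)
    return result


original = pd.DataFrame({'Station': ['Kings Cross', 'Sydney', 'Newtown'],
                         'Hours': ['1PM', '2PM', '3PM']})
original['Station'] = original['Station'].astype('category')
original['Hours'] = original['Hours'].astype('category')

# Simulate the reported symptom: Station has degraded to object.
degraded = original.copy()
degraded['Station'] = degraded['Station'].astype(object)

fixed = restore_categories(original, degraded)
print(fixed.dtypes)
```

In the report above the call would be `restore_categories(df, df1)`. It hides the symptom but does not address why only one of the two levels loses its dtype.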
Expected Output
Hours and Station both remain categorical (preferred), or both fall back to the underlying dtype of their categories.
Output of pd.show_versions()
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.9.4
boto: 2.45.0
pandas_datareader: None