Code Sample, a copy-pastable example if possible
import pandas as pd
from enum import Enum
MyEnum = Enum("MyEnum", "A B")
# Raises TypeError: 'values' is not ordered, please explicitly specify the
# categories order by passing in a categories argument.
df = pd.DataFrame(columns=pd.MultiIndex.from_product(iterables=[MyEnum, [1, 2]]))

# This workaround executes successfully, but...
df = pd.DataFrame(columns=pd.MultiIndex.from_product(iterables=[pd.Series(MyEnum, dtype="category"), [1, 2]]))

# ... this "append" statement then raises the same error.
df.append({(MyEnum.A, 1): "abc", (MyEnum.B, 2): "xyz"}, ignore_index=True)

# This works, but is less desirable (can't pass a dict, need to come up
# with a row indexer, etc.).
df.loc[0, [(MyEnum.A, 1), (MyEnum.B, 2)]] = 'abc', 'xyz'
Problem description
Although Enum members work fine as column labels in a flat index, using them as one of the levels of a MultiIndex raises confusing errors.
The MultiIndex (and the DataFrame) can be created successfully if an (ordered) categorical Series is passed to the constructor. In that case, however, appending rows in the usual way with df.append fails with the same error. New rows can still be created with .loc, but this is less convenient: a dict cannot be passed, and a row indexer has to be supplied.
The whole problem can be avoided by using strings instead of an Enum. Alternatively, one can use an IntEnum, but then the column labels are essentially the underlying integers rather than the names.
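A minimal sketch of the string workaround, assuming the member names ("A", "B") are acceptable as column labels. It builds the frame from a list of row dicts rather than calling df.append (which was removed in pandas 2.0); the variable names `columns`, `row` and `rows` are only illustrative.

```python
import pandas as pd
from enum import Enum

MyEnum = Enum("MyEnum", "A B")

# Use the member names (plain strings) as the first level instead of the members.
columns = pd.MultiIndex.from_product([[m.name for m in MyEnum], [1, 2]])

# Rows are still written in terms of the Enum; only the keys are converted.
row = {(MyEnum.A.name, 1): "abc", (MyEnum.B.name, 2): "xyz"}
rows = [row]

# Build the frame from the collected row dicts; missing cells become NaN.
df = pd.DataFrame(rows, columns=columns)
print(df)
```

The cost is that the column labels are no longer Enum members, so a lookup has to go through `.name`, for example `df[(MyEnum.A.name, 1)]`.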
Since Enum members are fully supported as column labels in a simple index, it seems like a shortcoming that they cannot be used in a MultiIndex.
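My guess at the cause, not verified against the pandas source: the "'values' is not ordered" message suggests that pandas tries to sort the level values, and plain Enum members define no ordering. The standard-library sketch below only illustrates that difference between Enum, IntEnum and strings; `is_sortable` is a hypothetical helper written for this report, not a pandas function.

```python
from enum import Enum, IntEnum

MyEnum = Enum("MyEnum", "A B")
MyIntEnum = IntEnum("MyIntEnum", "A B")


def is_sortable(values):
    """Return True if sorted() accepts the values, False if it raises TypeError."""
    try:
        sorted(values)
    except TypeError:
        return False
    return True


# Plain Enum members do not support "<", so sorting them fails.
print("Enum:", is_sortable(MyEnum))
# IntEnum members compare as their integer values.
print("IntEnum:", is_sortable(MyIntEnum))
# Member names are ordinary strings and sort without problems.
print("names:", is_sortable([m.name for m in MyEnum]))
```

If this is indeed the cause, it would also explain why both the string and the IntEnum workarounds avoid the error.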
Expected Output
>>> df = pd.DataFrame(columns=pd.MultiIndex.from_product(iterables=[MyEnum, [1, 2]]))
>>> df
Empty DataFrame
Columns: [(MyEnum.A, 1), (MyEnum.A, 2), (MyEnum.B, 1), (MyEnum.B, 2)]
Index: []
>>> df.append({(MyEnum.A, 1): "abc", (MyEnum.B, 2): "xyz"}, ignore_index=True)
  MyEnum.A      MyEnum.B
         1    2        1    2
0      abc  NaN      NaN  xyz
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.0
pytest: None
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None