Description
It would be great to generally apply graceful degradation for export of categorical data instead of raising exceptions.
Currently this is only the case for `to_sql` and `to_csv`, where the categories are exported, while `to_pickle` is the only option to persist categorical data.
For Stata and HDF, an exception is raised instead:
- `to_hdf`: `NotImplementedError: cannot store a category dtype`
- `to_stata`: `ValueError: Data type category not currently understood. Please report an error to the developers.`
As long as a backend does not support categoricals or the conversion is not yet implemented, why not generally export the categories as a fallback? With the separately discussed decode method (#8628) this would be easy. If the same rigor (the backend supports the data type natively, or the export fails) were applied to CSV I/O, we could only export string dtypes to CSV.
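To make the fallback concrete, here is a minimal sketch of what "export the categories" could look like before handing a frame to a backend. The helper name `decode_categoricals` is hypothetical and is not the decode method discussed in #8628; it only assumes that converting a categorical column to a plain array yields its category values.

```python
import numpy as np
import pandas as pd


def decode_categoricals(df):
    """Hypothetical helper: return a copy of df in which every categorical
    column is replaced by its plain category values."""
    out = df.copy()
    for col in df.select_dtypes(include="category").columns:
        # np.asarray on a categorical column materializes the category
        # value for each row, dropping the categorical dtype.
        out[col] = np.asarray(df[col])
    return out


df = pd.DataFrame({"grade": pd.Categorical(["a", "b", "a"]), "n": [1, 2, 3]})
decoded = decode_categoricals(df)
```

A backend that cannot store the category dtype could then be given `decoded` instead of `df`, at the cost of losing the categorical metadata (category order and unused categories).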
Thinking one step further, the `to_...` functions could have an optional parameter named something like `convert_cat` with options:
- None: either try to export as a categorical (pickle, potentially HDF, Stata) or raise an exception
- 'category': only export categories (decode method)
- 'code': export s.cat.codes
- 'mapping' or 'emulate': export the codes together with the code:category mapping, either in one/two columns or in a separate table/frame/...
The last option would probably need additional parameters to control the technical implementation (e.g. table name for the mapping or suffixes as in join/merge, ...).
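The options above could be prototyped outside the `to_...` functions roughly as follows. This is only a sketch under stated assumptions: `convert_categoricals` and its return shape (converted frame plus a dict of mapping frames) are hypothetical, and only the parameter name `convert_cat` and its option values come from the proposal.

```python
import numpy as np
import pandas as pd


def convert_categoricals(df, convert_cat=None):
    """Hypothetical pre-export step mirroring the proposed convert_cat options.

    Returns (converted_frame, mappings), where mappings maps a column name
    to a code/category lookup frame (only filled for 'mapping'/'emulate').
    """
    allowed = (None, "category", "code", "mapping", "emulate")
    if convert_cat not in allowed:
        raise ValueError(f"convert_cat must be one of {allowed}")

    out = df.copy()
    mappings = {}
    if convert_cat is None:
        # Leave categoricals untouched; the backend stores them or raises.
        return out, mappings

    for col in df.select_dtypes(include="category").columns:
        if convert_cat == "category":
            # Export the category values only.
            out[col] = np.asarray(df[col])
            continue
        # 'code', 'mapping' and 'emulate' all export the integer codes.
        out[col] = df[col].cat.codes
        if convert_cat in ("mapping", "emulate"):
            cats = list(df[col].cat.categories)
            # Separate lookup frame: one row per category.
            mappings[col] = pd.DataFrame(
                {"code": list(range(len(cats))), "category": cats}
            )
    return out, mappings


df = pd.DataFrame({"color": pd.Categorical(["red", "blue", "red"])})
as_values, _ = convert_categoricals(df, convert_cat="category")
as_codes, _ = convert_categoricals(df, convert_cat="code")
emulated, tables = convert_categoricals(df, convert_cat="mapping")
```

With 'mapping', the caller would write `emulated` and each frame in `tables` to the backend separately; how those lookup tables are named or suffixed is exactly the open question raised above.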