Fix for DataFrames with MultiIndex columns #166
Merged
Calling `fit_transform()` on a `DataFrameMapper` for DataFrames with a multi-level column index often throws the following error:

`TypeError: sequence item 0: expected str instance, tuple found`

I fixed this by mapping the column-name tuples to `str`. This change also fixes the related #143.

Along with this change I have created a test and some fixtures to work with MultiIndex-column DataFrames. Tox tests pass for all supplied virtualenv configurations, but fail in an unrelated place for the latest pandas version (I believe this is being fixed elsewhere).
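For readers unfamiliar with the error, here is a minimal sketch of how a message like the one above can arise and how converting names to `str` avoids it. It uses only pandas and plain Python. The `"_".join(...)` call is an assumption standing in for whatever name-joining happens inside the mapper; it is not the library's actual code.

```python
import pandas as pd

# A small DataFrame whose columns form a two-level MultiIndex.
df = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=pd.MultiIndex.from_tuples([("a", "x"), ("a", "y")]),
)

# Each column name of a MultiIndex is a tuple, not a string.
columns = list(df.columns)

# Joining tuples with str.join fails, because join expects strings.
try:
    "_".join(columns)
except TypeError as exc:
    error_message = str(exc)

# Mapping every name to str first makes the join succeed.
joined = "_".join(str(c) for c in columns)

print(error_message)
print(joined)
```

The stringified names look like `('a', 'x')`, which is why the output column names differ from the original tuples.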
A minor issue remains: the column names of the transformed DataFrame no longer match those of the original DataFrame because of the tuple-to-string conversion. Fixing the error had higher priority for my own use case, but I am still thinking of a way to keep the names the same where possible without breaking other cases (i.e. a simple `eval` won't cut it).
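One possible direction, sketched here purely as an illustration and not as part of this PR, is to remember the original tuples in a lookup table keyed by their string form instead of parsing the strings back. The names `lookup` and `restored` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3]],
    columns=pd.MultiIndex.from_tuples([("a", "x"), ("a", "y"), ("b", "x")]),
)

original = list(df.columns)

# The tuple-to-string conversion described above.
stringified = [str(c) for c in original]

# Hypothetical helper table: string form -> original tuple.
lookup = dict(zip(stringified, original))

# Recover the original names without evaluating any strings.
restored = [lookup[name] for name in stringified]

print(restored == original)
```

This only covers output columns that map one-to-one onto input columns. A transformer that expands one column into several would produce names that are absent from the table, so those cases would still need separate handling.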