Closed
Description
remove_unused_categories gives a wrong result when there is a NaN in the values:
>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(["A", "B", np.nan]).astype("category")
>>> s.cat.remove_unused_categories()
0 A
1 B
2 NaN
dtype: category
Categories (3, object): [B, A, B]
I think the -1 in the codes (NaN) duplicates the last item in the categories.
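To illustrate the suspected mechanism, here is a minimal pure-Python sketch. It is not the pandas implementation: the names categories, codes, naive and fixed are hypothetical stand-ins, and it assumes the unused-category check simply collects the distinct codes and uses them to index the categories. Under that assumption the -1 code wraps around to the last category:

```python
# Hypothetical stand-ins for the categorical's internals (not pandas code).
categories = ["A", "B"]
codes = [0, 1, -1]  # -1 is the sentinel code used for NaN

# Assumed naive approach: treat every distinct code as a "used" category.
used_codes = sorted(set(codes))                # [-1, 0, 1]
naive = [categories[c] for c in used_codes]    # index -1 wraps to the last item

# One possible fix: drop the NaN sentinel before indexing.
valid_codes = [c for c in used_codes if c != -1]
fixed = [categories[c] for c in valid_codes]

print(naive)  # ['B', 'A', 'B']
print(fixed)  # ['A', 'B']
```

The naive result matches the [B, A, B] shown in the output above, which is why the -1 code looks like the likely cause. Filtering it out first is only one way to avoid the duplicate.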