Closed
xref #14351
None of the following merge operations retain the category dtypes. Is this expected? How can I keep them?
Merging on a category type:
Consider the following:
import numpy as np
import pandas as pd

A = pd.DataFrame({'X': np.random.choice(['foo', 'bar'], size=(10,)),
                  'Y': np.random.choice(['one', 'two', 'three'], size=(10,))})
A['X'] = A['X'].astype('category')
B = pd.DataFrame({'X': np.random.choice(['foo', 'bar'], size=(10,)),
                  'Z': np.random.choice(['jjj', 'kkk', 'sss'], size=(10,))})
B['X'] = B['X'].astype('category')
If I do the merge, we end up with:
pd.merge(A, B, on='X').dtypes
X    object
Y    object
Z    object
dtype: object
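One thing that might be worth trying for the key column is to give both frames the exact same categorical dtype before merging, and to re-cast the key afterwards as a fallback. This is only a sketch: I am assuming that differing categories on the two sides play a role, and whether the merge itself preserves the dtype may depend on the pandas version. The name `key_dtype` and the fixed seed are my own additions for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import CategoricalDtype

rng = np.random.RandomState(0)  # fixed seed, only to make the sketch repeatable
A = pd.DataFrame({'X': rng.choice(['foo', 'bar'], size=10),
                  'Y': rng.choice(['one', 'two', 'three'], size=10)})
B = pd.DataFrame({'X': rng.choice(['foo', 'bar'], size=10),
                  'Z': rng.choice(['jjj', 'kkk', 'sss'], size=10)})

# One explicit dtype for the key, so both frames carry identical categories
# even if a random draw happens to miss one of the values.
key_dtype = CategoricalDtype(categories=['bar', 'foo'])
A['X'] = A['X'].astype(key_dtype)
B['X'] = B['X'].astype(key_dtype)

merged = pd.merge(A, B, on='X')
# Fallback re-cast: harmless if the dtype survived the merge, a fix if it did not.
merged['X'] = merged['X'].astype(key_dtype)
print(merged.dtypes)
```

The re-cast after the merge costs an extra pass over the column, but it makes the result independent of how the merge treats categorical keys.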
Merging on a non-category type:
A = pd.DataFrame({'X': np.random.choice(['foo', 'bar'], size=(10,)),
                  'Y': np.random.choice(['one', 'two', 'three'], size=(10,))})
A['Y'] = A['Y'].astype('category')
B = pd.DataFrame({'X': np.random.choice(['foo', 'bar'], size=(10,)),
                  'Z': np.random.choice(['jjj', 'kkk', 'sss'], size=(10,))})
B['Z'] = B['Z'].astype('category')
If I do the merge, we end up with:
pd.merge(A, B, on='X').dtypes
X    object
Y    object
Z    object
dtype: object
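As a stopgap for the "how can I keep them" part, the categorical dtypes can be re-applied after the merge. The sketch below uses a hypothetical helper, `restore_categoricals`, which is not part of pandas: it looks up which columns were categorical in the source frames and casts the merged columns back, taking the union of categories when a column is categorical in more than one frame. It assumes unordered categoricals and does not handle columns renamed by merge suffixes.

```python
import numpy as np
import pandas as pd
from pandas.api.types import CategoricalDtype


def restore_categoricals(merged, *sources):
    """Re-cast columns of `merged` that were categorical in any source frame.

    Hypothetical helper for illustration; not a pandas API.
    """
    out = merged.copy()
    for col in out.columns:
        cats = []
        for src in sources:
            if col in src.columns and isinstance(src[col].dtype, CategoricalDtype):
                # Union of categories, keeping first-seen order.
                cats.extend(c for c in src[col].cat.categories if c not in cats)
        if cats:
            out[col] = out[col].astype(CategoricalDtype(categories=cats))
    return out


rng = np.random.RandomState(0)  # fixed seed, only to make the sketch repeatable
A = pd.DataFrame({'X': rng.choice(['foo', 'bar'], size=10),
                  'Y': rng.choice(['one', 'two', 'three'], size=10)})
B = pd.DataFrame({'X': rng.choice(['foo', 'bar'], size=10),
                  'Z': rng.choice(['jjj', 'kkk', 'sss'], size=10)})
A['Y'] = A['Y'].astype('category')
B['Z'] = B['Z'].astype('category')

plain = pd.merge(A, B, on='X')
merged = restore_categoricals(plain, A, B)
print(merged.dtypes)
```

Here `Y` and `Z` come back as category while `X` keeps whatever dtype the merge produced, since it was not categorical in either input. The values themselves are unchanged; only the dtype is restored.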