Closed
Description
The functions select_dtypes and get_dummies behave strangely and incorrectly on a DataFrame with duplicate column names, as shown below:
In [6]: df
Out[6]:
   col1 col1
0     1    a
1     2    b
In [7]: df.select_dtypes(include=['int'])
Out[7]:
Empty DataFrame
Columns: []
Index: [0, 1]
In [8]: pd.get_dummies(df)
Out[8]:
   col1_('c', 'o', 'l', '1')  col1_('c', 'o', 'l', '1')
0                          1                          1
1                          1                          1
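
For anyone trying to reproduce or work around this, here is a minimal sketch. It assumes df was built roughly as below (the report does not show its construction), and dedupe_columns is a hypothetical helper, not part of pandas. It only detects the duplicate names and renames them before calling the two functions; it does not fix the underlying behavior. It also uses include=['number'] rather than 'int' to avoid platform-dependent integer widths.

```python
import pandas as pd


def dedupe_columns(df):
    """Hypothetical helper: return a copy of df in which repeated
    column names get a numeric suffix (col1, col1_1, col1_2, ...)."""
    seen = {}
    new_names = []
    for name in df.columns:
        count = seen.get(name, 0)
        new_names.append(name if count == 0 else f"{name}_{count}")
        seen[name] = count + 1
    out = df.copy()
    out.columns = new_names
    return out


# Assumed construction of the frame from the report: two columns named "col1".
df = pd.DataFrame([[1, "a"], [2, "b"]], columns=["col1", "col1"])

# Detect the problem up front instead of relying on the downstream output.
has_duplicates = bool(df.columns.duplicated().any())

# Work around it by making the names unique before calling either function.
fixed = dedupe_columns(df)
numeric = fixed.select_dtypes(include=["number"])
dummies = pd.get_dummies(fixed)
```

Note that the suffix scheme could collide with an existing column that is already named col1_1; a real workaround would need to check for that.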