select_dtypes and get_dummies break on duplicate columns #20848

Closed
@kunalgosar

Description

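select_dtypes and get_dummies return incorrect results when a DataFrame has duplicate column names. In the example below, select_dtypes returns an empty frame even though one col1 column holds integers, and get_dummies builds its output labels from the individual characters of the column name: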

In [6]: df
Out[6]: 
  col1 col1
0    1    a
1    2    b

In [7]: df.select_dtypes(include=['int'])
Out[7]: 
Empty DataFrame
Columns: []
Index: [0, 1]

In [8]: pd.get_dummies(df)
Out[8]: 
   col1_('c', 'o', 'l', '1')  col1_('c', 'o', 'l', '1')
0                          1                          1
1                          1                          1
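
For anyone who wants to reproduce this or needs a stopgap, here is a minimal sketch. It builds a frame like the one above and renames the repeated labels before calling either function. Note that dedupe_columns is a hypothetical helper written for this illustration, not a pandas function, and the ".N" suffix is an arbitrary naming choice. It sidesteps the duplicate labels rather than fixing the underlying behavior.

```python
import pandas as pd


def dedupe_columns(frame):
    """Hypothetical helper (not part of pandas): return a copy of `frame`
    in which the 2nd, 3rd, ... occurrence of a label gets a '.1', '.2', ...
    suffix. It does not check for clashes with labels that already exist."""
    seen = {}
    new_names = []
    for name in frame.columns:
        count = seen.get(name, 0)
        new_names.append(name if count == 0 else f"{name}.{count}")
        seen[name] = count + 1
    out = frame.copy()
    out.columns = new_names
    return out


# Build a frame with two columns that share the label 'col1'.
df = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})
df.columns = ["col1", "col1"]

# Detect the duplicates up front.
has_duplicates = not df.columns.is_unique

# Rename the duplicates, then call the two functions on unique labels.
fixed = dedupe_columns(df)
numeric = fixed.select_dtypes(include=["number"])
dummies = pd.get_dummies(fixed, columns=["col1.1"])
```

Checking df.columns.is_unique first makes it easy to raise or rename early instead of getting a silently wrong result.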

Labels: Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
