Skip to content

BUG: concat on axis with both different and duplicate labels raising error #6963

Closed
@jorisvandenbossche

Description

@jorisvandenbossche

When concatting two dataframes where there are a) there are duplicate columns in one of the dataframes, and b) there are non-overlapping column names in both, then you get a IndexError:

In [9]: df1 = pd.DataFrame(np.random.randn(3,3), columns=['A', 'A', 'B1'])
   ...: df2 = pd.DataFrame(np.random.randn(3,3), columns=['A', 'A', 'B2'])

In [10]: pd.concat([df1, df2])

Traceback (most recent call last):
  File "<ipython-input-10-f61a1ab4009e>", line 1, in <module>
    pd.concat([df1, df2])
...
  File "c:\users\vdbosscj\scipy\pandas-joris\pandas\core\index.py", line 765, in take
    taken = self.view(np.ndarray).take(indexer)
IndexError: index 3 is out of bounds for axis 0 with size 3

I don't know if it should work (although I suppose it should, as with only the duplicate columns it does work), but at least the error message is not really helpfull.

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugReshapingConcat, Merge/Join, Stack/Unstack, Explode

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions