Closed
Description
When concatting two dataframes where there are a) there are duplicate columns in one of the dataframes, and b) there are non-overlapping column names in both, then you get a IndexError:
In [9]: df1 = pd.DataFrame(np.random.randn(3,3), columns=['A', 'A', 'B1'])
...: df2 = pd.DataFrame(np.random.randn(3,3), columns=['A', 'A', 'B2'])
In [10]: pd.concat([df1, df2])
Traceback (most recent call last):
File "<ipython-input-10-f61a1ab4009e>", line 1, in <module>
pd.concat([df1, df2])
...
File "c:\users\vdbosscj\scipy\pandas-joris\pandas\core\index.py", line 765, in take
taken = self.view(np.ndarray).take(indexer)
IndexError: index 3 is out of bounds for axis 0 with size 3
I don't know if it should work (although I suppose it should, as with only the duplicate columns it does work), but at least the error message is not really helpfull.