BUG: df fails when columns arg is a list containing dupes #2079

Closed
@ghost

Description

In [1]: DataFrame(data,columns=["a","a"])

...
pandas/pandas/core/internals.pyc in _stack_dict(dct, ref_items, dtype)
1344 stacked = np.empty(shape, dtype=dtype)
1345 for i, item in enumerate(items):
-> 1346 stacked[i] = _asarray_compat(dct[item])
1347
1348 # stacked = np.vstack([_asarray_compat(dct[k]) for k in items])

IndexError: index out of bounds

5e6db32 is a failing test for this.
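
For reference, a self-contained way to hit this. The report doesn't show what `data` is, so a list of dicts is assumed here as one kind of input that takes the _to_sdict path:

    from pandas import DataFrame

    # `data` is an assumption -- any input routed through _to_sdict /
    # _convert_object_array should do; a list of dicts is used here.
    data = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]

    # Duplicate names in `columns` get squashed internally and the
    # constructor blows up with the IndexError shown in the traceback above.
    DataFrame(data, columns=["a", "a"])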

It looks like _to_sdict threads down to a call to _convert_object_array, which builds a dict
keyed on column names, so duplicate columns get squashed and you end up with a mismatch
between the length of the columns argument to DataFrame.__init__ and the data.
_to_sdict isn't used for ndarrays, so this doesn't happen there; I was able to reuse
_init_ndarray for the case where columns is a flat list and have things work as expected.
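
A minimal sketch of the squashing in plain Python/NumPy (not the actual internals; the exact shape computation in _stack_dict may differ, but the mismatch is of this form):

    import numpy as np

    columns = ["a", "a"]            # what was passed to DataFrame.__init__
    column_data = [[1, 2], [3, 4]]  # two columns' worth of data

    # Keying a dict on the column names collapses the duplicate "a":
    dct = dict(zip(columns, column_data))   # {"a": [3, 4]} -- only one entry left

    # The stacked array ends up sized from the squashed data (1 row) while
    # the requested columns still list 2 items, so the loop overruns it:
    stacked = np.empty((len(dct), 2))
    for i, item in enumerate(columns):
        stacked[i] = np.asarray(dct[item])  # i == 1 -> IndexError: index out of bounds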

Still, too much code touches this path; better left to the core devs to decide how to handle it.
