BUG: Assigning values in SparseDataFrame with duplicate columns fails #14427

Closed
@bkandel

Description

As discussed in #14384 (comment).

A small, complete example of the issue

import pandas as pd
df1 = pd.DataFrame({'a': [1, 2, 3]})
df2 = pd.DataFrame({'b': [2, 3, 4]})
df = pd.concat([df1, df1, df2], axis=1).to_sparse()
df.index = [1, 2, 3]
df.loc[1, 'a'] = 3

This fails with:

AttributeError                            Traceback (most recent call last)
<ipython-input-6-0a670748626a> in <module>()
      4 df = pd.concat([df1, df1, df2], axis=1).to_sparse()
      5 df.index = [1, 2, 3]
----> 6 df.loc[1, 'a'] = 3

/Users/bkandel/.virtualenvs/pandas_19/lib/python2.7/site-packages/pandas/core/indexing.pyc in __setitem__(self, key, value)
    138             key = com._apply_if_callable(key, self.obj)
    139         indexer = self._get_setitem_indexer(key)
--> 140         self._setitem_with_indexer(indexer, value)
    141 
    142     def _has_valid_type(self, k, axis):

/Users/bkandel/.virtualenvs/pandas_19/lib/python2.7/site-packages/pandas/core/indexing.pyc in _setitem_with_indexer(self, indexer, value)
    545                 # scalar
    546                 for item in labels:
--> 547                     setter(item, value)
    548 
    549         else:

/Users/bkandel/.virtualenvs/pandas_19/lib/python2.7/site-packages/pandas/core/indexing.pyc in setter(item, v)
    453 
    454             def setter(item, v):
--> 455                 s = self.obj[item]
    456                 pi = plane_indexer[0] if lplane_indexer == 1 else plane_indexer
    457 

/Users/bkandel/.virtualenvs/pandas_19/lib/python2.7/site-packages/pandas/sparse/frame.pyc in __getitem__(self, key)
    345             return self._getitem_array(key)
    346         else:
--> 347             return self._get_item_cache(key)
    348 
    349     @Appender(DataFrame.get_value.__doc__, indents=0)

/Users/bkandel/.virtualenvs/pandas_19/lib/python2.7/site-packages/pandas/core/generic.pyc in _get_item_cache(self, item)
   1385         if res is None:
   1386             values = self._data.get(item)
-> 1387             res = self._box_item_values(item, values)
   1388             cache[item] = res
   1389             res._set_as_cached(item, self)

/Users/bkandel/.virtualenvs/pandas_19/lib/python2.7/site-packages/pandas/core/frame.pyc in _box_item_values(self, key, values)
   2392         items = self.columns[self.columns.get_loc(key)]
   2393         if values.ndim == 2:
-> 2394             return self._constructor(values.T, columns=items, index=self.index)
   2395         else:
   2396             return self._box_col_values(values, items)

AttributeError: 'BlockManager' object has no attribute 'T'
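
The last frame of the traceback sits in the `values.ndim == 2` branch of `_box_item_values`, which is taken when a label matches more than one column. The following is a minimal sketch of that behaviour on a dense frame. It assumes a recent pandas, deliberately skips `to_sparse()`, and does not reproduce the sparse internals. The variable names are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3]})
df2 = pd.DataFrame({'b': [2, 3, 4]})

# Dense frame with the same duplicated column label as in the report.
dense = pd.concat([df1, df1, df2], axis=1)

# Selecting a duplicated label returns every matching column,
# so the result is a 2-D DataFrame rather than a 1-D Series.
selected_dup = dense['a']
selected_unique = dense['b']

print(type(selected_dup).__name__, selected_dup.ndim)
print(type(selected_unique).__name__, selected_unique.ndim)

# Marks the second 'a' as a repeat of an earlier label.
print(dense.columns.duplicated())
```

Because the setter in `_setitem_with_indexer` calls `self.obj[item]` for each label, a duplicated label sends the sparse frame down this 2-D path, which is where the `BlockManager` ends up being treated like an array.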

Expected Output

   a  a  b
1  3  3  2
2  2  2  3
3  3  3  4
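
For comparison, here is a sketch of the same assignment on the dense frame, without the `to_sparse()` call. It assumes that dense `.loc` assignment with a duplicated label writes the scalar to every matching column, which is the behaviour the expected output above describes.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3]})
df2 = pd.DataFrame({'b': [2, 3, 4]})

# Same construction as in the report, but kept dense.
dense = pd.concat([df1, df1, df2], axis=1)
dense.index = [1, 2, 3]

# Scalar assignment addressed by row label and duplicated column label.
dense.loc[1, 'a'] = 3

print(dense)
```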

Output of pd.show_versions()

INSTALLED VERSIONS
------------------

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 28.3.0
Cython: None
numpy: 1.11.2
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

Labels

Bug, Indexing (related to indexing on series/frames, not to indexes themselves), Sparse (sparse data type)