DataFrame.loc[n] = dict(..) fails with some type combinations #16309

Closed
@bmcfee

Description

Code Sample, a copy-pastable example if possible

This one fails:

In [9]: d = pd.DataFrame(columns=['time', 'value'])                    
In [9]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-9-b557eb950858> in <module>()
----> 1 d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    177             key = com._apply_if_callable(key, self.obj)
    178         indexer = self._get_setitem_indexer(key)
--> 179         self._setitem_with_indexer(indexer, value)
    180 
    181     def _has_valid_type(self, k, axis):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    423                                        name=indexer)
    424 
--> 425                     self.obj._data = self.obj.append(value)._data
    426                     self.obj._maybe_update_cacher(clear=True)
    427                     return self.obj

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in append(self, other, ignore_index, verify_integrity)
   4628             other = DataFrame(other.values.reshape((1, len(other))),
   4629                               index=index,
-> 4630                               columns=combined_columns)
   4631             other = other._convert(datetime=True, timedelta=True)
   4632             if not self.columns.equals(combined_columns):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    304             else:
    305                 mgr = self._init_ndarray(data, index, columns, dtype=dtype,
--> 306                                          copy=copy)
    307         elif isinstance(data, (list, types.GeneratorType)):
    308             if isinstance(data, types.GeneratorType):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in _init_ndarray(self, values, index, columns, dtype, copy)
    481             values = maybe_infer_to_datetimelike(values)
    482 
--> 483         return create_block_manager_from_blocks([values], [columns, index])
    484 
    485     @property

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in create_block_manager_from_blocks(blocks, axes)
   4294                                      placement=slice(0, len(axes[0])))]
   4295 
-> 4296         mgr = BlockManager(blocks, axes)
   4297         mgr._consolidate_inplace()
   4298         return mgr

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
   2790                     raise AssertionError('Number of Block dimensions (%d) '
   2791                                          'must equal number of axes (%d)' %
-> 2792                                          (block.ndim, self.ndim))
   2793 
   2794         if do_integrity_check:

AssertionError: Number of Block dimensions (1) must equal number of axes (2)

But this one succeeds:

In [11]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value=5)

In [12]: d
Out[12]: 
      time value
0 00:00:05     5

This one also succeeds:

In [13]: d = pd.DataFrame(columns=['time', 'value'])

In [14]: d.loc[0] = dict(time=3, value='foo')

In [15]: d
Out[15]: 
  time value
0    3   foo

Problem description

The current behavior is a problem because it is inconsistent: whether the assignment succeeds depends on the combination of value types in the dict. Mixing a timedelta with a str raises an AssertionError, but a timedelta with an int works, as does an int with a str.

I believe this is related to aggressive type inference previously noted in #13829.
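The three cases above can be checked side by side with a small script. This is only a sketch: the helper name `try_dict_assignment` is made up for illustration, and unlike the session above it starts from a fresh empty frame for every case. The outcome will depend on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def try_dict_assignment(row):
    """Assign `row` (a dict) to label 0 of an empty two-column frame.

    Returns None on success, or the exception class name on failure.
    (Hypothetical helper, not part of pandas.)
    """
    d = pd.DataFrame(columns=['time', 'value'])
    try:
        d.loc[0] = row
    except Exception as exc:  # record any failure instead of stopping
        return type(exc).__name__
    return None


# The three type combinations from the report.
cases = {
    'timedelta + str': dict(time=pd.to_timedelta(5, unit='s'), value='foo'),
    'timedelta + int': dict(time=pd.to_timedelta(5, unit='s'), value=5),
    'int + str': dict(time=3, value='foo'),
}

results = {name: try_dict_assignment(row) for name, row in cases.items()}
for name, outcome in results.items():
    print(name, '->', outcome or 'ok')
```

On the pandas 0.20.1 install described in this report, the first case would be expected to report `AssertionError` and the other two `ok`.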

Expected Output

Not crashing.
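Until the `.loc` path handles this, one possible way around it (an assumption on my part, not something confirmed for every pandas version) is to skip setting-with-enlargement entirely: collect the rows as dicts and build the frame in a single constructor call.

```python
import pandas as pd

# Collect rows as plain dicts instead of assigning them one by one via .loc.
rows = [
    dict(time=pd.to_timedelta(5, unit='s'), value='foo'),
]

# Build the frame in one step; each column's dtype is inferred independently.
d = pd.DataFrame(rows, columns=['time', 'value'])
print(d)
```

This goes through the `DataFrame` constructor rather than `DataFrame.append`, which is where the traceback above fails.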

Output of pd.show_versions()

In [16]: pd.show_versions()
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/xarray/core/formatting.py:16: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version.
  from pandas.tslib import OutOfBoundsDatetime

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-77-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.0
feather: None
matplotlib: 2.0.1
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: 0.1.0
pandas_gbq: None
pandas_datareader: None


Labels

Dtype Conversions (Unexpected or buggy dtype conversions), Indexing (Related to indexing on series/frames, not to indexes themselves), Needs Tests (Unit test(s) needed to prevent regressions), good first issue
