BUG: unsuccessful value assignment of data frame with an n*2 list when using pandas loc  #45769

Open
@jialusui

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# Build a test DataFrame from placeholder data
data = [0, 0, 0, 0, 0, 0]
df_test = pd.DataFrame(data, columns=['test'])
df_test['index'] = ['a', 'a', 'a', 'b', 'c', 'c']
df_test['day'] = [1, 1, 2, 3, 3, 4]
df_test = df_test.set_index(['index', 'day'])

# Fails: assigning an n*2 list to the 'test' column of a loc subset
df_test.loc[('a', 1), 'test'] = [['test1', 'test11'], ['test2', 'test22']]
print(df_test)

# Succeeds: the same assignment when the list is not strictly n*2
df_test.loc[('a', 1), 'test'] = [['test1', 'test11'], ['test2']]
print(df_test)

Issue Description

The issue arises when I try to assign an n*2 list to one column of a loc subset of a pandas DataFrame. I expect the assignment to succeed, and I expect df_test.loc[('a',1)] to be the same as test_copy in the Expected Behavior section below.

The error message is 'Must have equal len keys and value when setting with an ndarray', yet the subset and the list used in the assignment both have length 2 (this is easy to verify, since the DataFrame is small).

If the list is not strictly n*2, the error does not occur.

I suspect this is an internal bug in pandas (or specifically in pandas loc), since the two cases behave inconsistently and the failure is not expected.
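One possible reading of the error, offered as an assumption and not confirmed anywhere in this report, is that the right-hand side is converted to a NumPy array before the assignment. The sketch below uses NumPy alone to show that a rectangular list of lists becomes a 2-D array while a ragged one becomes a 1-D array of list objects. The names rectangular and ragged are illustrative stand-ins for the two values in the example, and whether pandas performs this exact conversion internally is not verified here.

```python
import numpy as np

# Illustrative stand-ins for the two right-hand sides used in the report.
rectangular = [['test1', 'test11'], ['test2', 'test22']]
ragged = [['test1', 'test11'], ['test2']]

# dtype=object is passed explicitly because recent NumPy versions refuse
# to build an array from ragged input without it.
rect_arr = np.array(rectangular, dtype=object)
ragged_arr = np.array(ragged, dtype=object)

print(rect_arr.shape)    # (2, 2): two rows and two columns
print(ragged_arr.shape)  # (2,): two elements, each a Python list
```

If that assumption holds, the strict n*2 list would look like two columns of values for a single target column, which would be consistent with the "equal len keys and value" message, whereas the ragged list would look like two values for two rows.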

Expected Behavior

# Take an explicit copy so the assignment below does not touch df_test
test_copy = df_test.loc[('a', 1)].copy()
test_copy['test'] = [['test1', 'test11'], ['test2', 'test22']]
test_copy

Installed Versions

INSTALLED VERSIONS

commit : 66e3805
python : 3.7.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.144+
Version : #1 SMP Tue Dec 7 09:58:10 PST 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.5
numpy : 1.19.5
pytz : 2018.9
dateutil : 2.8.2
pip : 21.1.3
setuptools : 57.4.0
Cython : 0.29.26
pytest : 3.6.4
hypothesis : None
sphinx : 1.8.6
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.2.6
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 5.5.0
pandas_datareader: 0.9.0
bs4 : 4.6.3
bottleneck : 1.3.2
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.8.1
odfpy : None
openpyxl : 2.5.9
pandas_gbq : 0.13.3
pyarrow : 3.0.0
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.4.31
tables : 3.4.4
tabulate : 0.8.9
xarray : 0.18.2
xlrd : 1.1.0
xlwt : 1.3.0
numba : 0.51.2

Labels

  • Bug

  • Indexing: Related to indexing on series/frames, not to indexes themselves

  • MultiIndex

  • Nested Data: Data where the values are collections (lists, sets, dicts, objects, etc.).
