Description
There is a problem with assigning Python lists to a boolean-indexed Series:
```python
from math import nan

import pandas as pd

s1 = pd.Series([nan, nan, "c"])
s1[s1.isnull()] = ["a", "b"]

s2 = pd.Series([nan, "b", nan])
s2[s2.isnull()] = ["a", "c"]

# s1: "a", "b", "c"
# s2: "a", "b", "a"
```
`s1` shows the correct result, whereas `s2` is obviously wrong: its last element ends up as `"a"` instead of `"c"`. In pandas 0.19.2 the result is correct, but in 0.21.0 and 0.23.0.dev0+97.g24c07b075 I get the behavior above. It can be worked around by assigning a NumPy array instead of a Python list:
```python
from math import nan

import numpy as np
import pandas as pd

s1 = pd.Series([nan, nan, "c"])
s1[s1.isnull()] = np.array(["a", "b"])

s2 = pd.Series([nan, "b", nan])
s2[s2.isnull()] = np.array(["a", "c"])

# s1: "a", "b", "c"
# s2: "a", "b", "c"
```
There probably should be a regression test to catch this...
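A test along these lines could catch it; this is only a sketch using `pandas.testing.assert_series_equal`, and the test name is made up:

```python
import numpy as np
import pandas as pd
import pandas.testing as tm


def test_setitem_list_on_boolean_mask():
    # Assigning a plain Python list through a boolean mask should
    # behave exactly like assigning the equivalent NumPy array.
    s = pd.Series([np.nan, "b", np.nan])
    s[s.isnull()] = ["a", "c"]
    tm.assert_series_equal(s, pd.Series(["a", "b", "c"]))

    # The all-leading-NaN case from the report should keep working too.
    s = pd.Series([np.nan, np.nan, "c"])
    s[s.isnull()] = ["a", "b"]
    tm.assert_series_equal(s, pd.Series(["a", "b", "c"]))
```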
Output of `pd.show_versions()`
```
INSTALLED VERSIONS
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US
LOCALE: None.None
pandas: 0.23.0.dev0+97.g24c07b075
pytest: 3.2.3
pip: 9.0.1
setuptools: 38.2.4
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
```