Description
When using .loc to expand a column, the dataframe and the constituent Series can get out of sync. This led to a strange issue in some legacy code I inherited. Because of the copy/view issues I wouldn't have written this particular code myself, but it still cost me some time tracking down what was happening:
import pandas as pd
df = pd.DataFrame({"a": [10, 20, 30]})
df["a"].loc[4] = 40  # chained access: enlarges the Series returned by df["a"], not df
gives me
In [224]: df
Out[224]:
    a
0  10
1  20
2  30
In [225]: df["a"]
Out[225]:
0    10
1    20
2    30
4    40
Name: a, dtype: int64
In [226]: df.shape, df["a"].shape
Out[226]: ((3, 1), (4,))
which is a little unnerving. I expected that df and df["a"] would either both be unchanged or both be changed. I'm not sure whether this is too much trouble to fix, in which case we should just say "don't move your arm like that".
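For comparison, here is a minimal sketch of the non-chained form, which is not part of the original report. It assumes the intent was to add a row labelled 4 to the frame itself, and it relies on setting with enlargement through DataFrame.loc. The in_sync name is only an illustrative variable.

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30]})

# Enlarge through the DataFrame's own .loc (row label, column label)
# instead of going through the intermediate Series df["a"].
df.loc[4, "a"] = 40

# Check that the frame and its column report the same index.
in_sync = df["a"].index.equals(df.index)
print(df.shape, df["a"].shape, in_sync)
```

Because the assignment goes through the frame rather than through an intermediate Series, the frame and the column stay in agreement. The column's dtype after enlargement may differ between pandas versions, so the sketch does not rely on it.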
@jorisvandenbossche reported the same thing in master; here I'm on 0.18.0.
INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.18.0
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0