Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
df = pd.DataFrame()
# Option 1: works on empty dataframes (adding an empty column)
# but shows the SettingWithCopyWarning for non-empty views/copies
df["a"] = 1
# Option 2: works without warning for views/copies but raises ValueError on empty dataframe
df.loc[:, "a"] = 1
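To check the two options side by side on a given pandas version, a small harness along these lines may help. It is only a sketch: try_assignment and the "filtered" case are names made up for this illustration, and what it prints depends on the installed pandas version, so no expected output is shown.

```python
import warnings

import pandas as pd


def try_assignment(df, use_loc):
    """Add column "a" and report (error name or None, list of warning names).

    Hypothetical helper for this report, not part of pandas.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning instead of hiding repeats
        try:
            if use_loc:
                df.loc[:, "a"] = 1  # Option 2
            else:
                df["a"] = 1  # Option 1
            error = None
        except ValueError as exc:
            error = type(exc).__name__
    return error, [w.category.__name__ for w in caught]


base = pd.DataFrame({"x": [1, 2, 3]})
cases = {
    "empty": lambda: pd.DataFrame(),
    "filtered": lambda: base[base["x"] > 1],  # result of boolean indexing
}

results = {}
for name, make in cases.items():
    for use_loc in (False, True):
        results[(name, use_loc)] = try_assignment(make(), use_loc)
        print(name, "loc" if use_loc else "setitem", results[(name, use_loc)])
```

Each combination gets a freshly built dataframe, so one assignment cannot affect the next.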
Problem description
Consider a function add_column that adds a column to a dataframe.
- If we use df[column] = value (Option 1), the function emits a SettingWithCopyWarning whenever it is called on a copy/view, even if we don't care about propagating the change to the original dataframe.
- The recommended workaround for this warning is to use df.loc[:, column] = value (Option 2). However, this raises a ValueError as soon as the dataframe is empty, i.e. doesn't contain any rows.
This then requires ugly workarounds like the following whenever we might be dealing with dataframes, or copies/views of them, that are possibly empty:
    def add_column(df):
        if df.empty:
            # Still want to add the column to avoid KeyErrors later
            df["column"] = 1  # doesn't show SettingWithCopyWarning
            return
        df.loc[:, "column"] = 1
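For comparison, one way to sidestep both problems is to not mutate the input at all and return a new dataframe via DataFrame.assign. This is only a sketch of an alternative, not a fix for the behaviour reported here. It changes the contract of add_column (callers must use the return value), and the column and value parameters are assumptions added for illustration.

```python
import warnings

import pandas as pd


def add_column(df, column="column", value=1):
    """Return a new dataframe with `column` set to `value`; `df` is left untouched."""
    # assign() operates on a copy of df, so no chained-assignment check is involved
    return df.assign(**{column: value})


empty = pd.DataFrame()
base = pd.DataFrame({"x": [1, 2, 3]})
subset = base[base["x"] > 1]  # result of boolean indexing

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning raised in this block
    empty_out = add_column(empty)
    subset_out = add_column(subset)

copy_warnings = [w for w in caught if "SettingWithCopy" in w.category.__name__]
print(list(empty_out.columns), len(empty_out), len(copy_warnings))
```

The trade-off is an extra copy per call, which may matter for large dataframes.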
INSTALLED VERSIONS
commit : 2cb9652
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-64-generic
Version : #58-Ubuntu SMP Fri Jul 10 19:33:51 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.4
numpy : 1.18.1
pytz : 2019.2
dateutil : 2.7.3
pip : 20.3.3
setuptools : 41.1.0
Cython : None
pytest : 5.3.2
hypothesis : None
sphinx : 3.0.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 5.8.0
pandas_datareader: None
bs4 : 4.9.1
bottleneck : None
fsspec : 0.7.4
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 3.0.0
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.16
tables : 3.6.1
tabulate : 0.8.7
xarray : 0.17.0
xlrd : 2.0.1
xlwt : None
numba : 0.51.2