Description
In scikit-learn, when testing against the pandas nightly build, we got a FutureWarning related to the following deprecation:
We have 2 related questions regarding this deprecation. First, it seems that we cannot reproduce the "Old Behaviour" with the latest available release:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: pd.__version__
Out[3]: '1.4.2'
In [4]: values = np.arange(4).reshape(2, 2)
   ...:
   ...: df = pd.DataFrame(values)
   ...:
   ...: ser = df[0]
In [5]: df.iloc[:, 0] = np.array([10, 11])
In [6]: ser
Out[6]:
0    10
1    11
Name: 0, dtype: int64
Is there a reason why we do not observe the behaviour shown in the documentation?
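For anyone who wants to run the same check outside IPython, here is a minimal sketch of the session above as a script. The helper name `iloc_column_set_updates_view` is hypothetical (it is not a pandas API), and the result is expected to depend on the installed pandas version, so no particular output is claimed.

```python
import numpy as np
import pandas as pd


def iloc_column_set_updates_view(new_values):
    """Return True if a Series taken before the assignment sees the new values.

    Hypothetical helper for illustration only; it mirrors the IPython
    session above step by step.
    """
    values = np.arange(4).reshape(2, 2)
    df = pd.DataFrame(values)
    ser = df[0]  # taken before the assignment
    df.iloc[:, 0] = new_values  # the assignment under discussion
    # Compare what `ser` holds now with the values that were assigned.
    return bool((ser.to_numpy() == np.asarray(new_values)).all())


print("pandas", pd.__version__)
print("ser updated in place:", iloc_column_set_updates_view(np.array([10, 11])))
```

Reporting the pandas version next to the result makes it easier to compare runs across the release and the nightly build.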
The second question (actually more of a comment to open a discussion) concerns the proposed fix.
The proposal is to use df[df.columns[i]] = newvals instead of df.iloc[:, i] = newvals.
I personally find this a bit counterintuitive, since the SettingWithCopyWarning recommends switching to df.loc[rows, cols] instead of df[cols][rows] to get the inplace behaviour.
If we consider that both approaches are intended to make an inplace change, the recommended patterns for "by position" (i.e. .iloc) and "by label" (i.e. .loc) are really different.
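To make the comparison concrete, here is a small sketch that puts the two spellings side by side on identical frames. The variable names (`df_setitem`, `df_iloc`, `ser_setitem`, `ser_iloc`) are only illustrative. Both spellings leave the frame holding the new values; whether a Series taken beforehand also sees them may differ between pandas versions, so the script prints that part rather than asserting it.

```python
import numpy as np
import pandas as pd

newvals = np.array([10, 11])
i = 0  # position of the column to overwrite

# Spelling proposed in the deprecation note: assign through the column label.
df_setitem = pd.DataFrame(np.arange(4).reshape(2, 2))
ser_setitem = df_setitem[0]
df_setitem[df_setitem.columns[i]] = newvals

# Positional spelling that triggers the FutureWarning on the nightly build.
df_iloc = pd.DataFrame(np.arange(4).reshape(2, 2))
ser_iloc = df_iloc[0]
df_iloc.iloc[:, i] = newvals

# The frames themselves end up with the new values either way.
print(df_setitem[0].tolist(), df_iloc[0].tolist())

# What the previously extracted Series hold is version dependent.
print("after df[df.columns[i]] = newvals:", ser_setitem.tolist())
print("after df.iloc[:, i] = newvals:   ", ser_iloc.tolist())
```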