QST: Misleading SettingWithCopyWarning when assigning on result of DataFrame.dropna? #39448

Closed
@astrojuanlu

Description

  • I have searched the [pandas] tag on StackOverflow for similar questions.

  • I have asked my usage related question on StackOverflow.


In certain cases, assigning to a DataFrame created with .dropna() emits the infamous SettingWithCopyWarning. I followed what this SO answer suggests and inspected the ._is_view and ._is_copy attributes:

In [1]: import pandas as pd                                                                                                                                                                                                                   

In [2]: import numpy as np                                                                                                                                                                                                                    

In [3]: df1 = pd.DataFrame([[1.0, np.nan, 2, np.nan], 
   ...:                    [2.0, 3.0, 5, np.nan], 
   ...:                    [np.nan, 4.0, 6, np.nan]], columns=list("ABCD")); df1                                                                                                                                                              
Out[3]: 
     A    B  C   D
0  1.0  NaN  2 NaN
1  2.0  3.0  5 NaN
2  NaN  4.0  6 NaN

In [4]: df2 = df1.dropna(axis="columns", how="all")   

In [5]: df2._is_view, df2._is_copy                                                                                                                                                                                                            
Out[5]: (False, <weakref at 0x7f56e8ecfd60; to 'DataFrame' at 0x7f56e916f9d0>)

In [6]: df2["A"] = df2["A"].fillna(df2["B"])                                                                                                                                                                                                  
<ipython-input-6-9c99ac454f90>:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df2["A"] = df2["A"].fillna(df2["B"])                                                                                                                                                                                        

However, _is_copy is set to None if you explicitly call .copy() (misleading attribute name?), and doing so also removes the warning:

In [17]: df3 = df1.dropna(axis="columns", how="all").copy()                                                                                                                                                                                   

In [18]: df3._is_view, df3._is_copy                                                                                                                                                                                                           
Out[18]: (False, None)

In [19]: df3["A"] = df3["A"].fillna(df3["B"])                                                                                                                                                                                                 

In [20]: 
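
The session above can also be checked as a script rather than by watching the prompt. This is only a minimal sketch: it reuses the same data, takes the explicit .copy(), and records warnings with the standard warnings module. The warning is matched by class name because where the class lives differs between pandas versions, which is an assumption of this sketch, not something taken from the pandas docs.

```python
import warnings

import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [[1.0, np.nan, 2, np.nan],
     [2.0, 3.0, 5, np.nan],
     [np.nan, 4.0, 6, np.nan]],
    columns=list("ABCD"),
)

# Explicit copy: the new frame owns its data and is not tied to df1.
df3 = df1.dropna(axis="columns", how="all").copy()

# Record every warning raised during the assignment instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df3["A"] = df3["A"].fillna(df3["B"])

# Match by class name (assumption: the import path varies across versions).
swc = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]

print(df3["A"].tolist())
print("SettingWithCopyWarning count:", len(swc))
```

Dropping the .copy() call in the same script is a quick way to see whether a given pandas installation still emits the warning for the plain .dropna() result.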

But, finally, when .dropna() has nothing to drop, its result does have _is_copy set to None:

In [20]: df1 = pd.DataFrame([[1.0, np.nan], [2.0, 3.0]], columns=["A", "B"]); df1                                                                                                                                                             
Out[20]: 
     A    B
0  1.0  NaN
1  2.0  3.0

In [21]: df2 = df1.dropna(axis="columns", how="all")                                                                                                                                                                                          

In [22]: df2._is_view, df2._is_copy                                                                                                                                                                                                           
Out[22]: (False, None)

In [23]: df2["A"] = df2["A"].fillna(df2["B"])                                                                                                                                                                                                 

In [24]:  
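
To compare the three cases side by side, the attribute inspection can be wrapped in a small helper. This is a rough sketch: copy_flags is a hypothetical name, and since _is_view and _is_copy are private attributes that may be absent or behave differently in other pandas versions, the helper reads them with getattr and a default. No expected output is shown because the flags depend on the installed version.

```python
import numpy as np
import pandas as pd


def copy_flags(df):
    """Return (_is_view, whether _is_copy is set) for a frame.

    Hypothetical helper: both attributes are private, so they are read
    defensively and treated as None when missing.
    """
    is_view = getattr(df, "_is_view", None)
    is_copy = getattr(df, "_is_copy", None)
    return is_view, is_copy is not None


# Frame with an all-NaN column ("D"), so dropna actually drops something.
wide = pd.DataFrame(
    [[1.0, np.nan, 2, np.nan],
     [2.0, 3.0, 5, np.nan],
     [np.nan, 4.0, 6, np.nan]],
    columns=list("ABCD"),
)
# Frame without an all-NaN column, so dropna drops nothing.
narrow = pd.DataFrame([[1.0, np.nan], [2.0, 3.0]], columns=["A", "B"])

cases = {
    "dropna (column dropped)": wide.dropna(axis="columns", how="all"),
    "dropna + copy": wide.dropna(axis="columns", how="all").copy(),
    "dropna (nothing dropped)": narrow.dropna(axis="columns", how="all"),
}

for label, frame in cases.items():
    print(label, copy_flags(frame))
```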

Can anybody explain what the SettingWithCopyWarning means in this context, and why I still get it even though _is_view is False in all cases?

I am particularly worried about this, in light of what the pandas docs say:

Sometimes a SettingWithCopy warning will arise at times when there’s no obvious chained indexing going on. These are the bugs that SettingWithCopy is designed to catch!

Originally asked at https://stackoverflow.com/q/65926892/554319
