na_filter=False ignored when index_col set #5239

Closed
@cancan101

Given the following CSV file:

u1,u2,u3,d1,d2,d3,d4
Good Things,C,,1,1,1,1
Good Things,R,,1,1,1,1
Bad Things,C,,1,1,1,1
Bad Things,T,,1,1,1,1
Okay Things,N,B,1,1,1,1
Okay Things,N,D,1,1,1,1
Okay Things,B,,1,1,1,1
Okay Things,D,,1,1,1,1

First I parse with na_filter=True:

In [13]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=True)
Out[13]: 
            u1 u2   u3  d1  d2  d3  d4
0  Good Things  C  NaN   1   1   1   1
1  Good Things  R  NaN   1   1   1   1
2   Bad Things  C  NaN   1   1   1   1
3   Bad Things  T  NaN   1   1   1   1
4  Okay Things  N    B   1   1   1   1
5  Okay Things  N    D   1   1   1   1
6  Okay Things  B  NaN   1   1   1   1
7  Okay Things  D  NaN   1   1   1   1

Then I parse with na_filter=False, and the empty u3 fields are kept as empty strings:

In [12]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=False)
Out[12]: 
            u1 u2 u3  d1  d2  d3  d4
0  Good Things  C      1   1   1   1
1  Good Things  R      1   1   1   1
2   Bad Things  C      1   1   1   1
3   Bad Things  T      1   1   1   1
4  Okay Things  N  B   1   1   1   1
5  Okay Things  N  D   1   1   1   1
6  Okay Things  B      1   1   1   1
7  Okay Things  D      1   1   1   1
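The difference between the two calls above can be checked without a file on disk. This is a minimal sketch, assuming the same rows are held in an in-memory string; `CSV_TEXT`, `with_filter`, and `without_filter` are illustrative names, not part of the original report.

```python
import io

import pandas as pd

# Same rows as the CSV in this report, kept in memory so no file path is needed.
CSV_TEXT = """u1,u2,u3,d1,d2,d3,d4
Good Things,C,,1,1,1,1
Good Things,R,,1,1,1,1
Bad Things,C,,1,1,1,1
Bad Things,T,,1,1,1,1
Okay Things,N,B,1,1,1,1
Okay Things,N,D,1,1,1,1
Okay Things,B,,1,1,1,1
Okay Things,D,,1,1,1,1
"""

with_filter = pd.read_csv(io.StringIO(CSV_TEXT), na_filter=True)
without_filter = pd.read_csv(io.StringIO(CSV_TEXT), na_filter=False)

# Six of the eight rows have an empty u3 field.
nan_count = int(with_filter["u3"].isna().sum())           # empty fields read as NaN
empty_count = int((without_filter["u3"] == "").sum())     # empty fields kept as ""
print(nan_count, empty_count)
```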

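Then I set index_col as well. The empty u3 values come back as NaN, so na_filter=False appears to be ignored for the index columns: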

In [11]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=False,index_col=[0,1,2],)
Out[11]: 
                    d1  d2  d3  d4
u1          u2 u3                 
Good Things C  NaN   1   1   1   1
            R  NaN   1   1   1   1
Bad Things  C  NaN   1   1   1   1
            T  NaN   1   1   1   1
Okay Things N  B     1   1   1   1
               D     1   1   1   1
            B  NaN   1   1   1   1
            D  NaN   1   1   1   1

Finally, also passing na_values=[] and keep_default_na=False seems to work around the issue:

In [14]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=False,index_col=[0,1,2],na_values=[], keep_default_na=False)
Out[14]: 
                   d1  d2  d3  d4
u1          u2 u3                
Good Things C       1   1   1   1
            R       1   1   1   1
Bad Things  C       1   1   1   1
            T       1   1   1   1
Okay Things N  B    1   1   1   1
               D    1   1   1   1
            B       1   1   1   1
            D       1   1   1   1
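The index case can be reduced to a count of NaN entries in the u3 index level. This is a sketch only: `CSV_TEXT` and `count_missing_in_u3` are hypothetical names introduced here, and the data is read from an in-memory string rather than the file path used above. The result of the first call depends on the pandas version, so no expected value is given for it.

```python
import io

import pandas as pd

# Same rows as the CSV in this report, kept in memory so no file path is needed.
CSV_TEXT = """u1,u2,u3,d1,d2,d3,d4
Good Things,C,,1,1,1,1
Good Things,R,,1,1,1,1
Bad Things,C,,1,1,1,1
Bad Things,T,,1,1,1,1
Okay Things,N,B,1,1,1,1
Okay Things,N,D,1,1,1,1
Okay Things,B,,1,1,1,1
Okay Things,D,,1,1,1,1
"""


def count_missing_in_u3(**kwargs):
    """Read with u1, u2, u3 as the index and count NaN entries in the u3 level."""
    df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=[0, 1, 2], **kwargs)
    return int(df.index.get_level_values("u3").isna().sum())


# The call from this report: a non-zero count means na_filter=False was ignored.
reported = count_missing_in_u3(na_filter=False)

# The workaround from this report: no value is treated as missing at all.
workaround = count_missing_in_u3(
    na_filter=False, na_values=[], keep_default_na=False
)

print("na_filter=False only:", reported)
print("with workaround:", workaround)
```

A non-zero first count reproduces the behaviour described in this issue; a zero count suggests the installed version already honours na_filter for index columns.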

Labels: Bug, IO CSV, Missing-data
