read_csv character encoding bug? #2741

Closed
@hayd

Description

This is a weird one from StackOverflow: the file contains some \x00 (NUL) characters, which seem to be ignored when printing but confuse read_csv:

import pandas as pd
from StringIO import StringIO  # Python 2 module, matching the session below

x = 'x,y\n \x00\x00\x00,Reg\n \x00\x00\x00,Reg\nI,Swp\nI,Swp\n'
X = StringIO(x)

In [3]: pd.read_csv(X)
Out[3]: 
     x    y
0          
1  NaN  NaN
2    I  Swp
3    I  Swp

In [4]: print x
x,y
 ,Reg
 ,Reg
I,Swp
I,Swp
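A possible workaround, sketched here under assumptions rather than taken from the report: since print hides the NUL characters, repr() can be used to confirm they are present, and they can be stripped from the text before it reaches the parser. The helper name strip_nuls is hypothetical, and the sketch assumes Python 3 (io.StringIO) and that dropping the NULs is acceptable for the data. It does not address the parser behaviour itself.

```python
from io import StringIO

# Same sample data as in the report above.
x = 'x,y\n \x00\x00\x00,Reg\n \x00\x00\x00,Reg\nI,Swp\nI,Swp\n'

# repr() makes the NUL characters visible, unlike print.
second_line = x.split('\n')[1]
print(repr(second_line))


def strip_nuls(text):
    """Return text with every NUL character removed (hypothetical helper)."""
    return text.replace('\x00', '')


# Buffer without NULs; this is what would be handed to the CSV parser.
cleaned = StringIO(strip_nuls(x))
```

The cleaned buffer could then be passed to pd.read_csv(cleaned) in place of the original one. Whether the resulting frame matches what the file's author intended depends on why the NULs were there in the first place.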

Labels: Bug, IO Data
