first line comments on a read_csv #4623

Closed
@hayd

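Related: #4505

Handling of a comment on the first line seems a little buggy (or perhaps not well-defined). A commented first line is parsed as the header row, and comment-only lines in the body come through as all-NaN rows instead of being skipped: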

In [11]: s1 = '# notes\na,b,c\n# more notes\n1,2,3'

In [12]: s2 = 'a,b,c\n# more notes\n1,2,3'

In [13]: pd.read_csv(StringIO(s1), comment='#')
Out[13]: 
        Unnamed: 0
a   b            c
NaN NaN        NaN
1   2            3

In [14]: pd.read_csv(StringIO(s2), comment='#')
Out[14]: 
    a   b   c
0 NaN NaN NaN
1   1   2   3

If you pass header=None, tokenizing fails outright:

In [15]: pd.read_csv(StringIO(s1), comment='#', header=None)
CParserError: Error tokenizing data. C error: Expected 1 fields in line 2, saw 3
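
A possible workaround in the meantime is to drop whole-line comments before the text reaches the parser. This is only a sketch: `read_csv_skipping_comment_lines` is a hypothetical helper (not part of pandas), and it assumes a comment line is any line whose first non-blank character is the comment marker. It does not handle trailing inline comments, and it would wrongly drop a data line that legitimately starts with `#`.

```python
from io import StringIO

import pandas as pd


def read_csv_skipping_comment_lines(text, comment="#", **kwargs):
    """Hypothetical helper: remove whole-line comments, then call read_csv.

    Assumption: a line is a comment if its first non-blank character
    is `comment`. Everything else is passed through unchanged.
    """
    kept = [
        line
        for line in text.splitlines()
        if not line.lstrip().startswith(comment)
    ]
    # Extra keyword arguments (e.g. header=None) go straight to read_csv.
    return pd.read_csv(StringIO("\n".join(kept)), **kwargs)


s1 = "# notes\na,b,c\n# more notes\n1,2,3"
df = read_csv_skipping_comment_lines(s1)
print(df)
```

With the comment lines filtered out first, `a,b,c` is the first line the parser sees, so it becomes the header and `1,2,3` is the only data row.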

Related: #3001. This came up in a Stack Overflow question.

Labels: Bug; IO CSV (read_csv, to_csv); IO Data (IO issues that don't fit into a more specific label)
