SUB-character in a csv causes read_csv() with C-Engine to detect EOF #16893

Closed
@Khris777

Problem description

If a string field in a CSV file contains a SUB character (ASCII 0x1A), read_csv() with the default C engine raises

ParserError: Error tokenizing data. C error: EOF inside string starting at line 0

The Python engine reads the same file without error.

I can't seem to include example data containing a SUB character in this issue, so I put an example line on Pastebin instead:
https://pastebin.com/x6QPY4Hf
Paste that line into a CSV file and try to read it with read_csv().
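
For anyone who would rather not copy the line by hand, here is a minimal sketch of a reproduction script. The file contents are made up for illustration and are not the Pastebin line; the helper names `write_repro_csv` and `try_engines` are hypothetical. It only assumes that the problem character is SUB (ASCII 0x1A) inside a quoted field, as described above. Whether the C engine actually fails will likely depend on the pandas version and platform.

```python
import os
import tempfile

SUB = "\x1a"  # ASCII 26 (Ctrl-Z), the SUB control character


def write_repro_csv(path):
    """Write a small CSV whose quoted string field contains a SUB character.

    The row content is illustrative only, not the original Pastebin line.
    """
    # Binary mode, so no platform-specific newline translation alters the bytes.
    with open(path, "wb") as f:
        f.write(b"id,text\n")
        f.write(('1,"before' + SUB + 'after"\n').encode("ascii"))


def try_engines(path):
    """Try read_csv with each engine; return {engine: shape or error text}."""
    # Imported here so that writing the file does not require pandas.
    import pandas as pd

    results = {}
    for engine in ("c", "python"):
        try:
            results[engine] = pd.read_csv(path, engine=engine).shape
        except Exception as exc:  # e.g. the ParserError reported above
            results[engine] = f"{type(exc).__name__}: {exc}"
    return results
```

Calling `write_repro_csv(path)` on a temporary path and then `print(try_engines(path))` shows, per engine, either the shape of the parsed frame or the error that was raised.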

I don't know whether this behaviour is expected, since this character is indeed used as an EOF marker in certain cases. However, I see little sense in interpreting a SUB character as EOF in the middle of a CSV file.

commit: None

python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64

pandas: 0.20.2

Labels: IO CSV (read_csv, to_csv); Testing (pandas testing functions or related to the test suite)