Skip to content

read_csv does not parse in header with BOM utf-8 #4793

Closed
@johnclinaa

Description

@johnclinaa

I am using Pandas version 0.12.0 on a Mac.

I noticed that when there is a BOM utf-8 file, and if the header row is in the first line, the read_csv() method will leave a leading quotation mark in the first column's name. However, if the header row is further down the file and I use the "header=" option, then the whole header row gets parsed correctly.

Here is an example code:

bing_kw = pd.read_csv('../../data/sem/Bing-Keyword_daily.csv', header=9, thousands=',', encoding='utf-8')

Parses the header correctly.

bing_kw = pd.read_csv('../../data/sem/Bing-Keyword_daily.csv', thousands=',', encoding='utf-8')

Parses the first header column name incorrectly by leaving the leading quotation mark.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions