Python 3 writing to_csv file ignores encoding argument. #13068

Closed
@graingert

Description

# BOM is missing: the file is written with the handle's default encoding (UTF-8), not utf-8-sig
with open('path_to_f', 'w') as f:
    df.to_csv(f, encoding='utf-8-sig')

# BOM is present: the file is written with the passed encoding, utf-8-sig
df.to_csv('path_to_f', encoding='utf-8-sig')

I expect:

with open('path_to_f', 'w') as f:
    df.to_csv(f, encoding='utf-8-sig')

To crash with TypeError: write() argument must be str, not bytes

and I expect:

with open('path_to_f', 'wb') as f:
    df.to_csv(f, encoding='utf-8-sig')

To write the file correctly.
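
The expectation above follows from how Python 3 file objects behave on their own, independent of pandas: a text-mode handle accepts only `str` and applies its own encoding, while a binary-mode handle accepts bytes that are already encoded. A minimal standard-library sketch of that distinction, where plain `write` calls stand in for whatever `to_csv` would write and the temporary file names are purely illustrative:

```python
import codecs
import os
import tempfile

payload = '""\n'                        # stand-in for the CSV text
encoded = payload.encode('utf-8-sig')   # BOM followed by the UTF-8 bytes

tmpdir = tempfile.mkdtemp()
text_path = os.path.join(tmpdir, 'text_mode.csv')      # illustrative name
binary_path = os.path.join(tmpdir, 'binary_mode.csv')  # illustrative name

# Text-mode handle: writing encoded bytes is rejected with a TypeError.
text_mode_error = None
with open(text_path, 'w') as f:
    try:
        f.write(encoded)
    except TypeError as exc:
        text_mode_error = str(exc)

# Binary-mode handle: the encoded bytes, BOM included, are written as-is.
with open(binary_path, 'wb') as f:
    f.write(encoded)

with open(binary_path, 'rb') as f:
    binary_contents = f.read()

print(text_mode_error)
print(binary_contents.startswith(codecs.BOM_UTF8))
```

This only illustrates the file-object semantics the expectation relies on; it says nothing about how `to_csv` is implemented.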

Copy-paste reproduction

#!/usr/bin/env python3
import pandas as pd
df = pd.DataFrame()
# is missing the UTF8 BOM (the handle's default encoding is used, not utf-8-sig)
with open('file_one', 'w') as f:
    df.to_csv(f, encoding='utf-8-sig')

assert open('file_one', 'rb').read() == b'""\n'

# is not missing the UTF8 BOM (encoded with passed encoding utf-8-sig)
df.to_csv('file_two', encoding='utf-8-sig')
assert open('file_two', 'rb').read() == b'\xef\xbb\xbf""\n'
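
One possible workaround, assuming `to_csv` writes `str` to a text-mode handle as the `file_one` assertion suggests, is to put the encoding on `open()` itself so the handle adds the BOM. This is a sketch rather than a fix for the issue; the one-column frame and the `file_three` name are illustrative, and `newline=''` is passed so the handle does not translate line endings.

```python
import codecs
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'a': [1]})  # illustrative data, not from the report
path = os.path.join(tempfile.mkdtemp(), 'file_three')  # illustrative name

# The handle, not to_csv, owns the encoding here, so the BOM comes from open().
with open(path, 'w', encoding='utf-8-sig', newline='') as f:
    df.to_csv(f, index=False)

with open(path, 'rb') as f:
    raw = f.read()

print(raw.startswith(codecs.BOM_UTF8))
```

The `encoding` argument is deliberately not passed to `to_csv` in this sketch, since the report shows it has no effect when a handle is given.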

Metadata

Labels: Bug, Error Reporting, IO CSV, Unicode