API: read_csv inconsistent with from_csv -- parses ints as dates #3418

Closed
@darindillon

Using pandas 0.10.1.
I read the docs but didn't see any explanation for this. pandas.read_csv() works exactly as you'd expect, but pandas.DataFrame.from_csv() behaves differently. It looks like the latter method assumes you're probably dealing with time series data, so its default parameters automatically convert integers to dates. I disagree that this is desirable, but even if it is, why would it be true for the latter method but not the former? Why shouldn't both methods use the same defaults?

Create a CSV like this:
a,b
1,4
2,3

Now this does exactly what you'd expect:
p = pandas.read_csv(your_csv_file)

But this converts the first column into dates, which is almost certainly not what you'd expect:
p = pandas.DataFrame.from_csv(your_csv_file)

The second method has an optional "parse_dates" parameter, which defaults to True. If you pass parse_dates=False, the integers are no longer converted to dates. But why the inconsistency? I'd expect this method to default to acting just like the other one.
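
For reference, a minimal sketch of the comparison, using an in-memory copy of the CSV above. It assumes that from_csv differs from read_csv only in its defaults (index_col=0 and parse_dates=True, per its documented signature), and it spells those arguments out on read_csv rather than calling from_csv, because from_csv was removed in later pandas releases. The names CSV_TEXT, plain and indexed are illustrative only.

```python
import io

import pandas as pd

# The same two-column CSV described above, held in memory.
CSV_TEXT = "a,b\n1,4\n2,3\n"

# Plain read_csv: both columns stay ordinary integer columns.
plain = pd.read_csv(io.StringIO(CSV_TEXT))

# Assumed from_csv-style call with date parsing switched off:
# the first column becomes the index, but its values stay integers.
indexed = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0, parse_dates=False)

print(plain)
print(indexed)
```

Even with parse_dates=False, the first column is still used as the index in this sketch, so the result only matches the plain read_csv call once index_col is left at its default as well.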

Labels: Bug, IO CSV (read_csv, to_csv), Output-Formatting (__repr__ of pandas objects, to_string)