usecols doesn't help with unclean CSVs #9549

Closed
@harshnisar

Description

I have a lot of CSVs that are clean up to, say, column 7, with no missing values, but have stray strings at random places after column 7.

I know the data is clean only up to column 7. So when I pass usecols listing those 7 columns, I want read_csv to ignore the other columns, perhaps by truncating the rest of each row while reading. Shouldn't that be the behaviour?
I don't want to skip over bad lines.

Is there a way to force pandas to read only 7 columns, expecting 7 fields in each row, so that it doesn't raise an exception?
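To make the request concrete, here is a minimal sketch of the truncation behaviour done outside of `read_csv`, using the standard `csv` module to cut every row down to the clean prefix before pandas sees it. The helper name `read_clean_prefix`, the sample data, and the use of 3 clean columns instead of 7 are assumptions made purely for illustration; this is a workaround sketch, not something pandas provides.

```python
import csv
import io

import pandas as pd

# Illustrative sample: 3 clean columns, then stray fields on some rows.
# (The files described above would have 7 clean columns.)
raw = (
    "a,b,c\n"
    "1,2,3\n"
    "4,5,6,junk,more junk\n"
    "7,8,9,x\n"
)


def read_clean_prefix(text_stream, n_clean):
    """Hypothetical helper: keep only the first n_clean fields of each row."""
    cleaned = io.StringIO()
    writer = csv.writer(cleaned, lineterminator="\n")
    for row in csv.reader(text_stream):
        # Truncate the row; anything past the clean prefix is discarded.
        writer.writerow(row[:n_clean])
    cleaned.seek(0)
    # Every row now has at most n_clean fields, so the parser sees a regular table.
    return pd.read_csv(cleaned)


df = read_clean_prefix(io.StringIO(raw), n_clean=3)
print(df)
```

This keeps the real header names and no line is skipped, at the cost of an extra pass over the file in Python.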

Another approach is to pass names=range(35), where 35 is an arbitrarily large number. But then I lose the real headers in my file and can't tell what each column means. These columns are not fixed.

Edit: This is my first issue report on a large Python package. Please bear with me if I didn't follow protocol.
