ENH: read_excel dtypes and converts #8212

Closed
@iosonofabio

Dear all,

Here from the mailing list: https://groups.google.com/forum/#!topic/pydata/jKiPOvYUQ1c

I have an Excel table of families like this:

Family  People  Mean size [cm]
Foo     5       173.0
Bar     3       189.0

and I would like to use read_excel to read it into Python. I would like "People" to be read as an integer and "Mean size [cm]" as a float. (And "Family" as a string, but that might be a different issue.) Now:

  • if I set convert_float=True, the last column reads as int
  • if I set convert_float=False, the second column reads as float

Neither is correct, for a silly reason: every size happens to end in .0! So I would like to specify something like:

convert_float = ['People']

so only that column gets converted. An even better solution would be to be explicit about types of some columns, letting pandas perform the automagic for the others, such as:

read_excel('foo.xlsx', types={'People': np.uint8, 'Family': 'S3'})

but this changes the signature of the function more significantly.
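
As a rough illustration of the semantics proposed above (explicit types for some columns, automatic inference for the rest), here is a minimal pure-Python sketch. It is not pandas code: the function name `apply_column_types`, the list-of-dicts row layout, and the use of plain callables as type specifiers are all assumptions made for the example. The fallback branch mimics what convert_float=True does, turning integral floats into ints.

```python
def apply_column_types(rows, types):
    """Convert the named columns explicitly; infer the rest.

    rows  -- list of dicts, one per spreadsheet row (hypothetical layout)
    types -- mapping of column name to a callable such as int, float or str
    """
    typed_rows = []
    for row in rows:
        typed = {}
        for name, value in row.items():
            if name in types:
                # An explicit request wins over any inference.
                typed[name] = types[name](value)
            elif isinstance(value, float) and value.is_integer():
                # Default inference, mimicking convert_float=True.
                typed[name] = int(value)
            else:
                typed[name] = value
        typed_rows.append(typed)
    return typed_rows


# Excel stores every numeric cell as a float, so a reader sees 5.0, not 5.
rows = [
    {"Family": "Foo", "People": 5.0, "Mean size [cm]": 173.0},
    {"Family": "Bar", "People": 3.0, "Mean size [cm]": 189.0},
]

# Pin only the size column to float; "People" falls through to inference.
result = apply_column_types(rows, {"Mean size [cm]": float})
print(result[0])
```

With only the size column pinned, "People" comes back as an int and "Mean size [cm]" stays a float, which is the outcome neither global convert_float setting can produce.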

Are you folks in favour of any of this? If yes, I can take a look and try to implement it.
