Description
Dear all,
Following up from the mailing list: https://groups.google.com/forum/#!topic/pydata/jKiPOvYUQ1c
I have an Excel table of family data like this:
Family | People | Mean size [cm] |
---|---|---|
Foo | 5 | 173.0 |
Bar | 3 | 189.0 |
and I would like to use read_excel to parse it into Python. I would like "People" to be read as an integer, "Mean size [cm]" as a float. (And "Family" as a string, but that might be a different issue.) Now:
- if I set convert_float=True, the last column is read as int
- if I set convert_float=False, the second column is read as float
Neither is correct, for a silly reason: every value in the size column happens to end in .0! So I would like to specify something like:
convert_float = ['People']
so only that column gets converted. An even better solution would be to be explicit about types of some columns, letting pandas perform the automagic for the others, such as:
read_excel('foo.xlsx', types={'People': np.uint8, 'Family': 'S3'})
but this changes the signature of the function more significantly.
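To make the intended result concrete, here is a minimal sketch of the per-column behaviour, done as a post-processing step rather than inside read_excel. The helper name apply_column_types is hypothetical and not part of pandas; the raw frame is a hand-built stand-in for what read_excel would return with convert_float=False, so no Excel file is needed. It relies only on the existing DataFrame.astype, which accepts a column-to-dtype mapping.

```python
import numpy as np
import pandas as pd

# Stand-in (assumption) for the result of read_excel with convert_float=False:
# every numeric cell comes back as a float.
raw = pd.DataFrame(
    {
        "Family": ["Foo", "Bar"],
        "People": [5.0, 3.0],
        "Mean size [cm]": [173.0, 189.0],
    }
)


def apply_column_types(df, types):
    """Hypothetical helper: cast only the listed columns, leave the others as parsed."""
    # DataFrame.astype accepts a {column: dtype} mapping and does not touch
    # columns that are missing from the mapping.
    return df.astype(types)


typed = apply_column_types(raw, {"People": np.uint8})
print(typed.dtypes)
```

With this, "People" becomes an unsigned integer column while "Mean size [cm]" stays float, which is the outcome the proposed types argument is meant to produce directly at read time.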
Are you folks in favour of any of this? If yes, I can take a look and try to implement it.