Closed
Description
So here's something I would like. As an avid pandas user, I'd like to be able to write a dataframe to a CSV file and read it back with the dtype of each column preserved.
Reading up on pandas, I thought this would do the trick in the most Pythonic way:
import ast
import pandas as pd
# dataframe as example
df = pd.DataFrame(data={'int': [1, 2, 3],
                        'float': [1.0, 2.0, 3.0],
                        'bool': [True, False, True],
                        'date': ['2018-03-01', '1973-09-09', '2009-05-20']})
df.date = df.date.astype('datetime64[ns]')
# write .csv with a leading comment line that lists the dtypes
with open('test.csv', 'w') as f:
    f.write('#' + str(df.dtypes.apply(lambda x: x.name).to_dict()) + '\n')
    df.to_csv(f, index=False)
# read .csv, using the comment line to recover the dtypes and date columns
with open('test.csv', 'r') as f:
    type_header = f.readline()
    dtypes = ast.literal_eval(type_header[type_header.index('#') + 1:].strip())
    # datetime columns go through parse_dates rather than dtype=
    # (timedelta columns are not handled here)
    parse_dates = [k for k, v in dtypes.items() if v.startswith('datetime64')]
    dtypes = {k: v for k, v in dtypes.items() if k not in parse_dates}
    foo = pd.read_csv(f, comment='#', dtype=dtypes, parse_dates=parse_dates)
# compare the dtypes column by column
(foo.dtypes == df.dtypes).all()
Is this worth including in pandas, or is it not generic enough, in which case I should just hack my own extension onto the DataFrame class?
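For reference, the "own extension" route could look roughly like the sketch below. This is not pandas API: `to_csv_with_dtypes` and `read_csv_with_dtypes` are hypothetical helper names, the header is stored as JSON instead of `str()` plus `ast.literal_eval`, and it assumes string column names and timezone-naive datetime columns.

```python
import json

import pandas as pd


def to_csv_with_dtypes(df, path):
    """Write df to path, preceded by a '#'-prefixed JSON line of column dtypes."""
    dtypes = {col: str(dtype) for col, dtype in df.dtypes.items()}
    with open(path, 'w', newline='') as f:
        f.write('#' + json.dumps(dtypes) + '\n')
        df.to_csv(f, index=False)


def read_csv_with_dtypes(path):
    """Read a file written by to_csv_with_dtypes and restore the column dtypes."""
    with open(path, 'r', newline='') as f:
        header = f.readline()
        if not header.startswith('#'):
            raise ValueError('expected a "#"-prefixed dtype header on the first line')
        dtypes = json.loads(header[1:])
        # datetime columns are parsed via parse_dates, everything else via dtype=
        date_cols = [c for c, t in dtypes.items() if t.startswith('datetime64')]
        plain = {c: t for c, t in dtypes.items() if c not in date_cols}
        out = pd.read_csv(f, dtype=plain, parse_dates=date_cols)
    # cast the parsed dates back to the exact datetime dtype recorded in the header
    return out.astype({c: dtypes[c] for c in date_cols})
```

With the example frame above, `to_csv_with_dtypes(df, 'test.csv')` followed by `read_csv_with_dtypes('test.csv')` is meant to give back the same dtypes. Timedelta, categorical, and timezone-aware columns would need extra handling that this sketch leaves out.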