pd.to/read_sql_table silently corrupts Categorical columns #8624

Closed
@kay1793

Description

In [29]: import pandas as pd
    ...: from sqlalchemy import create_engine
    ...: engine = create_engine('sqlite://')
    ...: df = pd.DataFrame([[1, 'John P. Doe'], [2, 'Jane Dove'], [1, 'John P. Doe']],
    ...:                   columns=['person_id', 'person_name'])
    ...: # data1: person_name is a plain object column
    ...: df.to_sql('data1', engine)
    ...: # data2: same data, but person_name is a Categorical
    ...: df['person_name'] = pd.Categorical(df.person_name)
    ...: df.to_sql('data2', engine)
    ...: print(pd.read_sql_table('data1', engine))
    ...: print(pd.read_sql_table('data2', engine))
   index  person_id  person_name
0      0          1  John P. Doe
1      1          2    Jane Dove
2      2          1  John P. Doe
   index  person_id person_name
0      0          1           J
1      1          2           o
2      2          1           h
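
Until the round trip handles Categorical columns, one possible workaround is to cast them to plain objects before writing and convert back after reading. The sketch below is only an illustration under a few assumptions: it uses a stdlib `sqlite3` connection instead of the SQLAlchemy engine above, reads with `read_sql_query` rather than `read_sql_table`, and the names `plain` and `restored` are made up for the example.

```python
import sqlite3

import pandas as pd

df = pd.DataFrame(
    [[1, "John P. Doe"], [2, "Jane Dove"], [1, "John P. Doe"]],
    columns=["person_id", "person_name"],
)
df["person_name"] = pd.Categorical(df["person_name"])

conn = sqlite3.connect(":memory:")

# Cast every categorical column to object so only plain strings are written.
cat_cols = df.select_dtypes(include="category").columns
plain = df.astype({col: object for col in cat_cols})
plain.to_sql("data2", conn, index=False)

# Read the strings back and rebuild the categorical dtype on the pandas side.
restored = pd.read_sql_query("SELECT * FROM data2 ORDER BY rowid", conn)
restored["person_name"] = restored["person_name"].astype("category")
print(restored)
```

This keeps the stored data intact, but every string still travels over the wire in full, so it does not give the savings discussed below.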

Using a relational DB to store categorical columns in separate tables would be very cool, and rebuilding the frame in pandas by JOINing the multiple tables would save time on the wire. It would also save memory if the Categorical were built directly.
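
A rough sketch of what that idea could look like when done by hand. This is not how `to_sql` behaves. The table names (`person_names`, `data`), the `name_code` column, and the use of a stdlib `sqlite3` connection are all assumptions made for the illustration. Missing values (code `-1`) are not handled here.

```python
import sqlite3

import pandas as pd

df = pd.DataFrame(
    [[1, "John P. Doe"], [2, "Jane Dove"], [1, "John P. Doe"]],
    columns=["person_id", "person_name"],
)
df["person_name"] = pd.Categorical(df["person_name"])

conn = sqlite3.connect(":memory:")
cat = df["person_name"].cat

# Lookup table (hypothetical name): one row per category, keyed by its code.
lookup = pd.DataFrame(
    {"name_code": range(len(cat.categories)), "person_name": cat.categories}
)
lookup.to_sql("person_names", conn, index=False)

# Main table: store the small integer codes instead of the repeated strings.
facts = pd.DataFrame(
    {"person_id": df["person_id"], "name_code": cat.codes.astype("int64")}
)
facts.to_sql("data", conn, index=False)

# Option 1: let the database JOIN and hand back plain strings.
joined = pd.read_sql_query(
    "SELECT d.person_id, n.person_name FROM data d "
    "JOIN person_names n ON d.name_code = n.name_code ORDER BY d.rowid",
    conn,
)

# Option 2: transfer codes and categories separately, then build the
# Categorical directly so the repeated strings never cross the wire.
codes = pd.read_sql_query(
    "SELECT person_id, name_code FROM data ORDER BY rowid", conn
)
names = pd.read_sql_query(
    "SELECT person_name FROM person_names ORDER BY name_code", conn
)
rebuilt = pd.DataFrame(
    {
        "person_id": codes["person_id"],
        "person_name": pd.Categorical.from_codes(
            codes["name_code"], names["person_name"]
        ),
    }
)
print(rebuilt)
```

Option 2 is the one that matches the memory argument: the frame is assembled from integer codes plus a short list of categories, without ever materializing a full column of strings.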

Metadata

Labels: Bug; Categorical (Categorical Data Type); IO SQL (to_sql, read_sql, read_sql_query)
