ENH: Support Multi-Index for columns in parquet format #34777

Closed
@yohplala

Is your feature request related to a problem?

I would like to save a DataFrame whose columns are a MultiIndex to parquet format.
This is currently not possible with DataFrame.to_parquet.

import pandas as pd
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                  columns=pd.MultiIndex.from_product([['1'], ['a', 'b', 'c']]))
df.to_parquet('test.parquet')
# Fails whatever the engine:
# ValueError: parquet must have string column names

Describe alternatives you've considered

With the library in its current state, a workaround has been shared on Stack Overflow.
I will use it for now.

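import pyarrow as pa
import pyarrow.parquet as pq
# Convert the DataFrame to an Arrow table directly, bypassing DataFrame.to_parquet
table = pa.Table.from_pandas(df)
# Write the Arrow table with pyarrow
pq.write_table(table, 'test.parquet')
# Read the file back with pandas
df_test_read = pd.read_parquet('test.parquet')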
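
Another possible workaround, sketched here only as an illustration and not taken from the Stack Overflow answer, is to flatten the MultiIndex columns into plain strings before writing and rebuild them after reading. The helper names `flatten_columns` and `restore_columns` and the separator `SEP` are hypothetical, and the sketch assumes the separator never appears in a column label. Note that every level comes back as strings.

```python
import pandas as pd

SEP = "|"  # hypothetical separator; assumed not to occur in any column label


def flatten_columns(df, sep=SEP):
    """Return a copy whose MultiIndex columns are joined into plain strings."""
    out = df.copy()
    out.columns = [sep.join(map(str, key)) for key in df.columns]
    return out


def restore_columns(df, sep=SEP, names=None):
    """Rebuild MultiIndex columns from strings produced by flatten_columns."""
    out = df.copy()
    out.columns = pd.MultiIndex.from_tuples(
        [tuple(col.split(sep)) for col in df.columns], names=names
    )
    return out


df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    columns=pd.MultiIndex.from_product([["1"], ["a", "b", "c"]]),
)

flat = flatten_columns(df)  # columns are now "1|a", "1|b", "1|c"
# flat.to_parquet("test.parquet") and pd.read_parquet("test.parquet")
# would sit between these two steps; they are left out so the sketch
# runs without a parquet engine installed.
restored = restore_columns(flat)
```

Since the flattened frame has plain string column names, it should pass the check that raises the ValueError above, at the cost of having to remember the separator on the reading side.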

Thanks for your help!
Bests,
