Provide a way to convert Arrow tables to Arrow-backed dataframes

As far as I can see, given a PyArrow table, there is no easy way to get a DataFrame with PyArrow dtypes.

I'd expect these idioms to work:

```python
import pyarrow
import pandas

arrow_u8 = pyarrow.array([1, 2, 3], type=pyarrow.uint8())
arrow_f64 = pyarrow.array([1., 2., 3.], type=pyarrow.float64())
table = pyarrow.table([arrow_u8, arrow_f64], names=['u8', 'f64'])

# Using the PyArrow `to_pandas` method returns NumPy-backed data
df = table.to_pandas()

# Using the constructor with a PyArrow table raises: ValueError: DataFrame constructor not properly called!
df = pandas.DataFrame(table)

# This is not implemented (the method doesn't exist)
df = pandas.DataFrame.from_arrow(table)

# Naively creating a dataframe column by column from the Arrow arrays will use NumPy dtypes
df = pandas.DataFrame({'u8': arrow_u8,
                       'f64': arrow_f64})
```

I think the easiest way to do the conversion right now is something like this:

```python
df = pandas.DataFrame({name: pandas.Series(array,
                                            dtype=pandas.ArrowDtype(array.type))
                        for array, name
                        in zip(table.columns, table.column_names)})
```

@pandas-dev/pandas-core Given that Arrow dtypes are one of the highlights of pandas 2.0, shouldn't we provide at least one easy way to convert before the release?

Provide a way to convert Arrow tables to Arrow-backed dataframes #51760
