API: DataFrame(list_with_ea) #49593

Open
@jbrockmendel

Description

DataFrame(nested_list) generally treats each list entry as a row. The exception (until #49592) was when the first element is a Categorical, in which case we treated the elements as columns. We have zero tests passing a list[EA] except when the first entry is a Categorical, which suggests to me that the behavior here has not gotten much attention.

For non-EA and some EAs (Period, dt64tz, interval), dtype inference basically works right. But for Categorical, pyarrow, and presumably 3rd-party EAs, it doesn't:

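import numpy as np
import pandas as pd
# Build a 3x3 frame backed by pyarrow int64 arrays (requires pyarrow to be installed)
df = pd.DataFrame(np.arange(9).reshape(3, 3)).astype("int64[pyarrow]")
arrs = [df[i]._values for i in range(3)]  # one ExtensionArray per column
pd.DataFrame(arrs).dtypes  # <- back to numpy dtypes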

We could reasonably use concat-like logic for dtype inference (or, for EAs that support 2D, just concat directly).
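To make the concat-like idea concrete, here is a minimal sketch of inferring a common dtype for a list of EAs by wrapping each one in a Series and letting pd.concat pick the result dtype. The helper name infer_common_dtype is hypothetical, and the example uses the built-in nullable "Int64" dtype instead of pyarrow so it runs without extra dependencies. It only illustrates the inference step, not how the constructor would actually be changed.

```python
import pandas as pd


def infer_common_dtype(rows):
    """Hypothetical helper: find the dtype pd.concat would pick for these arrays."""
    if not rows:
        raise ValueError("need at least one row to infer a dtype")
    # Wrapping each array in a Series keeps its extension dtype; concat then
    # applies pandas' usual common-dtype resolution across all of them.
    combined = pd.concat([pd.Series(r) for r in rows], ignore_index=True)
    return combined.dtype


rows = [
    pd.array([1, 2, 3], dtype="Int64"),
    pd.array([4, 5, 6], dtype="Int64"),
]
common = infer_common_dtype(rows)
```

The constructor could then cast the resulting columns to this dtype; whether that is the right place for the logic is exactly the open API question here.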

Another approach that might work is to do something like DataFrame({i: rows[i] for i in range(len(rows))}).T. That works in the pyarrow example above.
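A minimal sketch of that dict-then-transpose approach, assuming all rows have the same length and the same extension dtype. The helper name frame_from_rows is hypothetical, and the nullable "Int64" dtype stands in for pyarrow so the example has no extra dependencies.

```python
import pandas as pd


def frame_from_rows(rows):
    """Hypothetical helper: build a frame where each array in `rows` is one row."""
    # Passing a dict makes each array a column, which preserves its dtype.
    as_columns = pd.DataFrame({i: rows[i] for i in range(len(rows))})
    # Transposing turns those columns into rows; for a frame whose columns
    # all share one extension dtype, pandas keeps that dtype in the result.
    return as_columns.T


rows = [
    pd.array([1, 2, 3], dtype="Int64"),
    pd.array([4, 5, 6], dtype="Int64"),
]
result = frame_from_rows(rows)
```

With mixed dtypes across rows, the transpose would have to fall back to a common dtype, so this is probably only a partial answer to the inference question.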

Metadata

    Labels

    Constructors: Series/DataFrame/Index/pd.array Constructors
