ENH: support the Arrow PyCapsule Interface for importing data #59631

Open
@jorisvandenbossche

We have #56587 and #59518 now for exporting pandas DataFrame and Series through the Arrow PyCapsule Interface (i.e. adding __arrow_c_stream__ methods), but we don't yet have the import counterpart.

For importing, the specification doesn't provide any API guidelines on what this should look like, so we have a couple of options. The two main ones I can think of:

  • Add a dedicated from_arrow() method, which could be top level (pd.from_arrow(..)) or as class methods (pd.DataFrame.from_arrow(..))
  • Support such objects directly in the main constructors (pd.DataFrame(..))

pandas itself already has a couple of from_.. class methods (from_dict/from_records), often for objects that the main constructor also accepts (at least in the dict case). I think the main differentiator is that the dedicated class methods have more specialized keyword arguments (and therefore allow a larger variety of input).
So based on that pattern, we could also do both: add a DataFrame.from_arrow() class method, and also accept such objects in pd.DataFrame(), passing them through to from_arrow() (which could have more custom options to control exactly how the conversion from Arrow to pandas is done).

Looking at polars, it seems they also have both, but I am not entirely sure how the two are connected. pl.from_arrow already existed, but might be more specific to pyarrow? And then pola-rs/polars#17693 added support for such objects to the main pl.DataFrame(..) constructor (@kylebarron).

For geopandas, I added a GeoDataFrame.from_arrow() method.

(To be clear, everything said above also applies to Series() / Series.from_arrow(), etc.)

cc @MarcoGorelli @WillAyd
