Commit bb4ab4f

ENH: support Arrow PyCapsule Interface on Series for export (#59587)
* ENH: support Arrow PyCapsule Interface on Series for export
* simplify
* simplify
1 parent d31aa83 commit bb4ab4f

3 files changed, +51 -0 lines changed
doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ Other enhancements
 - Users can globally disable any ``PerformanceWarning`` by setting the option ``mode.performance_warnings`` to ``False`` (:issue:`56920`)
 - :meth:`Styler.format_index_names` can now be used to format the index and column names (:issue:`48936` and :issue:`47489`)
 - :class:`.errors.DtypeWarning` improved to include column names when mixed data types are detected (:issue:`58174`)
+- :class:`Series` now supports the Arrow PyCapsule Interface for export (:issue:`59518`)
 - :func:`DataFrame.to_excel` argument ``merge_cells`` now accepts a value of ``"columns"`` to only merge :class:`MultiIndex` column header header cells (:issue:`35384`)
 - :meth:`DataFrame.corrwith` now accepts ``min_periods`` as optional arguments, as in :meth:`DataFrame.corr` and :meth:`Series.corr` (:issue:`9490`)
 - :meth:`DataFrame.cummin`, :meth:`DataFrame.cummax`, :meth:`DataFrame.cumprod` and :meth:`DataFrame.cumsum` methods now have a ``numeric_only`` parameter (:issue:`53072`)

pandas/core/series.py

Lines changed: 27 additions & 0 deletions
@@ -34,6 +34,7 @@
 from pandas._libs.lib import is_range_indexer
 from pandas.compat import PYPY
 from pandas.compat._constants import REF_COUNT
+from pandas.compat._optional import import_optional_dependency
 from pandas.compat.numpy import function as nv
 from pandas.errors import (
     ChainedAssignmentError,
@@ -558,6 +559,32 @@ def _init_dict(
 
     # ----------------------------------------------------------------------
 
+    def __arrow_c_stream__(self, requested_schema=None):
+        """
+        Export the pandas Series as an Arrow C stream PyCapsule.
+
+        This relies on pyarrow to convert the pandas Series to the Arrow
+        format (and follows the default behaviour of ``pyarrow.Array.from_pandas``
+        in its handling of the index, i.e. to ignore it).
+        This conversion is not necessarily zero-copy.
+
+        Parameters
+        ----------
+        requested_schema : PyCapsule, default None
+            The schema to which the dataframe should be casted, passed as a
+            PyCapsule containing a C ArrowSchema representation of the
+            requested schema.
+
+        Returns
+        -------
+        PyCapsule
+        """
+        pa = import_optional_dependency("pyarrow", min_version="16.0.0")
+        ca = pa.chunked_array([pa.Array.from_pandas(self, type=requested_schema)])
+        return ca.__arrow_c_stream__(requested_schema)
+
+    # ----------------------------------------------------------------------
+
     @property
     def _constructor(self) -> type[Series]:
         return Series
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+import ctypes
+
+import pytest
+
+import pandas as pd
+
+pa = pytest.importorskip("pyarrow", minversion="16.0")
+
+
+def test_series_arrow_interface():
+    s = pd.Series([1, 4, 2])
+
+    capsule = s.__arrow_c_stream__()
+    assert (
+        ctypes.pythonapi.PyCapsule_IsValid(
+            ctypes.py_object(capsule), b"arrow_array_stream"
+        )
+        == 1
+    )
+
+    ca = pa.chunked_array(s)
+    expected = pa.chunked_array([[1, 4, 2]])
+    assert ca.equals(expected)
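
The capsule check in the test above could be factored into a small reusable helper. The sketch below uses only `ctypes` and the CPython C API; the helper name `is_arrow_stream_capsule` is hypothetical and not part of pandas or pyarrow. It declares the argument and return types of `PyCapsule_IsValid` explicitly instead of relying on ctypes defaults.

```python
import ctypes

# Capsule name the test above checks for on the exported stream.
ARROW_STREAM_CAPSULE_NAME = b"arrow_array_stream"

# int PyCapsule_IsValid(PyObject *capsule, const char *name)
_capsule_is_valid = ctypes.pythonapi.PyCapsule_IsValid
_capsule_is_valid.restype = ctypes.c_int
_capsule_is_valid.argtypes = [ctypes.py_object, ctypes.c_char_p]


def is_arrow_stream_capsule(obj) -> bool:
    """Return True if `obj` is a PyCapsule named b"arrow_array_stream".

    Hypothetical helper: it only checks the capsule type and name, not
    the contents of the underlying ArrowArrayStream struct.
    """
    return _capsule_is_valid(obj, ARROW_STREAM_CAPSULE_NAME) == 1
```

With pandas and pyarrow 16.0 or newer installed, `is_arrow_stream_capsule(pd.Series([1, 4, 2]).__arrow_c_stream__())` should be equivalent to the assertion in the test.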