Commit 84543af

ENH: support Arrow PyCapsule Interface on Series for export
1 parent 0c24b20 commit 84543af

3 files changed: 61 additions, 0 deletions

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ Other enhancements
 - Users can globally disable any ``PerformanceWarning`` by setting the option ``mode.performance_warnings`` to ``False`` (:issue:`56920`)
 - :meth:`Styler.format_index_names` can now be used to format the index and column names (:issue:`48936` and :issue:`47489`)
 - :class:`.errors.DtypeWarning` improved to include column names when mixed data types are detected (:issue:`58174`)
+- :class:`Series` now supports the Arrow PyCapsule Interface for export (:issue:`59518`)
 - :func:`DataFrame.to_excel` argument ``merge_cells`` now accepts a value of ``"columns"`` to only merge :class:`MultiIndex` column header header cells (:issue:`35384`)
 - :meth:`DataFrame.corrwith` now accepts ``min_periods`` as optional arguments, as in :meth:`DataFrame.corr` and :meth:`Series.corr` (:issue:`9490`)
 - :meth:`DataFrame.cummin`, :meth:`DataFrame.cummax`, :meth:`DataFrame.cumprod` and :meth:`DataFrame.cumsum` methods now have a ``numeric_only`` parameter (:issue:`53072`)

pandas/core/series.py

Lines changed: 34 additions & 0 deletions
@@ -34,6 +34,7 @@
 from pandas._libs.lib import is_range_indexer
 from pandas.compat import PYPY
 from pandas.compat._constants import REF_COUNT
+from pandas.compat._optional import import_optional_dependency
 from pandas.compat.numpy import function as nv
 from pandas.errors import (
     ChainedAssignmentError,
@@ -558,6 +559,39 @@ def _init_dict(
 
     # ----------------------------------------------------------------------
 
+    def __arrow_c_stream__(self, requested_schema=None):
+        """
+        Export the pandas Series as an Arrow C stream PyCapsule.
+
+        This relies on pyarrow to convert the pandas Series to the Arrow
+        format (and follows the default behaviour of ``pyarrow.Array.from_pandas``
+        in its handling of the index, i.e. to ignore it).
+        This conversion is not necessarily zero-copy.
+
+        Parameters
+        ----------
+        requested_schema : PyCapsule, default None
+            The schema to which the dataframe should be casted, passed as a
+            PyCapsule containing a C ArrowSchema representation of the
+            requested schema.
+
+        Returns
+        -------
+        PyCapsule
+        """
+        pa = import_optional_dependency("pyarrow", min_version="16.0.0")
+        if requested_schema is not None:
+            # todo: how should this be supported?
+            msg = (
+                "Passing `requested_schema` to `Series.__arrow_c_stream__` is not yet "
+                "supported"
+            )
+            raise NotImplementedError(msg)
+        ca = pa.chunked_array([pa.Array.from_pandas(self, type=requested_schema)])
+        return ca.__arrow_c_stream__()
+
+    # ----------------------------------------------------------------------
+
     @property
     def _constructor(self) -> type[Series]:
         return Series
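
For readers who want to see the new method from the consumer side, here is a minimal sketch, not part of the commit. It assumes a pandas build that includes this change and pyarrow 16 or newer. The helper name `series_to_arrow_values` is hypothetical, and it returns `None` when those assumptions do not hold rather than failing.

```python
def series_to_arrow_values(values):
    """Round-trip values through Series.__arrow_c_stream__ (hypothetical helper).

    Returns a plain Python list, or None if the environment lacks the
    required pandas/pyarrow support.
    """
    try:
        import pandas as pd
        import pyarrow as pa
    except ImportError:
        return None  # optional dependencies are not installed

    s = pd.Series(values)
    if not hasattr(s, "__arrow_c_stream__"):
        return None  # this pandas build predates the commit

    if int(pa.__version__.split(".")[0]) < 16:
        return None  # the method requires pyarrow >= 16.0.0

    # pyarrow detects __arrow_c_stream__ on the Series and imports the
    # exported stream; the Series index is ignored, as the docstring states.
    return pa.chunked_array(s).to_pylist()


result = series_to_arrow_values([1, 4, 2])
```

At this commit, passing a non-`None` `requested_schema` raises `NotImplementedError`, so the sketch only exercises the default path.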
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+import ctypes
+
+import pytest
+
+import pandas.util._test_decorators as td
+
+import pandas as pd
+
+pa = pytest.importorskip("pyarrow")
+
+
+@td.skip_if_no("pyarrow", min_version="16.0")
+def test_series_arrow_interface():
+    s = pd.Series([1, 4, 2])
+
+    capsule = s.__arrow_c_stream__()
+    assert (
+        ctypes.pythonapi.PyCapsule_IsValid(
+            ctypes.py_object(capsule), b"arrow_array_stream"
+        )
+        == 1
+    )
+
+    ca = pa.chunked_array(s)
+    expected = pa.chunked_array([[1, 4, 2]])
+    assert ca.equals(expected)
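
The first assertion in the test relies on CPython's `PyCapsule_IsValid`, which reports 1 only when the object is a capsule whose name matches the given string. The sketch below, which is not part of the commit, illustrates that behaviour with the standard library alone. The capsules it builds are dummies wrapping an unrelated integer, not real Arrow streams, and the helper names `make_dummy_capsule` and `is_arrow_stream_capsule` are hypothetical.

```python
import ctypes

# Capsule name used for Arrow C stream exports, as checked in the test.
STREAM_NAME = b"arrow_array_stream"
# A different name, used only to show a failing check.
OTHER_NAME = b"arrow_array"

# Dummy payload; PyCapsule_New requires a non-NULL pointer.
_PAYLOAD = ctypes.c_int(0)

_new = ctypes.pythonapi.PyCapsule_New
_new.restype = ctypes.py_object
_new.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

_is_valid = ctypes.pythonapi.PyCapsule_IsValid
_is_valid.restype = ctypes.c_int
_is_valid.argtypes = [ctypes.py_object, ctypes.c_char_p]


def make_dummy_capsule(name):
    """Wrap the dummy payload in a capsule with the given name.

    The capsule stores the name pointer without copying it, so `name`
    must outlive the capsule (module-level constants are used here).
    The result must never be handed to an Arrow consumer.
    """
    return _new(ctypes.addressof(_PAYLOAD), name, None)


def is_arrow_stream_capsule(obj):
    """Return True if obj is a capsule named 'arrow_array_stream'."""
    return _is_valid(obj, STREAM_NAME) == 1


good = is_arrow_stream_capsule(make_dummy_capsule(STREAM_NAME))
wrong_name = is_arrow_stream_capsule(make_dummy_capsule(OTHER_NAME))
not_a_capsule = is_arrow_stream_capsule(object())
```

The check only validates the capsule's name and pointer, which is why the test goes on to compare the imported `ChunkedArray` against the expected values.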
