
ENH: Pandas Tensor Data Type #59006

Open
@bionicles

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

While reviewing the Arrow docs link from @WillAyd, I spotted this:

https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor

Tensor is exactly what I was describing in Additional Context [1]. Supporting it would give pandas users a column data type for large blocks of a single underlying element type.

Feature Description

Support Arrow Tensor in Pandas

Python
https://arrow.apache.org/docs/python/generated/pyarrow.Tensor.html#pyarrow.Tensor

Rust
https://github.com/apache/arrow-rs/blob/3715d5447e468a5a4dc631ae9aafec706c57aa20/arrow/src/tensor.rs#L115

Alternative Solutions

Just make everything an "object" column:

>>> import numpy as np
>>> import pandas as pd
>>> x = {'hello': 'world'}
>>> y = np.ones(3)
>>> df = pd.DataFrame({'X': [x], 'Y': [y]})
>>> df
                    X                Y
0  {'hello': 'world'}  [1.0, 1.0, 1.0]
>>> df.dtypes
X    object
Y    object
dtype: object
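To make the cost of this workaround concrete, here is a small sketch (the two-row frame is invented for illustration). With an "object" column, pandas only sees opaque Python objects: the element type and shape of the arrays are not part of the dtype, and getting a single contiguous block back means stacking the cells by hand.

```python
import numpy as np
import pandas as pd

# Two rows, each cell holding its own separate ndarray.
df = pd.DataFrame({"Y": [np.ones(3), np.zeros(3)]})

# pandas records only "object"; float64 and the (3,) shape are invisible to it.
print(df["Y"].dtype)

# Each cell is an independent Python object, not a slice of one buffer.
print(type(df["Y"].iloc[0]))

# Recovering a contiguous block requires a manual copy via np.stack,
# which fails if the per-row shapes happen to differ.
block = np.stack(df["Y"].tolist())
print(block.shape, block.dtype)
```

A tensor dtype would let the column itself carry the element type (and, for fixed shapes, the shape), so this stacking step would not be needed.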

Additional Context

[1] #58455 (comment) onward

Labels

  • Enhancement

  • ExtensionArray: Extending pandas with custom dtypes or arrays.

  • Needs Discussion: Requires discussion from core team before further action