[QST] What should ExtensionDtype.type return? #35291

Closed
@shwina


Greetings, pandas devs! cuDF is building out additional dtypes, such as cudf.CategoricalDtype and cudf.ListDtype, based on pd.api.extensions.ExtensionDtype, and one question came up.

The documentation states:

It’s expected ExtensionArray[item] returns an instance of ExtensionDtype.type for scalar item, assuming that value is valid (not NA). NA values do not need to be instances of type.

However, pd.CategoricalDtype, for instance, does not seem to adhere to this: indexing a categorical array returns a plain str, not an instance of the dtype's .type:

In [47]: import pandas as pd

In [48]: a = pd.Series(['a', 'b'], dtype='category')

In [49]: type(a[0])
Out[49]: str

In [50]: type(a.array[0])
Out[50]: str

In [51]: isinstance(a.array, pd.api.extensions.ExtensionArray)
Out[51]: True

In [52]: isinstance(a.dtype, pd.api.extensions.ExtensionDtype)
Out[52]: True
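The session above shows the scalar's type but never compares it against `dtype.type` itself. A minimal sketch of that comparison follows. The helper name `scalars_match_dtype_type` is hypothetical (it is not part of pandas), and it assumes the array's scalars are not list-like, so that `pd.isna` returns a single bool.

```python
import pandas as pd


def scalars_match_dtype_type(arr):
    """Return True if every non-NA scalar in ``arr`` is an instance of ``arr.dtype.type``.

    Hypothetical helper for illustration only; not a pandas API.
    Assumes scalars are not list-like, so ``pd.isna(value)`` is a plain bool.
    """
    expected = arr.dtype.type
    for i in range(len(arr)):
        value = arr[i]
        if pd.isna(value):
            # The documentation exempts NA values from the contract.
            continue
        if not isinstance(value, expected):
            return False
    return True


cat = pd.Series(["a", "b"], dtype="category").array

print(cat.dtype.type)                 # what CategoricalDtype reports as its scalar type
print(type(cat[0]))                   # what indexing actually returns
print(scalars_match_dtype_type(cat))  # whether the documented expectation holds
```

Running the same helper against a cuDF-style dtype under development would be one way to see which reading of `.type` it currently follows.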

On the other hand, NumPy defines dtype.type somewhat differently:

The type object used to instantiate a scalar of this data-type.
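For comparison, here is a small sketch of the NumPy behaviour that sentence describes, using `int64` as an arbitrary example dtype. In NumPy, `dtype.type` is both the class used to build a scalar and the class of the scalars that indexing returns.

```python
import numpy as np

dt = np.dtype("int64")
scalar = dt.type(3)                # dtype.type is callable and instantiates a scalar
arr = np.array([1, 2], dtype=dt)

print(dt.type is np.int64)                  # True
print(isinstance(scalar, dt.type))          # True
print(isinstance(arr[0], arr.dtype.type))   # True
```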

Would love any insights as to the appropriate return value of .type.
