Description
Code Sample, a copy-pastable example
import pandas as pd
from pandas import StringDtype
from pandas.core.arrays import StringArray
from pandas.core.dtypes.dtypes import register_extension_dtype


@register_extension_dtype
class MyExtensionDtype(StringDtype):
    name = 'my_extension'

    def __repr__(self) -> str:
        return "MyExtensionDtype"

    @classmethod
    def construct_array_type(cls) -> "Type[MyExtensionStringArray]":
        return MyExtensionStringArray


class MyExtensionStringArray(StringArray):
    def __init__(self, values, copy=False):
        super().__init__(values, copy)
        self._dtype = MyExtensionDtype()


series = pd.Series(["test", "test2"], dtype="my_extension")
assert series.dtype == 'my_extension'
Results in
    assert dtype == "string"
AssertionError
Problem description
It should be possible to extend StringDtype/StringArray so that users can design efficient subtypes. I believe the AssertionError is a bug rather than intended behavior: pandas aims to support extensible dtypes, which is why ExtensionDtype exists.
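To illustrate the suspected failure mode, here is a pandas-free toy sketch. It assumes the failing check compares the dtype against the literal "string", as the traceback suggests. `BaseDtype`, `SubDtype`, `check_by_name` and `check_by_type` are hypothetical names made up for this example, not pandas APIs, and the type-based check is only one possible relaxation, not necessarily what the fix will do.

```python
# Toy model only: these classes are hypothetical stand-ins, not pandas code.
class BaseDtype:
    name = "string"

    def __eq__(self, other):
        # Mimic comparing a dtype against its string alias.
        if isinstance(other, str):
            return other == self.name
        return type(other) is type(self)

    def __hash__(self):
        return hash(self.name)


class SubDtype(BaseDtype):
    # A subtype that registers under its own name.
    name = "my_extension"


def check_by_name(dtype):
    # Assumed shape of the failing check: only the literal name "string" passes.
    assert dtype == "string"


def check_by_type(dtype):
    # One possible relaxation: accept any subclass of the base dtype.
    assert isinstance(dtype, BaseDtype)


check_by_name(BaseDtype())  # the base dtype passes the name check
check_by_type(SubDtype())   # the subtype passes the type-based check

try:
    check_by_name(SubDtype())
    name_check_rejects_subclass = False
except AssertionError:
    # The subtype has a different name, so the name-based check rejects it.
    name_check_rejects_subclass = True
```

In this toy model the subtype is rejected purely because its name differs from "string", even though it inherits all of the base behavior.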
Expected Output
The code above should pass without errors.
A PR with a fix is on its way.
Output of pd.show_versions()
pandas v1.0.3