On pandas 1.5:
In [2]: pd.Index(['a', 'b']).astype("S3")
Out[2]: Index([b'a', b'b'], dtype='object')
On the main branch:
In [2]: pd.Index(['a', 'b']).astype("S3")
...
File ~/scipy/pandas/pandas/core/indexes/base.py:589, in Index._dtype_to_subclass(cls, dtype)
584 elif issubclass(
585 dtype.type, (str, bool, np.bool_, complex, np.complex64, np.complex128)
586 ):
587 return Index
--> 589 raise NotImplementedError(dtype)
NotImplementedError: |S3
This started failing a while ago on pyarrow's CI (https://issues.apache.org/jira/browse/ARROW-18394). It comes up when you round-trip a pandas DataFrame with bytes column names to Arrow and back to pandas.
I haven't yet investigated which change caused this, or whether it was intentional.