API: Index.str follow-ups (extract/get_dummies) #9980

Closed
@sinhrks

Follow-up to #9667. I noticed two .str methods that can return a DataFrame.

1. Index.str.extract

As shown in the docstring, it returns a DataFrame when the pattern has two or more capture groups.

import pandas as pd

# Raw strings keep \d from being read as an invalid string escape.
pd.Series(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)')
#0      1
#1      2
#2    NaN
# dtype: object

pd.Series(['a1', 'b2', 'c3']).str.extract(r'([ab])(\d)')
#      0    1
#0    a    1
#1    b    2
#2  NaN  NaN

Currently, Index.str.extract raises an unhelpful AttributeError in both cases. I think the first case (one group) should return an Index, and the second case (two or more groups) should raise an understandable error.

pd.Index(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)')
# AttributeError: 'Index' object has no attribute 'index'

pd.Index(['a1', 'b2', 'c3']).str.extract(r'([ab])(\d)')
# AttributeError: 'Index' object has no attribute 'empty'
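
To make the proposal concrete, here is a rough standard-library sketch of the behavior I have in mind. It is not pandas code: `extract_for_index` is a hypothetical helper, the error messages are placeholders, and a plain list stands in for the returned Index.

```python
import re


def extract_for_index(values, pat):
    """Hypothetical helper (not pandas API) mimicking the proposed Index behavior."""
    regex = re.compile(pat)
    if regex.groups == 0:
        raise ValueError("pattern contains no capture groups")
    if regex.groups > 1:
        # Two or more groups would need one column per group (a DataFrame),
        # which an Index cannot represent, so fail with a clear message.
        raise ValueError("only one capture group is supported for Index")
    result = []
    for value in values:
        match = regex.search(value)
        # Non-matching entries become missing values (None here, NaN in pandas).
        result.append(match.group(1) if match else None)
    return result


print(extract_for_index(['a1', 'b2', 'c3'], r'[ab](\d)'))
# ['1', '2', None]
```

Checking `regex.groups` up front means the multi-group case fails before any matching is done, instead of surfacing later as an unrelated AttributeError.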

2. Index.str.get_dummies

Because it returns a DataFrame, calling it on an Index should raise an understandable error.

pd.Index(['a1', 'b2', 'c3']).str.get_dummies()
# AttributeError: 'Index' object has no attribute 'fillna'
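
For context on why the result cannot be an Index, here is a small standard-library approximation of what get_dummies computes. `get_dummies_rows` is a hypothetical name, and I am assuming the default `'|'` separator; the real implementation differs in detail.

```python
def get_dummies_rows(values, sep='|'):
    """Hypothetical helper: return (column labels, 0/1 rows), roughly like str.get_dummies."""
    # One column per distinct non-empty tag found across all values.
    tags = sorted({tag for value in values for tag in value.split(sep) if tag})
    # One row per input value, with 1 where the value contains that tag.
    rows = [[int(tag in value.split(sep)) for tag in tags] for value in values]
    return tags, rows


columns, rows = get_dummies_rows(['a1', 'b2', 'c3'])
print(columns)  # ['a1', 'b2', 'c3']
print(rows)     # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

The output always has one column per distinct tag, so it is two-dimensional even for this simple input.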

CC: @mortada

Labels: Strings