ENH: Consistent API between `pd.get_dummies()` and `Series.str.get_dummies()`

### Feature Type

- [ ] Adding new functionality to pandas

- [X] Changing existing functionality in pandas

- [ ] Removing existing functionality in pandas


### Problem Description

Compared to `pd.get_dummies()`, `Series.str.get_dummies()` behaves quite differently and offers far fewer options. These inconsistencies make the API harder to learn and use.

### Feature Description

1. The dtype of the return DataFrame of `Series.str.get_dummies()` should be `bool`, not `int64`.

    ```python
    import pandas as pd
    s = pd.Series(list('abca'))
    s.str.get_dummies()
    ```

    before:

    ```text
       a  b  c
    0  1  0  0
    1  0  1  0
    2  0  0  1
    3  1  0  0
    ```

    after (same as `pd.get_dummies(s)`):

    ```text
           a      b      c
    0   True  False  False
    1  False   True  False
    2  False  False   True
    3   True  False  False
    ```

2. `prefix=`, `prefix_sep=`, `dummy_na=`, `sparse=`, and `dtype=` arguments should be added to `Series.str.get_dummies()`.

    ```python
    import numpy as np
    import pandas as pd
    s = pd.Series(['a', 'b', np.nan])
    s.str.get_dummies(prefix="dummy", prefix_sep="=", dummy_na=True, dtype=float)
    ```

    after (same as `pd.get_dummies(s, prefix="dummy", prefix_sep="=", dummy_na=True, dtype=float)`):

    ```text
       dummy=a  dummy=b  dummy=nan
    0      1.0      0.0        0.0
    1      0.0      1.0        0.0
    2      0.0      0.0        1.0
    ```

    Note: Among the arguments of `pd.get_dummies()`, the `columns=` argument is not needed for `Series.str.get_dummies()`, since it always operates on a single Series. Whether `Series.str.get_dummies()` needs a `drop_first=` argument is debatable, since `Series.str.get_dummies()` can yield `True` in multiple columns of the same row, unlike `pd.get_dummies()`.
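
For illustration, the requested behavior can be approximated today by post-processing the result of `Series.str.get_dummies()`. This is only a rough sketch: `str_get_dummies_compat` is a hypothetical helper name, not part of pandas, and its defaults (`dtype=bool`, `prefix_sep="_"`) are assumptions chosen to mirror `pd.get_dummies()`. `sparse=` is not covered.

```python
import numpy as np
import pandas as pd


def str_get_dummies_compat(s, sep="|", prefix=None, prefix_sep="_",
                           dummy_na=False, dtype=bool):
    """Hypothetical helper (not a pandas API) emulating the proposed arguments."""
    # Existing behavior: one indicator column per token, missing rows are all zero.
    dummies = s.str.get_dummies(sep=sep).astype(dtype)
    if dummy_na:
        # Assumption: the NaN indicator is labeled "nan" and placed last.
        # This would overwrite a literal "nan" category if one existed.
        dummies["nan"] = s.isna().astype(dtype)
    if prefix is not None:
        dummies = dummies.add_prefix(f"{prefix}{prefix_sep}")
    return dummies


s = pd.Series(["a", "b", np.nan])
out = str_get_dummies_compat(s, prefix="dummy", prefix_sep="=",
                             dummy_na=True, dtype=float)
expected = pd.get_dummies(s, prefix="dummy", prefix_sep="=",
                          dummy_na=True, dtype=float)

# Compare labels and values rather than using DataFrame.equals,
# so that differences in column Index dtype do not matter.
same = (list(out.columns) == list(expected.columns)
        and bool((out.to_numpy() == expected.to_numpy()).all()))
print(out)
print("matches pd.get_dummies:", same)

# The drop_first caveat: with a separator, one row can be True in several columns.
multi = str_get_dummies_compat(pd.Series(["a|b", "b", "c"]))
print(multi.sum(axis=1).tolist())
```

The last line shows why `drop_first=` is less clear-cut here: the first row has two active columns, so dropping one column no longer removes a redundant degree of freedom the way it does for one-hot output.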


### Alternative Solutions

While there are many ways to obtain DataFrames with the same result, none of them would bring consistency to the two methods. The only real alternative might be to simply deprecate `Series.str.get_dummies()`.
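
As one example of such a workaround (a sketch, not a recommendation from the pandas maintainers), the multi-label output of `Series.str.get_dummies()` can be reproduced with `pd.get_dummies()` by splitting and exploding first. The variable names below are illustrative only.

```python
import pandas as pd

s = pd.Series(["a|b", "b", "c"])

# Split each string into tokens, then give every token its own row.
# explode() repeats the original index label for each token.
exploded = s.str.split("|").explode()

# One-hot encode the tokens, then collapse back to one row per original label.
alt = pd.get_dummies(exploded, dtype=bool).groupby(level=0).any()

# Reference result from the string accessor, cast to bool for comparison.
ref = s.str.get_dummies(sep="|").astype(bool)

print(alt)
print((alt.to_numpy() == ref.to_numpy()).all())
```

This reproduces the values, but it does nothing to align the two signatures, which is the point of the request.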

### Additional Context

_No response_
