Refactor StringMethods for extension arrays #36216

Closed
@TomAugspurger

Description

This is an issue to track a refactor of StringMethods that'll be necessary for ArrowStringArray to use pyarrow compute algorithms.

We'll need to update StringMethods to:

  1. Extract the array from the Series / Index
  2. Dispatch the string methods
  3. Wrap the result

So in my proposal, we'll have something like

class StringMethods:
    def __init__(self, data, ...):
        self._array = data.array
        ...

    def upper(self):
        return self._wrap_result(self._array._str.upper())
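
To make the three steps concrete, here is a minimal, self-contained sketch of the extract / dispatch / wrap pattern. It does not use pandas: `ToyStringArray`, `ToySeries`, and `ToyStringMethods` are hypothetical stand-ins, and the array-level method is named `_str_upper` purely for illustration (the snippet above spells it `_array._str.upper()`). An Arrow-backed array could presumably provide its own `_str_upper` built on pyarrow compute, without the accessor changing.

```python
class ToyStringArray:
    """Hypothetical stand-in for a string extension array."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        return iter(self._values)

    def _str_upper(self):
        # Array-specific implementation. A different array type could
        # override this with a faster backend-specific version.
        return ToyStringArray(
            None if v is None else v.upper() for v in self._values
        )


class ToySeries:
    """Hypothetical stand-in for Series / Index."""

    def __init__(self, values, name=None):
        if isinstance(values, ToyStringArray):
            self.array = values
        else:
            self.array = ToyStringArray(values)
        self.name = name

    @property
    def str(self):
        return ToyStringMethods(self)

    def tolist(self):
        return list(self.array)


class ToyStringMethods:
    def __init__(self, data):
        self._orig = data
        # Step 1: extract the array from the Series / Index.
        self._array = data.array

    def _wrap_result(self, result):
        # Step 3: wrap the array result in the original container type.
        return type(self._orig)(result, name=self._orig.name)

    def upper(self):
        # Step 2: dispatch to the array's own implementation.
        return self._wrap_result(self._array._str_upper())
```

With these toy classes, `ToySeries(["a", None, "Bc"], name="x").str.upper().tolist()` returns `["A", None, "BC"]`, and the result keeps the name `"x"`. The accessor itself contains no string logic; it only extracts, delegates, and wraps.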

cc @xhochy. I have a branch started and will hopefully finish it off this week.

Labels: ExtensionArray (Extending pandas with custom dtypes or arrays), Refactor (Internal refactoring of code), Strings (String extension data type and string data)
