PERF: make Categorical.searchsorted not require ordered=True #21667

Closed
@topper-123

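searchsorted requires that the searched object is (monotonically) sorted to produce correct results. Orderedness is neither a necessary nor a sufficient condition for searchsorted to work correctly: an ordered Categorical can hold unsorted values, and an unordered Categorical can hold sorted ones.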
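
To illustrate the distinction, here is a minimal sketch that models a Categorical as a list of categories plus integer codes and uses the standard library's `bisect` in place of searchsorted. The names `is_sorted`, `categories`, and the two code lists are made up for this example; nothing here is pandas' actual implementation.

```python
from bisect import bisect_left


def is_sorted(codes):
    """Return True when the codes are monotonically non-decreasing."""
    return all(a <= b for a, b in zip(codes, codes[1:]))


# Hypothetical stand-in for a Categorical: categories plus integer codes.
categories = ["low", "mid", "high"]

# An "ordered" categorical (low < mid < high is declared) whose values
# are nevertheless stored unsorted: high, mid, low, low.
ordered_unsorted_codes = [2, 1, 0, 0]

# An unordered categorical whose values happen to be sorted by code:
# low, low, mid, high.
unordered_sorted_codes = [0, 0, 1, 2]

mid_code = categories.index("mid")

# Binary search is only meaningful on the sorted codes.
good = bisect_left(unordered_sorted_codes, mid_code)  # 2: first "mid"
bad = bisect_left(ordered_unsorted_codes, mid_code)   # 4: "mid" is really at 1
```

The declared order says nothing about how the values are laid out, so only the second case gives a usable insertion point.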

Categorical.searchsorted has a hard check for whether the Categorical is ordered:

    def searchsorted(self, value, side='left', sorter=None):
        if not self.ordered:
            raise ValueError("Categorical not ordered\nyou can use "
                             ".as_ordered() to change the Categorical to an "
                             "ordered one")
        from pandas.core.series import Series

This is too strict, as unordered but sorted Categoricals could also benefit from using searchsorted.

I propose removing this check and (as for non-categoricals) leaving it to the user to ensure that the Categorical is sorted correctly. This would allow very quick lookups in all sorted Categoricals, whether they're ordered or not.
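
As a rough sketch of what the proposal amounts to, the helper below maps a value to its category code and binary-searches the codes without any orderedness check. `categorical_searchsorted` is a hypothetical function written with the standard library's `bisect`, and the list-based categories and codes are assumptions for illustration; this is not the pandas code path.

```python
from bisect import bisect_left, bisect_right


def categorical_searchsorted(categories, codes, value, side="left"):
    """Hypothetical helper: binary-search `codes` for the code of `value`.

    No orderedness check is made. As with any searchsorted-style lookup,
    the caller is responsible for making sure `codes` is sorted.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    try:
        code = categories.index(value)
    except ValueError:
        raise KeyError(f"{value!r} is not one of the categories") from None
    search = bisect_left if side == "left" else bisect_right
    return search(codes, code)


# An unordered but sorted categorical: apple, apple, bread, bread, bread, milk.
categories = ["apple", "bread", "milk"]
codes = [0, 0, 1, 1, 1, 2]

left = categorical_searchsorted(categories, codes, "bread")
right = categorical_searchsorted(categories, codes, "bread", side="right")
# codes[left:right] is the run of "bread" entries.
```

The lookup costs O(log n) comparisons on the integer codes, which is where the performance benefit for sorted Categoricals would come from.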

Labels: Categorical (Categorical Data Type), Performance (Memory or execution speed performance)
