ENH/PERF: dispatch is_monotonic_increasing / decreasing ? #56619

Open
@lukemanley

Is it worth dispatching is_monotonic_increasing / is_monotonic_decreasing for EAs?

The Cython implementation stops early, but that benefit disappears if the data must first be copied into an object array, as in the example below:

import pandas as pd
import pyarrow.compute as pc  # needed for the pc.* calls below

values = [f"val_{i:07}" for i in range(1_000_000)]
ser = pd.Series(values, dtype="string[pyarrow_numpy]")

%timeit ser.is_monotonic_increasing
# 219 ms ± 20.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit pc.all(pc.greater_equal(ser.array._pa_array[1:], ser.array._pa_array[:-1]))
# 19.2 ms ± 585 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Random-order example, where the Cython early stopping applies:

ser2 = ser.sample(frac=1.0)

%timeit ser2.is_monotonic_increasing
# 152 ms ± 4.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit pc.all(pc.greater_equal(ser2.array._pa_array[1:], ser2.array._pa_array[:-1]))
# 15 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Labels

    ExtensionArray: Extending pandas with custom dtypes or arrays.
    Needs Discussion: Requires discussion from core team before further action.
    Performance: Memory or execution speed performance.
