PERF: regression in getattr for IntervalIndex #30742

Closed
@jorisvandenbossche

On master:

In [14]: idx = pd.interval_range(0, 1000, 1000) 

In [15]: %timeit getattr(idx, '_ndarray_values', idx)
1.29 ms ± 30.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [16]: %timeit idx.closed
321 ns ± 2.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

while on 0.25.3:

In [13]: idx = pd.interval_range(0, 1000, 1000) 

In [14]: %timeit getattr(idx, '_ndarray_values', idx)  
90.5 ns ± 2.09 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [15]: %timeit idx.closed 
105 ns ± 1.61 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

(I only checked a few attributes; I didn't check whether the slowdown is specific to those attributes or affects getattr in general.)
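To check whether the slowdown is tied to specific attributes or to attribute lookup in general, the same measurement can be repeated over a list of names. The following is only a rough sketch: `time_attribute_access` and the `Example` stand-in class are hypothetical names introduced here for illustration, not part of pandas, and the stand-in should be replaced with the actual index under test.

```python
import timeit


def time_attribute_access(obj, names, number=10_000, repeat=5):
    """Return {name: best seconds per lookup} for getattr(obj, name, None).

    Hypothetical helper for illustration; not a pandas API.
    """
    results = {}
    for name in names:
        timer = timeit.Timer(lambda: getattr(obj, name, None))
        # Take the minimum over several repeats to reduce timing noise.
        best = min(timer.repeat(repeat=repeat, number=number))
        results[name] = best / number
    return results


class Example:
    """Hypothetical stand-in; replace with the IntervalIndex under test."""

    closed = "right"


timings = time_attribute_access(Example(), ["closed", "_ndarray_values"])
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e9:.0f} ns per lookup")
```

Passing `pd.interval_range(0, 1000, 1000)` in place of `Example()`, with a longer list of attribute names, and running it on both versions should show whether every lookup regressed or only some of them.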

I think this is the cause, or one of the causes, of several regressions currently visible at https://pandas.pydata.org/speed/pandas/ (e.g. https://pandas.pydata.org/speed/pandas/#reshape.Cut.time_cut_timedelta?p-bins=1000&commits=6efc2379-b9de33e3)

Labels: Interval (Interval data type), Performance (Memory or execution speed performance)
