Performance regression in timeseries.SortIndex.time_sort_index #33917

Closed
@TomAugspurger

import pandas as pd
import numpy as np

N = 10 ** 5
idx = pd.date_range(start="1/1/2000", periods=N, freq="s")
s = pd.Series(np.random.randn(N), index=idx)
%timeit s.sort_index()
# 1.0.2
108 µs ± 8.27 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# master
225 µs ± 8.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

According to https://pandas.pydata.org/speed/pandas/index.html#timeseries.SortIndex.time_sort_index?p-monotonic=True&commits=f683473a156f032a64a1d7edcebde21c42a8702d-085860a49f3a87aa4e24b3115b50b85c4b3c5676, the first slow commit is #33755, which only bumps Cython in numpydev, so that commit is probably not the actual cause.
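
The timings above rely on IPython's %timeit, which is awkward to run repeatedly across commits. As a rough sketch only, the same measurement can be written as a plain script with the standard-library timeit module, so it can be run unchanged under each pandas build being compared. The helper name best_time_us and its number/repeat defaults are illustrative assumptions and are not part of pandas or of the asv benchmark suite.

```python
import timeit

import numpy as np
import pandas as pd

# Same setup as in the report: a Series on an already-sorted DatetimeIndex.
N = 10 ** 5
idx = pd.date_range(start="1/1/2000", periods=N, freq="s")
s = pd.Series(np.random.randn(N), index=idx)


def best_time_us(func, number=200, repeat=5):
    """Hypothetical helper: best per-call time in microseconds.

    Runs `func` `number` times per round for `repeat` rounds and
    reports the fastest round, which is less sensitive to noise
    than the mean.
    """
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number * 1e6


elapsed = best_time_us(s.sort_index)
print(f"pandas {pd.__version__}: {elapsed:.1f} µs per call")
```

Running the script once per installed pandas version (or once per commit during a bisect) gives numbers that can be compared directly, although absolute values will differ from the %timeit means quoted above because this reports the minimum rather than the mean.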

Labels: Datetime (Datetime data dtype), Performance (Memory or execution speed performance), Regression (Functionality that used to work in a prior pandas version)
