PERF: Series + range #51502

Closed
@jbrockmendel

In [5]: ser = pd.Series(0, index=range(10_000_000))

In [6]: %timeit ser + range(len(ser))
737 ms ± 15.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit ser + np.arange(len(ser))
34.5 ms ± 526 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Somewhere along the line we do a sub-optimal conversion of the range to an ndarray. We can improve perf by using np.arange(obj.start, obj.stop, obj.step) instead of np.asarray(obj).
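A minimal sketch of the conversion idea, assuming the fix amounts to special-casing built-in range objects before falling back to np.asarray. The helper name range_to_ndarray is hypothetical and is not taken from the pandas codebase. Note that np.arange takes start/stop/step arguments rather than a range object, so the range's attributes are passed explicitly.

```python
import numpy as np


def range_to_ndarray(obj):
    """Convert obj to an ndarray, special-casing built-in range objects.

    Hypothetical helper for illustration only; not the actual pandas patch.
    """
    if isinstance(obj, range):
        # Build the array from the range's start/stop/step so NumPy can
        # fill it directly instead of consuming the range as a generic
        # Python sequence.
        return np.arange(obj.start, obj.stop, obj.step)
    # Anything else goes through the generic conversion path.
    return np.asarray(obj)


r = range(0, 10, 3)
# Both conversions should produce the same values; only the cost differs.
assert np.array_equal(range_to_ndarray(r), np.asarray(r))
```

The fallback to np.asarray keeps behavior unchanged for non-range inputs, so only the range case takes the faster path.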

Discovered while tracking down slow tests. In particular, test_resample_equivalent_offsets takes a little more than 1% of the test suite runtime, and most of that time is spent adding a range to a Series.

Labels: Needs Triage (issue that has not been reviewed by a pandas team member), Performance (memory or execution speed performance)
