PERF: concat perf #23362

Closed
@TomAugspurger

Description

For Series[period], pd.concat is about 6x slower than PeriodArray._concat_same_type. There's always going to be some overhead, but I wonder how much we can narrow this.

In [1]: import numpy as np
   ...: import pandas as pd
   ...:
   ...: a = np.random.randint(2000, 2100, size=1000)
   ...: b = np.random.randint(2000, 2100, size=1000)
   ...:
   ...: x = pd.core.arrays.period_array(a, freq='B')
   ...: y = pd.core.arrays.period_array(b, freq='B')
   ...:
   ...: s = pd.Series(x)
   ...: t = pd.Series(y)


In [2]: %timeit pd.concat([s, t], ignore_index=True)
523 µs ± 22.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [3]: %timeit x._concat_same_type([x, y])
90.1 µs ± 948 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
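
The comparison above can also be run outside IPython. The following is a minimal sketch, not the original benchmark: it assumes daily periods built with the public `pd.period_range` API in place of the issue's random business-day data, and the helper names `concat_series` and `concat_arrays` are made up for illustration. Absolute timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Assumption: daily periods from the public period_range API stand in for
# the issue's random business-day PeriodArrays.
x = pd.period_range("2000-01-01", periods=1000, freq="D").array
y = pd.period_range("2010-01-01", periods=1000, freq="D").array

s = pd.Series(x)
t = pd.Series(y)


def concat_series():
    # High-level path: goes through pd.concat and rebuilds the index.
    return pd.concat([s, t], ignore_index=True)


def concat_arrays():
    # Low-level path: concatenates the extension arrays directly.
    return type(x)._concat_same_type([x, y])


n = 200  # hypothetical repeat count, chosen only to keep the run short
series_time = timeit.timeit(concat_series, number=n) / n
array_time = timeit.timeit(concat_arrays, number=n) / n

print(f"pd.concat:          {series_time * 1e6:.1f} µs per call")
print(f"_concat_same_type:  {array_time * 1e6:.1f} µs per call")
print(f"ratio:              {series_time / array_time:.1f}x")
```

Both paths should produce the same 2000 periods, so the ratio isolates the overhead that `pd.concat` adds on top of the array-level concatenation.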

Labels: Closing Candidate (may be closeable, needs more eyeballs), Performance (memory or execution speed performance), Period (Period data type), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
