PERF: regression in DataFrame construction from nested dict #42248

Closed
@jorisvandenbossche

#41785 removed some cython code that handled this case (because it could also be handled by existing python code, AFAIU), but this de-duplication caused a big slowdown in one of the benchmarks: https://pandas.pydata.org/speed/pandas/#frame_ctor.FromDicts.time_nested_dict_int64?python=3.8&Cython=0.29.21

It's of course a trade-off between the maintenance cost of having the cython version versus the performance benefit. But if the benchmark is representative, a 5-6x slowdown seems quite a lot for getting rid of a relatively small piece of cython code.
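
For anyone who wants to check this locally without running the full asv suite, here is a rough sketch of the kind of construction that benchmark exercises: a dict of dicts with integer keys passed straight to `DataFrame`. The helper names (`make_nested_dict`, `time_nested_dict_ctor`) and the sizes are my own assumptions for illustration, not the actual asv benchmark code, so absolute numbers will not match the linked graph.

```python
import timeit

import pandas as pd


def make_nested_dict(n_cols=500, n_rows=100):
    """Build {column: {row: value}} with integer keys on both levels.

    The sizes are illustrative assumptions, not the asv benchmark's setup.
    """
    return {i: {j: float(j) for j in range(n_rows)} for i in range(n_cols)}


def time_nested_dict_ctor(data, repeat=3, number=1):
    """Return the best per-call time (seconds) for pd.DataFrame(data)."""
    timings = timeit.repeat(lambda: pd.DataFrame(data), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    data = make_nested_dict()
    best = time_nested_dict_ctor(data)
    print(f"pandas {pd.__version__}: {best * 1e3:.2f} ms per DataFrame(nested_dict)")
```

Running the same script in two environments (one before and one after #41785) should give a rough idea of whether the slowdown reproduces on your machine.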

Originally posted by @jorisvandenbossche in #41785 (comment)

Labels: Constructors (Series/DataFrame/Index/pd.array Constructors), Performance (Memory or execution speed performance), Regression (Functionality that used to work in a prior pandas version)
