Description
Currently MIN_ELEMENTS is set to 10,000:
pandas/pandas/core/computation/expressions.py
Lines 42 to 43 in 00a6224
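For context, that constant acts as a simple size gate before pandas dispatches an operation to numexpr. A minimal sketch of that kind of check (the names and logic here are illustrative, not the actual pandas internals):

```python
# Hedged sketch of the kind of size gate MIN_ELEMENTS implements;
# names are illustrative, not the actual pandas implementation.
MIN_ELEMENTS = 10_000  # the current default under discussion

def use_numexpr(a, b):
    # Dispatch to numexpr only when at least one operand is large enough
    # that numexpr's setup/threading overhead can plausibly pay off.
    return max(getattr(a, "size", 0), getattr(b, "size", 0)) >= MIN_ELEMENTS
```

Raising the threshold simply means more (mid-sized) operations take the plain numpy path.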
However, while running many performance comparisons recently, I have noticed that numexpr still shows some overhead at that array size compared to numpy.
I did a few specific timings for a few ops comparing numpy and numexpr for a set of different array sizes:
Code used to create the plot
import operator
import numpy as np
import pandas as pd
import numexpr as ne
import seaborn as sns
results = []
for s in [10**3, 10**4, 10**5, 10**6, 10**7, 10**8]:
    arr1 = np.random.randn(s)
    arr2 = np.random.randn(s)
    for op_str, op in [("+", operator.add), ("*", operator.mul), ("==", operator.eq), ("<=", operator.le)]:
        res_ne = %timeit -o ne.evaluate(f"a {op_str} b", local_dict={"a": arr1, "b": arr2}, casting="safe")
        res_np = %timeit -o op(arr1, arr2)
        results.append({"size": s, "op": op_str, "engine": "numexpr", "timing": res_ne.average, "timing_stdev": res_ne.stdev})
        results.append({"size": s, "op": op_str, "engine": "numpy", "timing": res_np.average, "timing_stdev": res_np.stdev})

df = pd.DataFrame(results)
fig = sns.relplot(data=df, x="size", y="timing", hue="engine", col="op", kind="line", col_wrap=2)
fig.set(xscale='log', yscale='log')
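(The snippet above uses the IPython `%timeit -o` magic, so it needs to run in IPython or a notebook. Outside of that, a rough stdlib equivalent can be sketched with the `timeit` module; the `number`/`repeat` values below are arbitrary choices, not what `%timeit` auto-selects.)

```python
import timeit

def best_time(func, number=10, repeat=3):
    # Best average time per call over several repeats, following the
    # timeit docs' advice to take the minimum of the repeats.
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number
```

For example, `best_time(lambda: op(arr1, arr2))` would stand in for the `%timeit -o op(arr1, arr2)` line.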
So in general, numexpr is not that much faster even for the large arrays. More importantly, it still has significant overhead compared to numpy up to around 1e5 to 1e6 elements, while the current minimum number of elements is 1e4.
Further, this depends on the specific hardware and library versions (this was run on my Linux laptop with 8 cores, using the latest versions of numpy and numexpr), so it is always hard to pick a default suitable for everyone.
But based on the analysis above, I would propose raising the minimum from 1e4 to 1e5 (or maybe even 1e6).
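One way to make that choice concrete: from measurements like the ones above, take the smallest array size at which numexpr is at least as fast as numpy, and round the threshold up to it. A small hedged helper (the timing numbers in the test are made up for illustration, not real measurements):

```python
def crossover_size(timings):
    """Smallest size at which numexpr is at least as fast as numpy.

    timings: dict mapping array size -> (numexpr_seconds, numpy_seconds),
    e.g. built from the `results` list collected above.
    Returns None if numexpr never wins in the measured range.
    """
    for size in sorted(timings):
        t_ne, t_np = timings[size]
        if t_ne <= t_np:
            return size
    return None
```

On hardware where the crossover lands between 1e5 and 1e6, this supports picking 1e5 (or 1e6) as the new minimum.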