Description
Running some asv benchmarks, I'm seeing changes in benchmarks that should be unaffected, yet those changes are consistent across runs.
Hunting this down, I'm replicating the weirdest benchmark (asv reports 1.27±0ms vs 789±3μs) in ipython:
import numpy as np
import pandas as pd

arr = np.ones(10**6, dtype="int8")
ser = pd.Series(arr)
func = ser.sum
%timeit func()
The first few runs matched the asv results really closely:
In [4]: %timeit func()
781 µs ± 552 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- branch
1.27 ms ± 1.82 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- master
But running it again just for kicks, the results flipped:
In [5]: %timeit func()
1.27 ms ± 5.05 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- branch
780 µs ± 4.81 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- master
In [5]: %timeit func()
1.26 ms ± 691 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- branch
1.26 ms ± 578 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- master
In [10]: %timeit func()
1.26 ms ± 818 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- branch
773 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each) # <-- master
Eventually the ~780µs number stopped showing up and got replaced by a mode between 895-907µs, but the 1.26-1.27ms mode hasn't moved a bit.
There's no obvious pattern as to which mode will show up in any given run (though they are positively auto-correlated). But in the asv results there must be something consistently splitting the modes between the two branches.
I get the same behavior even if I use the same branch, just different terminal windows.
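As a side note on methodology: to see the two modes outside of %timeit, here is a minimal sketch using the stdlib timeit module to collect many independent samples (my own illustration, reusing the setup above; not part of the original runs):

import timeit

import numpy as np
import pandas as pd

arr = np.ones(10**6, dtype="int8")
ser = pd.Series(arr)
func = ser.sum

# 50 independent samples of 100 calls each; with two timing modes the
# sorted per-call times split into two clusters instead of one tight band.
samples = timeit.repeat(func, number=100, repeat=50)
per_call_ms = sorted(t / 100 * 1e3 for t in samples)
print(per_call_ms[:5], per_call_ms[-5:])  # fastest vs slowest clusters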
Looking at this with %prun -s cumtime for i in range(1000): func()
points at {method 'reduce' of 'numpy.ufunc' objects}. First the fast (~780µs) mode:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.796 0.796 {built-in method builtins.exec}
1 0.001 0.001 0.796 0.796 <string>:1(<module>)
1000 0.001 0.000 0.795 0.001 generic.py:10747(sum)
1000 0.001 0.000 0.794 0.001 generic.py:10489(sum)
1000 0.002 0.000 0.793 0.001 generic.py:10454(_min_count_stat_function)
1000 0.004 0.000 0.791 0.001 series.py:4133(_reduce)
1000 0.004 0.000 0.778 0.001 nanops.py:82(_f)
1000 0.001 0.000 0.763 0.001 nanops.py:392(new_func)
1000 0.003 0.000 0.761 0.001 nanops.py:528(nansum)
1000 0.001 0.000 0.738 0.001 {method 'sum' of 'numpy.ndarray' objects}
1000 0.000 0.000 0.738 0.001 _methods.py:45(_sum)
1000 0.737 0.001 0.737 0.001 {method 'reduce' of 'numpy.ufunc' objects} # <--
1000 0.003 0.000 0.017 0.000 nanops.py:256(_get_values)
4000 0.005 0.000 0.010 0.000 _ufunc_config.py:32(seterr)
And the slow (~1.27ms) mode:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.292 1.292 {built-in method builtins.exec}
1 0.001 0.001 1.292 1.292 <string>:1(<module>)
1000 0.001 0.000 1.292 0.001 generic.py:10747(sum)
1000 0.001 0.000 1.291 0.001 generic.py:10489(sum)
1000 0.002 0.000 1.290 0.001 generic.py:10454(_min_count_stat_function)
1000 0.004 0.000 1.288 0.001 series.py:4133(_reduce)
1000 0.004 0.000 1.275 0.001 nanops.py:82(_f)
1000 0.001 0.000 1.260 0.001 nanops.py:392(new_func)
1000 0.002 0.000 1.258 0.001 nanops.py:528(nansum)
1000 0.001 0.000 1.235 0.001 {method 'sum' of 'numpy.ndarray' objects}
1000 0.000 0.000 1.234 0.001 _methods.py:45(_sum)
1000 1.234 0.001 1.234 0.001 {method 'reduce' of 'numpy.ufunc' objects} # <--
1000 0.003 0.000 0.018 0.000 nanops.py:256(_get_values)
4000 0.005 0.000 0.010 0.000 _ufunc_config.py:32(seterr)
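For what it's worth, the same profile can be reproduced outside IPython with cProfile; a sketch, assuming arr/ser/func are defined at module level as in the setup above:

import cProfile

# Equivalent of %prun -s cumtime: profile 1000 calls and sort by
# cumulative time. The statement string is executed in __main__'s
# namespace, so func must be defined there.
cProfile.run("for i in range(1000): func()", sort="cumtime")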
The total difference is 1.292s - 0.796s = 0.496s and the difference in the reduce
ufunc is 1.234s - 0.737s = 0.497s, which is pretty suggestive: essentially the entire slowdown is inside numpy's ufunc reduce, not in the pandas layers above it.
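Since the profiles put the whole difference in the ufunc reduce, a natural follow-up is to time the raw numpy call with pandas out of the loop. My reading of nanops.py is that nansum upcasts the integer accumulator to int64, so the closest raw equivalent would be (an assumption on my part, not verified in the runs above):

import numpy as np

arr = np.ones(10**6, dtype="int8")
# If this call alone shows the same ~780µs/1.27ms modes, the bimodality
# lives entirely in numpy's ufunc reduce, matching the profiles above.
%timeit arr.sum(dtype="int64")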