Description
A user reported this to Criterion.rs - bheisler/criterion.rs#133
The short version is this - when the user's test crate's benchmarks are run on 1.24.0 or newer, Criterion.rs calculates some values incorrectly even though the code looks correct to me. They claim to have reproduced this behavior on Arch Linux, Windows and Raspbian. I have only been able to confirm it on Windows. Each of us has confirmed this behavior on multiple machines.
I was initially reluctant to call this a miscompilation, but when I started investigating it, any change I made caused the bug to stop happening - inserting `println!` calls, or even pushing values into a vector and printing them later, caused the bug to disappear. Eventually I tried simply adding a call to `test::black_box`, which should have no observable effect on the code except to inhibit certain compiler optimizations. That also caused the bug to stop occurring. It may still be a bug in my code, but if so I can't find it.
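To illustrate why a `black_box` call should be behavior-neutral, here is a minimal sketch of a warm-up loop that passes each result through it. This is not Criterion.rs's actual `warm_up` code: the `fib` and `warm_up` functions below are hypothetical stand-ins, and the sketch uses the stable `std::hint::black_box` (available since Rust 1.66) instead of the nightly-only `test::black_box` mentioned above.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical stand-in for the benchmarked routine (not the code from the report).
fn fib(n: u64) -> u64 {
    let (mut a, mut b) = (0u64, 1u64);
    for _ in 0..n {
        let next = a.wrapping_add(b);
        a = b;
        b = next;
    }
    a
}

// Hypothetical warm-up loop: doubles the batch size until `budget` has elapsed,
// then returns (total elapsed time, total iterations executed).
fn warm_up<F: FnMut() -> u64>(mut routine: F, budget: Duration) -> (Duration, u64) {
    let start = Instant::now();
    let mut batch: u64 = 1;
    let mut total: u64 = 0;
    loop {
        for _ in 0..batch {
            // black_box is an identity function; it only asks the optimizer
            // to treat the value as used, so the call is not optimized away.
            black_box(routine());
        }
        total += batch;
        let elapsed = start.elapsed();
        if elapsed >= budget {
            return (elapsed, total);
        }
        batch = batch.saturating_mul(2);
    }
}

fn main() {
    let (elapsed, iters) = warm_up(|| fib(black_box(20)), Duration::from_millis(50));
    println!("warm-up: {} iterations in {:?}", iters, elapsed);
}
```

Since `black_box` returns its argument unchanged, removing it should alter only what the optimizer is allowed to do, not the values the program computes. That is why a bug that disappears when the call is added points toward the compiler.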
I've tried to create a minimal test case, but was unsuccessful. This bug is quite fragile.
Steps to reproduce:
- Clone https://github.com/mbillingr/criterion-test.rs
- Edit the Cargo.toml file to disable the default features for `criterion` (this isn't necessary but will save some compilation time)
- With `1.23.0-x86_64-pc-windows-msvc`, run `cargo bench --bench my_benchmark -- --verbose`.
- Criterion.rs will run two benchmarks and report two iteration counts. The second should be significantly smaller than the first - this is the desired behavior. For example:
Benchmarking fib 1: Collecting 100 samples in estimated 1.0000 s (2469702500 iterations)
...
Benchmarking fib 2: Collecting 100 samples in estimated 1.0000 s (132784700 iterations)
- Switch to 1.24.1 and run the benchmark command again. Note that this time, the second iteration count is roughly the same as the first (this is the unexpected behavior):
Benchmarking fib 1: Collecting 100 samples in estimated 1.0000 s (2518899600 iterations)
...
Benchmarking fib 2: Collecting 100 samples in estimated 1.0000 s (2514900000 iterations)
- Optional: Switch to a nightly compiler and verify that the unexpected behavior persists. Clone the `rustc_fix` branch of https://github.com/japaric/criterion.rs and modify the Cargo.toml file of `criterion-test.rs` to use that instead. This patch merely enables the `test` feature/crate and inserts one call to `test::black_box` in `routine.rs:warm_up`. Verify that the expected behavior is restored.
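
For the optional step, the dependency override in the test crate's Cargo.toml might look like the fragment below. This is only a sketch: it assumes the dependency is declared under the name `criterion` in `[dev-dependencies]`, and that default features stay disabled as in the earlier step. The actual manifest in `criterion-test.rs` may be laid out differently.

```toml
[dev-dependencies]
# Assumed section and dependency name; points at the rustc_fix branch named above
# instead of cloning it locally.
criterion = { git = "https://github.com/japaric/criterion.rs", branch = "rustc_fix", default-features = false }
```

A `path = "..."` dependency pointing at a local clone of that branch would work equally well if you prefer to follow the step literally.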