Description
We should have performance tests that are "not allowed to get worse" in the testsuite. The operation should involve reading a .json file with the last run values, comparing to current-run values, failing if any regress, and rewriting the file if any advanced. Buildbot can potentially deposit the .json file before each build if it's missing, and harvest it after for display / plotting, and users can just work with the one(s) in their own workspaces.
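A minimal sketch of the compare-and-rewrite step described above, assuming a flat JSON file of metric-name → value pairs and a hypothetical 2% noise tolerance (both the file layout and the tolerance are illustrative, not decided):

```python
import json
import os

TOLERANCE = 0.02  # hypothetical: allow 2% noise before calling it a regression


def check_perf(baseline_path, current):
    """Compare current-run metrics against the stored baseline.

    Returns the list of regressed metric names, and rewrites the
    baseline file whenever any metric improved (advanced).
    """
    if not os.path.exists(baseline_path):
        # No baseline yet (e.g. buildbot deposits one before the build);
        # record the current values and pass.
        with open(baseline_path, "w") as f:
            json.dump(current, f, indent=2)
        return []

    with open(baseline_path) as f:
        baseline = json.load(f)

    regressed = []
    improved = False
    for name, value in current.items():
        old = baseline.get(name)
        if old is None:
            baseline[name] = value  # new metric: start tracking it
            improved = True
        elif value > old * (1 + TOLERANCE):
            regressed.append(name)  # worse than baseline + tolerance: fail
        elif value < old:
            baseline[name] = value  # better: tighten the baseline
            improved = True

    if improved:
        with open(baseline_path, "w") as f:
            json.dump(baseline, f, indent=2)
    return regressed
```

The asymmetry is deliberate: regressions fail only past the tolerance, but any improvement immediately tightens the baseline, so the recorded values ratchet downward over time.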
This requires either objective performance criteria or fuzzy forms of comparison. Prefer objective when possible. `valgrind --tool=lackey` and `perf stat` can gather the number of instructions retired, which tends to be much more stable than wall-clock time.
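To feed those counts into the JSON comparison, the harness would need to parse the tool's output. As one hedged sketch: `perf stat` supports CSV-style output via `-x`, with the raw value in the first field and the event name in the third (field layout varies by perf version, so the sample line below is illustrative):

```python
def parse_perf_stat_csv(output):
    """Extract the instructions-retired count from `perf stat -x,` output.

    Each event is reported on its own line as: value,unit,event-name,...
    Returns None if no instructions line is found.
    """
    for line in output.splitlines():
        fields = line.split(",")
        if len(fields) >= 3 and fields[2].startswith("instructions"):
            return int(fields[0])
    return None


# Hypothetical invocation: perf stat -x, -e instructions ./my_benchmark
# with its stderr captured into `sample`:
sample = "1234567,,instructions,2000000,100.00,,"
```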
It is probably easier to base this on the metrics gathered in #6810.