Closed
Description
When a PR lands and is measured, it would be good for the results to be posted automatically as a comment on the PR if it improves or regresses performance by more than a certain amount. That way PR authors get faster feedback.
Choosing the threshold of significance is the hard part. When doing manual perf triage I usually record those where at least one run changed by more than 1%. But I occasionally ignore some of them, because some of our benchmarks are noisy. So maybe we should require that 3 or more runs change by 1% or more.
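
To make the proposed rule concrete, here is a minimal sketch of the "3 or more runs change by 1% or more" check. The function name, parameter names, and the choice to represent each run as a signed percentage change are assumptions made for illustration, not existing code. It also counts improvements and regressions together, which is one possible reading of the rule.

```python
from typing import Iterable


def is_significant(
    pct_changes: Iterable[float],
    threshold_pct: float = 1.0,
    min_runs: int = 3,
) -> bool:
    """Return True if at least `min_runs` runs changed by `threshold_pct` or more.

    `pct_changes` holds one signed percentage change per benchmark run
    (hypothetical input format: positive = regression, negative = improvement).
    The magnitude is compared, so both directions count toward the total.
    """
    exceeding = sum(1 for change in pct_changes if abs(change) >= threshold_pct)
    return exceeding >= min_runs


# Two runs over the threshold: not enough to trigger a comment.
print(is_significant([1.2, -0.3, 1.5, 0.1]))
# Three runs at or over the threshold: would trigger a comment.
print(is_significant([1.2, -1.1, 1.5, 0.1]))
```

Both defaults are parameters, so the threshold and the minimum run count can be tuned separately if the noisy benchmarks still produce false positives.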