Merge pull request #1803 from pnkfelix/triage-2024-01-16

Mark-Simulacrum · web-flow · commit 74e554626c45 · 2024-01-17T07:58:22.000-05:00
Triage 2024 01 16
diff --git a/triage/2024-01-16.md b/triage/2024-01-16.md
@@ -0,0 +1,202 @@
+# 2024-01-16 Triage Log
+
+This week had some small regressions that did not warrant further investigation,
+several of which were dismissed as being noise/blips in the data. There were
+also a number of gains. (Don't get exicited about that 20.6% improvement, its an
+measurement artifact from a temporary blip in the PR that immediately preceded
+this week's triage.)
+
+Triage done by **@pnkfelix**.
+Revision range: [76101eec..f9c2421a](https://perf.rust-lang.org/?start=76101eecbe9aa80753664bbe637ad06d1925f315&end=f9c2421a2a6e34f3756900dd7d600704c08bfccb&absolute=false&stat=instructions%3Au)
+
+**Summary**:
+
+| (instructions:u)                   | mean  | range           | count |
+|:----------------------------------:|:-----:|:---------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | 0.7%  | [0.6%, 0.7%]    | 2     |
+| Regressions ❌ <br /> (secondary)  | 3.1%  | [0.8%, 4.1%]    | 9     |
+| Improvements ✅ <br /> (primary)   | -1.2% | [-20.6%, -0.2%] | 133   |
+| Improvements ✅ <br /> (secondary) | -0.8% | [-7.3%, -0.1%]  | 31    |
+| All ❌✅ (primary)                 | -1.2% | [-20.6%, 0.7%]  | 135   |
+
+
+3 Regressions, 5 Improvements, 5 Mixed; 3 of them in rollups
+55 artifact comparisons made in total
+
+#### Regressions
+
+Rollup of 10 pull requests [#119754](https://github.com/rust-lang/rust/pull/119754) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=ca663b06c5492ac2dde5e53cd11579fa8e4d68bd&end=d6affcf520091fd0f48df1a2b6bfcb9ef48e0f40&stat=instructions:u)
+
+| (instructions:u)                   | mean | range        | count |
+|:----------------------------------:|:----:|:------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -    | -            | 0     |
+| Regressions ❌ <br /> (secondary)  | 2.4% | [2.4%, 2.4%] | 2     |
+| Improvements ✅ <br /> (primary)   | -    | -            | 0     |
+| Improvements ✅ <br /> (secondary) | -    | -            | 0     |
+| All ❌✅ (primary)                 | -    | -            | 0     |
+
+* The 2 regressing (and secondary) benchmarks are tt-muncher debug {incr-full, full}. Its not transient.
+* I've skimmed over the list of PR's in the rollup. None of them are obvious culprits here. I looked at the ones related to debuginfo (#118903) and to code-coverage (#119033 and #119681), but none of those seem likely to be to blame here
+* Since this only affects a secondary benchmark, and only the instruction count (e.g. not cpu-clock:u nor wall-time for these two benchmarks), I do not think its worth further investigation and I'm going to mark it as triaged.
+
+Exhaustiveness: use an `Option` instead of allocating fictitious patterns [#119688](https://github.com/rust-lang/rust/pull/119688) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=a2d9d73e608f1b24eba840c4fd2d68dbe3b65e01&end=0a8923361ec2a37fa341292c029ef7c6d0405d4b&stat=instructions:u)
+
+| (instructions:u)                   | mean | range        | count |
+|:----------------------------------:|:----:|:------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -    | -            | 0     |
+| Regressions ❌ <br /> (secondary)  | 3.8% | [3.6%, 4.1%] | 6     |
+| Improvements ✅ <br /> (primary)   | -    | -            | 0     |
+| Improvements ✅ <br /> (secondary) | -    | -            | 0     |
+| All ❌✅ (primary)                 | -    | -            | 0     |
+
+* This impacted the 6 variants of match-stress {incr-full,full} x {check,debug,opt}
+* I think the impact on match-stress was probably well-anticipated, and within a reasonable range for a stress-test benchmark.
+* Note that #119688 was a precursor to some further cleanup code (namely to remove the use of a local-arena within exhaustiveness checking).
+* Marking as triaged.
+
+never patterns: Check bindings wrt never patterns [#119610](https://github.com/rust-lang/rust/pull/119610) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=bfcc027a751595ba290c554f47907eaa3779f798&end=714b29a17ff5fa727c794bbb60bfd335f8e75d42&stat=instructions:u)
+
+| (instructions:u)                   | mean | range        | count |
+|:----------------------------------:|:----:|:------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | 0.3% | [0.3%, 0.4%] | 3     |
+| Regressions ❌ <br /> (secondary)  | -    | -            | 0     |
+| Improvements ✅ <br /> (primary)   | -    | -            | 0     |
+| Improvements ✅ <br /> (secondary) | -    | -            | 0     |
+| All ❌✅ (primary)                 | 0.3% | [0.3%, 0.4%] | 3     |
+
+* This impacted 3 variants of unicode-normalization-0.1.19: debug incr-unchanged and check {incr-unchanged, incr-patched:println}.
+* Interestingly, during two different try runs, those three variants were found to have improved by similar amounts by this PR.
+* there's some weird interaction between that benchmark and the code paths impacted by this PR, and I do not think its worth investing effort in further investigation.
+* marking as triaged.
+
+#### Improvements
+
+macro_rules: Add an expansion-local cache to span marker [#119693](https://github.com/rust-lang/rust/pull/119693) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=76101eecbe9aa80753664bbe637ad06d1925f315&end=0ee9cfd54db7b5f4be35f026588904500c866196&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range           | count |
+|:----------------------------------:|:-----:|:---------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -     | -               | 0     |
+| Regressions ❌ <br /> (secondary)  | -     | -               | 0     |
+| Improvements ✅ <br /> (primary)   | -1.4% | [-20.5%, -0.2%] | 80    |
+| Improvements ✅ <br /> (secondary) | -0.8% | [-1.9%, -0.3%]  | 16    |
+| All ❌✅ (primary)                 | -1.4% | [-20.5%, -0.2%] | 80    |
+
+* the bitmaps changes (-20.5%, -17.9%, -13.1%) are all artifacts of returning to normal after a blip in the previous PR.
+
+A more efficient slice comparison implementation for T: !BytewiseEq [#116846](https://github.com/rust-lang/rust/pull/116846) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=ae9d24de80b00b4158d1a29a212a6b02aeda0e75&end=190f4c96116a3b59b7de4881cfec544be0246d84&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -     | -              | 0     |
+| Regressions ❌ <br /> (secondary)  | -     | -              | 0     |
+| Improvements ✅ <br /> (primary)   | -0.5% | [-0.9%, -0.2%] | 15    |
+| Improvements ✅ <br /> (secondary) | -0.6% | [-0.6%, -0.6%] | 1     |
+| All ❌✅ (primary)                 | -0.5% | [-0.9%, -0.2%] | 15    |
+
+* it is too bad that work in PR #100124 stalled.
+
+Remove a large amount of leb128-coded integers [#119791](https://github.com/rust-lang/rust/pull/119791) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=d73bd3fb3ba312f3e6b5af4d56d1161d37b71620&end=68acb393c5d2cff049b41981e35217a7e630f63a&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -     | -              | 0     |
+| Regressions ❌ <br /> (secondary)  | 1.1%  | [1.1%, 1.1%]   | 1     |
+| Improvements ✅ <br /> (primary)   | -0.3% | [-0.3%, -0.2%] | 5     |
+| Improvements ✅ <br /> (secondary) | -0.3% | [-0.5%, -0.1%] | 12    |
+| All ❌✅ (primary)                 | -0.3% | [-0.3%, -0.2%] | 5     |
+
+* the 1.1% hit is to deep-vector debug full. It may be transient; the history is pretty up-and-down at the time of this PR, and has settled at a lower level than where it was when this PR landed.
+* in any case, the gains elsewhere, especially bootstrap, outweigh the loss to that one secondary benchmark. (Which ... I guess is what the rustc-perf bot now computes as well, since it categorized this as an Improvement rather than Mixed?)
+
+Exhaustiveness: track overlapping ranges precisely [#119396](https://github.com/rust-lang/rust/pull/119396) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=bfd799f1a5a86d16e6b8caa2857bcb4aac6e0174&end=174e73a3f6df6f96ab453493796e461164dea94a&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -     | -              | 0     |
+| Regressions ❌ <br /> (secondary)  | -     | -              | 0     |
+| Improvements ✅ <br /> (primary)   | -0.4% | [-1.7%, -0.2%] | 32    |
+| Improvements ✅ <br /> (secondary) | -     | -              | 0     |
+| All ❌✅ (primary)                 | -0.4% | [-1.7%, -0.2%] | 32    |
+
+
+Rollup of 6 pull requests [#119889](https://github.com/rust-lang/rust/pull/119889) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=174e73a3f6df6f96ab453493796e461164dea94a&end=ce1f2ccf5a5ac9343623bd115a05e4151d93af0d&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -     | -              | 0     |
+| Regressions ❌ <br /> (secondary)  | -     | -              | 0     |
+| Improvements ✅ <br /> (primary)   | -1.8% | [-1.9%, -1.7%] | 4     |
+| Improvements ✅ <br /> (secondary) | -4.3% | [-7.4%, -1.3%] | 2     |
+| All ❌✅ (primary)                 | -1.8% | [-1.9%, -1.7%] | 4     |
+
+
+#### Mixed
+
+Support async recursive calls (as long as they have indirection) [#117703](https://github.com/rust-lang/rust/pull/117703) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=387e7a5e42ac074e79a14361e82702a229a6aac8&end=dc641039d2b3f5c0894694e4b45f7c3951030685&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | -     | -              | 0     |
+| Regressions ❌ <br /> (secondary)  | 0.2%  | [0.2%, 0.2%]   | 1     |
+| Improvements ✅ <br /> (primary)   | -0.3% | [-0.4%, -0.3%] | 3     |
+| Improvements ✅ <br /> (secondary) | -     | -              | 0     |
+| All ❌✅ (primary)                 | -0.3% | [-0.4%, -0.3%] | 3     |
+
+* this is weird, it looks like an inverse blip occurred on the preceding PR, where tt-muncher check incr-unchanged had a single point with -0.2% instruction-count, and then it preceding to "return to normal" on the succeeding PRs.
+* (Its harder for me to explain away "inverse blips" ...)
+* but at the same time, this does not seem like a significant regression by our usual metrics.
+* marking as triaged.
+
+Rollup of 9 pull requests [#119767](https://github.com/rust-lang/rust/pull/119767) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=be00c5a9b89161b7f45ba80340f709e8e41122f9&end=5876c8cdfd3df742c334d6447d44d760c77103b6&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | 1.3%  | [0.4%, 2.3%]   | 2     |
+| Regressions ❌ <br /> (secondary)  | 0.9%  | [0.5%, 1.2%]   | 2     |
+| Improvements ✅ <br /> (primary)   | -     | -              | 0     |
+| Improvements ✅ <br /> (secondary) | -0.3% | [-0.3%, -0.3%] | 1     |
+| All ❌✅ (primary)                 | 1.3%  | [0.4%, 2.3%]   | 2     |
+
+* primary regressions: syn opt-full regressed by 2.3%, bitmaps check-incr-full by 0.35%. secondary regressions: coercions debug-full by 1.23%, ctfe-stress check-full by 0.51%
+* from the overall history, it seems like syn opt-full returned to "normal" with later PRs that don't necessarily seem like they would have affected syn (e.g. PR #117449). bitmap check-incr-full's trend is likewise downward after this point.
+* marking as triaged
+
+Add assume into `NonZeroIntX::get` [#119452](https://github.com/rust-lang/rust/pull/119452) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=ce1f2ccf5a5ac9343623bd115a05e4151d93af0d&end=2319be8e265dd19973574eb581d28297baf44b11&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | 0.5%  | [0.4%, 0.9%]   | 4     |
+| Regressions ❌ <br /> (secondary)  | -     | -              | 0     |
+| Improvements ✅ <br /> (primary)   | -0.7% | [-0.7%, -0.7%] | 1     |
+| Improvements ✅ <br /> (secondary) | -     | -              | 0     |
+| All ❌✅ (primary)                 | 0.3%  | [-0.7%, 0.9%]  | 5     |
+
+* scottmcm writes: "Instructions have a couple red in instruction counts for opt, but that's entirely reasonable for something intended to enable optimizations. Notably, the cycles are green, with no regressions. So I think this is fine."
+* marking as triaged
+
+Avoid some redundant work in GVN [#119439](https://github.com/rust-lang/rust/pull/119439) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=714b29a17ff5fa727c794bbb60bfd335f8e75d42&end=f9c2421a2a6e34f3756900dd7d600704c08bfccb&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | 0.8%  | [0.6%, 0.9%]   | 4     |
+| Regressions ❌ <br /> (secondary)  | -     | -              | 0     |
+| Improvements ✅ <br /> (primary)   | -0.7% | [-1.1%, -0.5%] | 4     |
+| Improvements ✅ <br /> (secondary) | -0.4% | [-0.4%, -0.4%] | 1     |
+| All ❌✅ (primary)                 | 0.0%  | [-1.1%, 0.9%]  | 8     |
+
+* primary regressions are regex-1.5.5 debug-full, opt-incr-patched:Job, incr-full, and exa opt-full.
+* the exa regression looks like a blip. The regex ones were predicted during a try run for the PR. I assume they were deemed acceptable as they are offset improvements elsewhere (or dismissed as noise?)
+* marking as triaged.
+
+Sandwich MIR optimizations between DSE. [#119672](https://github.com/rust-lang/rust/pull/119672) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=f9c2421a2a6e34f3756900dd7d600704c08bfccb&end=fa0dc208d0a34027c1d3cca7d47975d8238bcfde&stat=instructions:u)
+
+| (instructions:u)                   | mean  | range          | count |
+|:----------------------------------:|:-----:|:--------------:|:-----:|
+| Regressions ❌ <br /> (primary)    | 0.7%  | [0.2%, 1.4%]   | 14    |
+| Regressions ❌ <br /> (secondary)  | 0.5%  | [0.2%, 2.7%]   | 14    |
+| Improvements ✅ <br /> (primary)   | -1.0% | [-2.2%, -0.2%] | 31    |
+| Improvements ✅ <br /> (secondary) | -0.9% | [-2.2%, -0.2%] | 10    |
+| All ❌✅ (primary)                 | -0.4% | [-2.2%, 1.4%]  | 45    |
+
+* already marked as triaged by @lqd with the comment "As seen in the previous runs: some nice wins on bigger benchmarks, and overall gains outweigh the few losses."
+