|
| 1 | +# 2024-01-16 Triage Log |
| 2 | + |
| 3 | +This week had some small regressions that did not warrant further investigation, |
| 4 | +several of which were dismissed as being noise/blips in the data. There were |
| 5 | +also a number of gains. (Don't get excited about that 20.6% improvement, it's a |
| 6 | +measurement artifact from a temporary blip in the PR that immediately preceded |
| 7 | +this week's triage.) |
| 8 | + |
| 9 | +Triage done by **@pnkfelix**. |
| 10 | +Revision range: [76101eec..f9c2421a](https://perf.rust-lang.org/?start=76101eecbe9aa80753664bbe637ad06d1925f315&end=f9c2421a2a6e34f3756900dd7d600704c08bfccb&absolute=false&stat=instructions%3Au) |
| 11 | + |
| 12 | +**Summary**: |
| 13 | + |
| 14 | +| (instructions:u) | mean | range | count | |
| 15 | +|:----------------------------------:|:-----:|:---------------:|:-----:| |
| 16 | +| Regressions ❌ <br /> (primary) | 0.7% | [0.6%, 0.7%] | 2 | |
| 17 | +| Regressions ❌ <br /> (secondary) | 3.1% | [0.8%, 4.1%] | 9 | |
| 18 | +| Improvements ✅ <br /> (primary) | -1.2% | [-20.6%, -0.2%] | 133 | |
| 19 | +| Improvements ✅ <br /> (secondary) | -0.8% | [-7.3%, -0.1%] | 31 | |
| 20 | +| All ❌✅ (primary) | -1.2% | [-20.6%, 0.7%] | 135 | |
| 21 | + |
| 22 | + |
| 23 | +3 Regressions, 5 Improvements, 5 Mixed; 3 of them in rollups |
| 24 | +55 artifact comparisons made in total |
| 25 | + |
| 26 | +#### Regressions |
| 27 | + |
| 28 | +Rollup of 10 pull requests [#119754](https://github.com/rust-lang/rust/pull/119754) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=ca663b06c5492ac2dde5e53cd11579fa8e4d68bd&end=d6affcf520091fd0f48df1a2b6bfcb9ef48e0f40&stat=instructions:u) |
| 29 | + |
| 30 | +| (instructions:u) | mean | range | count | |
| 31 | +|:----------------------------------:|:----:|:------------:|:-----:| |
| 32 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 33 | +| Regressions ❌ <br /> (secondary) | 2.4% | [2.4%, 2.4%] | 2 | |
| 34 | +| Improvements ✅ <br /> (primary) | - | - | 0 | |
| 35 | +| Improvements ✅ <br /> (secondary) | - | - | 0 | |
| 36 | +| All ❌✅ (primary) | - | - | 0 | |
| 37 | + |
| 38 | +* The 2 regressing (and secondary) benchmarks are tt-muncher debug {incr-full, full}. It's not transient. |
| 39 | +* I've skimmed over the list of PRs in the rollup. None of them are obvious culprits here. I looked at the ones related to debuginfo (#118903) and to code-coverage (#119033 and #119681), but none of those seem likely to be to blame here. |
| 40 | +* Since this only affects a secondary benchmark, and only the instruction count (e.g. not cpu-clock:u nor wall-time for these two benchmarks), I do not think it's worth further investigation and I'm going to mark it as triaged. |
| 41 | + |
| 42 | +Exhaustiveness: use an `Option` instead of allocating fictitious patterns [#119688](https://github.com/rust-lang/rust/pull/119688) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=a2d9d73e608f1b24eba840c4fd2d68dbe3b65e01&end=0a8923361ec2a37fa341292c029ef7c6d0405d4b&stat=instructions:u) |
| 43 | + |
| 44 | +| (instructions:u) | mean | range | count | |
| 45 | +|:----------------------------------:|:----:|:------------:|:-----:| |
| 46 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 47 | +| Regressions ❌ <br /> (secondary) | 3.8% | [3.6%, 4.1%] | 6 | |
| 48 | +| Improvements ✅ <br /> (primary) | - | - | 0 | |
| 49 | +| Improvements ✅ <br /> (secondary) | - | - | 0 | |
| 50 | +| All ❌✅ (primary) | - | - | 0 | |
| 51 | + |
| 52 | +* This impacted the 6 variants of match-stress {incr-full,full} x {check,debug,opt} |
| 53 | +* I think the impact on match-stress was probably well-anticipated, and within a reasonable range for a stress-test benchmark. |
| 54 | +* Note that #119688 was a precursor to some further cleanup code (namely to remove the use of a local-arena within exhaustiveness checking). |
| 55 | +* Marking as triaged. |
| 56 | + |
| 57 | +never patterns: Check bindings wrt never patterns [#119610](https://github.com/rust-lang/rust/pull/119610) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=bfcc027a751595ba290c554f47907eaa3779f798&end=714b29a17ff5fa727c794bbb60bfd335f8e75d42&stat=instructions:u) |
| 58 | + |
| 59 | +| (instructions:u) | mean | range | count | |
| 60 | +|:----------------------------------:|:----:|:------------:|:-----:| |
| 61 | +| Regressions ❌ <br /> (primary) | 0.3% | [0.3%, 0.4%] | 3 | |
| 62 | +| Regressions ❌ <br /> (secondary) | - | - | 0 | |
| 63 | +| Improvements ✅ <br /> (primary) | - | - | 0 | |
| 64 | +| Improvements ✅ <br /> (secondary) | - | - | 0 | |
| 65 | +| All ❌✅ (primary) | 0.3% | [0.3%, 0.4%] | 3 | |
| 66 | + |
| 67 | +* This impacted 3 variants of unicode-normalization-0.1.19: debug incr-unchanged and check {incr-unchanged, incr-patched:println}. |
| 68 | +* Interestingly, during two different try runs, those three variants were found to have improved by similar amounts by this PR. |
| 69 | +* there's some weird interaction between that benchmark and the code paths impacted by this PR, and I do not think it's worth investing effort in further investigation. |
| 70 | +* marking as triaged. |
| 71 | + |
| 72 | +#### Improvements |
| 73 | + |
| 74 | +macro_rules: Add an expansion-local cache to span marker [#119693](https://github.com/rust-lang/rust/pull/119693) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=76101eecbe9aa80753664bbe637ad06d1925f315&end=0ee9cfd54db7b5f4be35f026588904500c866196&stat=instructions:u) |
| 75 | + |
| 76 | +| (instructions:u) | mean | range | count | |
| 77 | +|:----------------------------------:|:-----:|:---------------:|:-----:| |
| 78 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 79 | +| Regressions ❌ <br /> (secondary) | - | - | 0 | |
| 80 | +| Improvements ✅ <br /> (primary) | -1.4% | [-20.5%, -0.2%] | 80 | |
| 81 | +| Improvements ✅ <br /> (secondary) | -0.8% | [-1.9%, -0.3%] | 16 | |
| 82 | +| All ❌✅ (primary) | -1.4% | [-20.5%, -0.2%] | 80 | |
| 83 | + |
| 84 | +* the bitmaps changes (-20.5%, -17.9%, -13.1%) are all artifacts of returning to normal after a blip in the previous PR. |
| 85 | + |
| 86 | +A more efficient slice comparison implementation for T: !BytewiseEq [#116846](https://github.com/rust-lang/rust/pull/116846) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=ae9d24de80b00b4158d1a29a212a6b02aeda0e75&end=190f4c96116a3b59b7de4881cfec544be0246d84&stat=instructions:u) |
| 87 | + |
| 88 | +| (instructions:u) | mean | range | count | |
| 89 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 90 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 91 | +| Regressions ❌ <br /> (secondary) | - | - | 0 | |
| 92 | +| Improvements ✅ <br /> (primary) | -0.5% | [-0.9%, -0.2%] | 15 | |
| 93 | +| Improvements ✅ <br /> (secondary) | -0.6% | [-0.6%, -0.6%] | 1 | |
| 94 | +| All ❌✅ (primary) | -0.5% | [-0.9%, -0.2%] | 15 | |
| 95 | + |
| 96 | +* it is too bad that work in PR #100124 stalled. |
| 97 | + |
| 98 | +Remove a large amount of leb128-coded integers [#119791](https://github.com/rust-lang/rust/pull/119791) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=d73bd3fb3ba312f3e6b5af4d56d1161d37b71620&end=68acb393c5d2cff049b41981e35217a7e630f63a&stat=instructions:u) |
| 99 | + |
| 100 | +| (instructions:u) | mean | range | count | |
| 101 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 102 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 103 | +| Regressions ❌ <br /> (secondary) | 1.1% | [1.1%, 1.1%] | 1 | |
| 104 | +| Improvements ✅ <br /> (primary) | -0.3% | [-0.3%, -0.2%] | 5 | |
| 105 | +| Improvements ✅ <br /> (secondary) | -0.3% | [-0.5%, -0.1%] | 12 | |
| 106 | +| All ❌✅ (primary) | -0.3% | [-0.3%, -0.2%] | 5 | |
| 107 | + |
| 108 | +* the 1.1% hit is to deep-vector debug full. It may be transient; the history is pretty up-and-down at the time of this PR, and has settled at a lower level than where it was when this PR landed. |
| 109 | +* in any case, the gains elsewhere, especially bootstrap, outweigh the loss to that one secondary benchmark. (Which ... I guess is what the rustc-perf bot now computes as well, since it categorized this as an Improvement rather than Mixed?) |
| 110 | + |
| 111 | +Exhaustiveness: track overlapping ranges precisely [#119396](https://github.com/rust-lang/rust/pull/119396) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=bfd799f1a5a86d16e6b8caa2857bcb4aac6e0174&end=174e73a3f6df6f96ab453493796e461164dea94a&stat=instructions:u) |
| 112 | + |
| 113 | +| (instructions:u) | mean | range | count | |
| 114 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 115 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 116 | +| Regressions ❌ <br /> (secondary) | - | - | 0 | |
| 117 | +| Improvements ✅ <br /> (primary) | -0.4% | [-1.7%, -0.2%] | 32 | |
| 118 | +| Improvements ✅ <br /> (secondary) | - | - | 0 | |
| 119 | +| All ❌✅ (primary) | -0.4% | [-1.7%, -0.2%] | 32 | |
| 120 | + |
| 121 | + |
| 122 | +Rollup of 6 pull requests [#119889](https://github.com/rust-lang/rust/pull/119889) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=174e73a3f6df6f96ab453493796e461164dea94a&end=ce1f2ccf5a5ac9343623bd115a05e4151d93af0d&stat=instructions:u) |
| 123 | + |
| 124 | +| (instructions:u) | mean | range | count | |
| 125 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 126 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 127 | +| Regressions ❌ <br /> (secondary) | - | - | 0 | |
| 128 | +| Improvements ✅ <br /> (primary) | -1.8% | [-1.9%, -1.7%] | 4 | |
| 129 | +| Improvements ✅ <br /> (secondary) | -4.3% | [-7.4%, -1.3%] | 2 | |
| 130 | +| All ❌✅ (primary) | -1.8% | [-1.9%, -1.7%] | 4 | |
| 131 | + |
| 132 | + |
| 133 | +#### Mixed |
| 134 | + |
| 135 | +Support async recursive calls (as long as they have indirection) [#117703](https://github.com/rust-lang/rust/pull/117703) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=387e7a5e42ac074e79a14361e82702a229a6aac8&end=dc641039d2b3f5c0894694e4b45f7c3951030685&stat=instructions:u) |
| 136 | + |
| 137 | +| (instructions:u) | mean | range | count | |
| 138 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 139 | +| Regressions ❌ <br /> (primary) | - | - | 0 | |
| 140 | +| Regressions ❌ <br /> (secondary) | 0.2% | [0.2%, 0.2%] | 1 | |
| 141 | +| Improvements ✅ <br /> (primary) | -0.3% | [-0.4%, -0.3%] | 3 | |
| 142 | +| Improvements ✅ <br /> (secondary) | - | - | 0 | |
| 143 | +| All ❌✅ (primary) | -0.3% | [-0.4%, -0.3%] | 3 | |
| 144 | + |
| 145 | +* this is weird, it looks like an inverse blip occurred on the preceding PR, where tt-muncher check incr-unchanged had a single point with -0.2% instruction-count, and then it proceeded to "return to normal" on the succeeding PRs. |
| 146 | +* (It's harder for me to explain away "inverse blips" ...) |
| 147 | +* but at the same time, this does not seem like a significant regression by our usual metrics. |
| 148 | +* marking as triaged. |
| 149 | + |
| 150 | +Rollup of 9 pull requests [#119767](https://github.com/rust-lang/rust/pull/119767) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=be00c5a9b89161b7f45ba80340f709e8e41122f9&end=5876c8cdfd3df742c334d6447d44d760c77103b6&stat=instructions:u) |
| 151 | + |
| 152 | +| (instructions:u) | mean | range | count | |
| 153 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 154 | +| Regressions ❌ <br /> (primary) | 1.3% | [0.4%, 2.3%] | 2 | |
| 155 | +| Regressions ❌ <br /> (secondary) | 0.9% | [0.5%, 1.2%] | 2 | |
| 156 | +| Improvements ✅ <br /> (primary) | - | - | 0 | |
| 157 | +| Improvements ✅ <br /> (secondary) | -0.3% | [-0.3%, -0.3%] | 1 | |
| 158 | +| All ❌✅ (primary) | 1.3% | [0.4%, 2.3%] | 2 | |
| 159 | + |
| 160 | +* primary regressions: syn opt-full regressed by 2.3%, bitmaps check-incr-full by 0.35%. secondary regressions: coercions debug-full by 1.23%, ctfe-stress check-full by 0.51% |
| 161 | +* from the overall history, it seems like syn opt-full returned to "normal" with later PRs that don't necessarily seem like they would have affected syn (e.g. PR #117449). bitmaps check-incr-full's trend is likewise downward after this point. |
| 162 | +* marking as triaged |
| 163 | + |
| 164 | +Add assume into `NonZeroIntX::get` [#119452](https://github.com/rust-lang/rust/pull/119452) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=ce1f2ccf5a5ac9343623bd115a05e4151d93af0d&end=2319be8e265dd19973574eb581d28297baf44b11&stat=instructions:u) |
| 165 | + |
| 166 | +| (instructions:u) | mean | range | count | |
| 167 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 168 | +| Regressions ❌ <br /> (primary) | 0.5% | [0.4%, 0.9%] | 4 | |
| 169 | +| Regressions ❌ <br /> (secondary) | - | - | 0 | |
| 170 | +| Improvements ✅ <br /> (primary) | -0.7% | [-0.7%, -0.7%] | 1 | |
| 171 | +| Improvements ✅ <br /> (secondary) | - | - | 0 | |
| 172 | +| All ❌✅ (primary) | 0.3% | [-0.7%, 0.9%] | 5 | |
| 173 | + |
| 174 | +* scottmcm writes: "Instructions have a couple red in instruction counts for opt, but that's entirely reasonable for something intended to enable optimizations. Notably, the cycles are green, with no regressions. So I think this is fine." |
| 175 | +* marking as triaged |
| 176 | + |
| 177 | +Avoid some redundant work in GVN [#119439](https://github.com/rust-lang/rust/pull/119439) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=714b29a17ff5fa727c794bbb60bfd335f8e75d42&end=f9c2421a2a6e34f3756900dd7d600704c08bfccb&stat=instructions:u) |
| 178 | + |
| 179 | +| (instructions:u) | mean | range | count | |
| 180 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 181 | +| Regressions ❌ <br /> (primary) | 0.8% | [0.6%, 0.9%] | 4 | |
| 182 | +| Regressions ❌ <br /> (secondary) | - | - | 0 | |
| 183 | +| Improvements ✅ <br /> (primary) | -0.7% | [-1.1%, -0.5%] | 4 | |
| 184 | +| Improvements ✅ <br /> (secondary) | -0.4% | [-0.4%, -0.4%] | 1 | |
| 185 | +| All ❌✅ (primary) | 0.0% | [-1.1%, 0.9%] | 8 | |
| 186 | + |
| 187 | +* primary regressions are regex-1.5.5 debug-full, opt-incr-patched:Job, incr-full, and exa opt-full. |
| 188 | +* the exa regression looks like a blip. The regex ones were predicted during a try run for the PR. I assume they were deemed acceptable as they are offset by improvements elsewhere (or dismissed as noise?) |
| 189 | +* marking as triaged. |
| 190 | + |
| 191 | +Sandwich MIR optimizations between DSE. [#119672](https://github.com/rust-lang/rust/pull/119672) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=f9c2421a2a6e34f3756900dd7d600704c08bfccb&end=fa0dc208d0a34027c1d3cca7d47975d8238bcfde&stat=instructions:u) |
| 192 | + |
| 193 | +| (instructions:u) | mean | range | count | |
| 194 | +|:----------------------------------:|:-----:|:--------------:|:-----:| |
| 195 | +| Regressions ❌ <br /> (primary) | 0.7% | [0.2%, 1.4%] | 14 | |
| 196 | +| Regressions ❌ <br /> (secondary) | 0.5% | [0.2%, 2.7%] | 14 | |
| 197 | +| Improvements ✅ <br /> (primary) | -1.0% | [-2.2%, -0.2%] | 31 | |
| 198 | +| Improvements ✅ <br /> (secondary) | -0.9% | [-2.2%, -0.2%] | 10 | |
| 199 | +| All ❌✅ (primary) | -0.4% | [-2.2%, 1.4%] | 45 | |
| 200 | + |
| 201 | +* already marked as triaged by @lqd with the comment "As seen in the previous runs: some nice wins on bigger benchmarks, and overall gains outweigh the few losses." |
| 202 | + |