Compile `unicode-normalization` faster #97936

nnethercote · 2022-06-10T02:46:42Z

Various optimizations and cleanups aimed at improving compilation of unicode-normalization, which is notable for having several very large matches with many char ranges.
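For readers unfamiliar with the crate, the kind of code in question looks roughly like the sketch below. The function name and the specific ranges are illustrative assumptions, not copied from `unicode-normalization`; the real tables are generated and far larger.

```rust
/// Hypothetical, tiny stand-in for the generated lookup functions in
/// `unicode-normalization`: one `match` whose arms are `char` ranges.
/// The real matches contain a great many such arms, so the compiler has
/// to compare range endpoints against each other many times.
fn is_in_sample_ranges(c: char) -> bool {
    match c {
        '\u{037A}'..='\u{037F}' => true,
        '\u{0384}'..='\u{038A}' => true,
        '\u{038C}' => true,
        '\u{038E}'..='\u{03A1}' => true,
        _ => false,
    }
}
```

Every arm adds constant comparisons during match lowering, which is why the cost of a single comparison matters for this benchmark.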

Best reviewed one commit at a time.

r? @oli-obk

nnethercote · 2022-06-10T02:46:57Z

@bors try @rust-timer queue

rust-timer · 2022-06-10T02:46:59Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-06-10T02:47:05Z

⌛ Trying commit ca5e9908fdecd0a6f37dccb54fb0b3e8c416982b with merge dfb5ac9ecb0814d8c8d7f4ea4c0ee061170bdf97...

bors · 2022-06-10T04:55:34Z

☀️ Try build successful - checks-actions
Build commit: dfb5ac9ecb0814d8c8d7f4ea4c0ee061170bdf97 (dfb5ac9ecb0814d8c8d7f4ea4c0ee061170bdf97)

rust-timer · 2022-06-10T04:55:36Z

Queued dfb5ac9ecb0814d8c8d7f4ea4c0ee061170bdf97 with parent 420c970, future comparison URL.

rust-timer · 2022-06-10T09:06:59Z

Finished benchmarking commit (dfb5ac9ecb0814d8c8d7f4ea4c0ee061170bdf97): comparison url.

Instruction count

Primary benchmarks: 🎉 relevant improvements found
Secondary benchmarks: no relevant changes found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-14.3%	-20.0%	6
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-14.3%	-20.0%	6

Max RSS (memory usage)

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	3.2%	3.2%	1
Improvements 🎉 (primary)	-3.7%	-3.7%	1
Improvements 🎉 (secondary)	-3.0%	-3.4%	2
All 😿🎉 (primary)	-3.7%	-3.7%	1

Cycles

Results

Primary benchmarks: 🎉 relevant improvements found
Secondary benchmarks: 😿 relevant regressions found

	mean¹	max	count²
Regressions 😿 (primary)	2.9%	2.9%	1
Regressions 😿 (secondary)	2.6%	3.1%	3
Improvements 🎉 (primary)	-10.4%	-15.3%	6
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-8.5%	-15.3%	7

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

oli-obk · 2022-06-13T09:59:55Z

compiler/rustc_mir_build/src/thir/pattern/mod.rs

+
+    // This code is hot when compiling `unicode-normalization` because it has a
+    // number of matches with many ranges such as '\u{037A}'..='\u{037F}'. So
+    // we special-case comparisons of chars in the relevant form for speed.


I dislike special casing chars just for one benchmark. I think we should generally improve the logic inside the eval_bits and related functions to have a fast path for already evaluated constants.

For something that is actionable in this PR, please just replace this specific type match with one that performs this fast path for everything but floats and signed ints (or possibly pull the large type match below into a separate inline-always function and call it whenever both constants are already evaluated, irrespective of the type).

nnethercote · 2022-06-16

Creating a fast path like that is tricky, because there are so many layers to these types, and because of the size checking and layout computations that take place.

The fast path I added to this function is hard to integrate with the rest of the function, because the fast path comparison is on `ScalarInt` values, while the general case comparison is on `u128` values. Extracting those `u128` values in the fast case would require extra size checks and layout computations, eating into the speed wins. But I have generalized it to work with any scalar type other than `Float` or `Int`.
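The shape of such a fast path can be sketched in isolation. This is a simplified model, not rustc's code: `Ty`, `ConstVal`, and this `compare_const_vals` signature are hypothetical stand-ins, and plain `u32` bits take the place of `ScalarInt`.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the handful of types that matter here.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Ty { Char, U32, I32, F32 }

/// Hypothetical stand-in for a constant: either already evaluated to raw
/// bits, or still needing a (potentially expensive) evaluation step.
#[derive(Clone, Copy, Debug)]
enum ConstVal {
    Scalar(u32),
    Unevaluated(fn() -> u32),
}

impl ConstVal {
    /// Models the slow "evaluate to bits" step of the general path.
    fn eval_bits(self) -> u32 {
        match self {
            ConstVal::Scalar(bits) => bits,
            ConstVal::Unevaluated(f) => f(),
        }
    }
}

fn compare_const_vals(a: ConstVal, b: ConstVal, ty: Ty) -> Option<Ordering> {
    // Fast path: both constants are already scalars and the type's ordering
    // agrees with the ordering of the raw unsigned bits. Signed ints and
    // floats are excluded because their bit patterns do not sort that way.
    if let (ConstVal::Scalar(x), ConstVal::Scalar(y)) = (a, b) {
        if !matches!(ty, Ty::I32 | Ty::F32) {
            return Some(x.cmp(&y));
        }
    }

    // General path: evaluate both sides, then interpret the bits per type.
    let (x, y) = (a.eval_bits(), b.eval_bits());
    match ty {
        Ty::F32 => f32::from_bits(x).partial_cmp(&f32::from_bits(y)),
        Ty::I32 => Some((x as i32).cmp(&(y as i32))),
        Ty::Char | Ty::U32 => Some(x.cmp(&y)),
    }
}
```

In this model a `char` range endpoint comparison never reaches `eval_bits`, while `-1i32` against `1i32` still goes through the signed interpretation in the general path.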

nnethercote · 2022-06-16T01:26:30Z

I have updated the final commit as requested.

oli-obk · 2022-06-16T16:19:13Z

@bors r+

bors · 2022-06-16T16:19:15Z

📌 Commit bdbf9b2 has been approved by oli-obk

bors · 2022-06-16T21:09:33Z

⌛ Testing commit bdbf9b2 with merge cacc75c...

bors · 2022-06-16T23:50:15Z

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing cacc75c to master...

rust-timer · 2022-06-17T02:10:56Z

Finished benchmarking commit (cacc75c): comparison url.

Instruction count

Primary benchmarks: 🎉 relevant improvements found
Secondary benchmarks: 😿 relevant regression found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	1.0%	1.0%	1
Improvements 🎉 (primary)	-14.2%	-19.4%	6
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-14.2%	-19.4%	6

Max RSS (memory usage)

Results

Primary benchmarks: no relevant changes found
Secondary benchmarks: 😿 relevant regression found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	2.3%	2.3%	1
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	N/A	N/A	0

Cycles

Results

Primary benchmarks: 🎉 relevant improvements found
Secondary benchmarks: 🎉 relevant improvement found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-9.1%	-14.4%	7
Improvements 🎉 (secondary)	-1.9%	-1.9%	1
All 😿🎉 (primary)	-9.1%	-14.4%	7

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

nnethercote · 2022-06-17T06:22:29Z

@lqd and I were talking just today about how debug builds of coercions have become very noisy. The change there is not meaningful, and wasn't present in the earlier perf CI run.

@rustbot perf-regression-triaged

rust-highfive assigned oli-obk Jun 10, 2022

rustbot added the T-compiler label (Relevant to the compiler team, which will review and decide on the PR/issue.) Jun 10, 2022

rust-highfive added the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.) Jun 10, 2022

rustbot added the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) Jun 10, 2022

rustbot removed the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) Jun 10, 2022

oli-obk reviewed Jun 13, 2022

nnethercote added 10 commits June 16, 2022 10:52

const_range_contains: avoid the second comparison if possible.

7e4ec35

This is a performance win for `unicode-normalization`. Also, I find the new formulation easier to read.

sort_candidates: avoid the second comparison if possible.

c4cd044

This is a performance win for `unicode-normalization`. The commit also removes the closure, which isn't necessary, and reformulates the comparison into a form I find easier to read.

simplify_match_pair: avoid the second comparison if possible.

be6c364

Also, the `try_to_bits` always succeeds, so use `unwrap`.

Remove dead code from compare_const_vals.

d5a13e2

It's never executed when running the entire test suite. I think it's because of the early return at the top of the function if `a.ty() != ty` succeeds.

Assert type equality of a and b in compare_const_vals.

9b4b34a

Because they're always equal.

Remove one use of compare_const_vals.

b67635f

A direct comparison has the same effect. This also avoids the need for a type test within `compare_const_vals`.

Inline and remove fallback closure.

fab85dd

Remove from_bool closure.

3ab6ef1

The code is clearer and simpler without it. Note that the `a == b` early return at the top of the function means the `a == b` test at the end of the function could never succeed.

Remove ty arg from compare_const_vals.

246a5e0

It's now only used in a no-longer-interesting assertion.

compare_const_vals: Use infallible evaluation.

73c52b7

Because these evaluations can never fail.

nnethercote force-pushed the compile-unicode_normalization-faster branch from ca5e990 to 32b011a June 16, 2022 01:25

compare_const_vals: add a special case for certain ranges.

bdbf9b2

This commit removes the `a == b` early return, which isn't useful in practice, and replaces it with one that helps matches with many ranges, including char ranges.
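The "avoid the second comparison if possible" idea shared by the first three commits can be sketched as follows. This is a minimal model under stated assumptions: `range_contains` and its signature are hypothetical, and a generic `PartialOrd` comparison stands in for `compare_const_vals`.

```rust
use std::cmp::Ordering;

/// Hypothetical model of a range-containment check: is `value` inside
/// `lo..=hi` (when `inclusive` is true) or `lo..hi` (when it is false)?
/// Returns `None` if a comparison is undefined, e.g. for NaN.
fn range_contains<T: PartialOrd>(lo: &T, hi: &T, inclusive: bool, value: &T) -> Option<bool> {
    // First comparison: `lo` against `value`.
    if lo.partial_cmp(value)? == Ordering::Greater {
        // `value` is below the range, so the comparison against `hi`
        // is skipped entirely.
        return Some(false);
    }
    // Second comparison, performed only when the first one passed.
    Some(match value.partial_cmp(hi)? {
        Ordering::Less => true,
        Ordering::Equal => inclusive,
        Ordering::Greater => false,
    })
}
```

When each comparison is expensive and many candidate ranges are rejected by the lower bound alone, skipping the second comparison removes a large share of the work.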

nnethercote force-pushed the compile-unicode_normalization-faster branch from 32b011a to bdbf9b2 June 16, 2022 01:26

bors removed the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.) Jun 16, 2022

bors added the S-waiting-on-bors label (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) Jun 16, 2022

bors added the merged-by-bors label (This PR was explicitly merged by bors.) Jun 16, 2022

bors merged commit cacc75c into rust-lang:master Jun 16, 2022

rustbot added this to the 1.63.0 milestone Jun 16, 2022

bors mentioned this pull request Jun 17, 2022

Try out local var id in thir mirroring #98130

Closed

nnethercote deleted the compile-unicode_normalization-faster branch June 17, 2022 01:49
