Add a cache for `maybe_lint_level_root_bounded` #113609

nnethercote · 2023-07-12T04:29:23Z

`maybe_lint_level_root_bounded` is called many times and traces the same node sub-paths repeatedly. This PR adds a cache that lets many of these tracings be skipped, avoiding lots of calls to functions like `Map::attrs` and `Map::parent_id`.
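A minimal sketch of the caching idea, on a toy tree rather than real HIR. Everything here is an assumption made for illustration: `Node`, `ToyBuilder`, `lint_root`, and the `parent_lookups` counter are hypothetical names, and `std::collections::HashSet` stands in for whatever set type the compiler uses.

```rust
use std::collections::HashSet;

/// Toy stand-in for a HIR node (hypothetical, not a rustc type): a parent
/// link plus a flag saying whether the node carries a lint-level attribute.
struct Node {
    parent: usize,
    has_lint_attr: bool,
}

struct ToyBuilder {
    nodes: Vec<Node>,
    /// Root of the body being built; plays the role of `self.hir_id`.
    root: usize,
    /// Nodes already known to reach `root` with no lint attribute on the way.
    reaches_root: HashSet<usize>,
    /// Counts parent-link steps so the saving from the cache is visible.
    parent_lookups: usize,
}

impl ToyBuilder {
    fn new(nodes: Vec<Node>, root: usize) -> Self {
        ToyBuilder { nodes, root, reaches_root: HashSet::new(), parent_lookups: 0 }
    }

    /// Returns the nearest ancestor (or the node itself) that sets a lint
    /// level, or `root` if the walk gets there first.
    fn lint_root(&mut self, orig: usize) -> usize {
        let mut id = orig;
        while id != self.root {
            if self.nodes[id].has_lint_attr {
                // Rare case: an attribute blocks the path. Not cached.
                return id;
            }
            id = self.nodes[id].parent;
            self.parent_lookups += 1;
            if self.reaches_root.contains(&id) {
                // An earlier walk already proved this ancestor reaches `root`.
                break;
            }
        }
        // Only root-reaching start nodes are recorded.
        self.reaches_root.insert(orig);
        self.root
    }
}
```

In this toy model, once an interior node has been recorded, a later walk that starts at one of its children stops after a single parent lookup instead of climbing all the way to the root.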

r? @cjgillot

nnethercote · 2023-07-12T04:29:34Z

@bors try @rust-timer queue

bors · 2023-07-12T04:29:44Z

⌛ Trying commit 812a5ee0f64a6566cb35fd71e27fe01d8139bd27 with merge 14a7775ca133dba23c6852ebdc8638048b5a57da...

bors · 2023-07-12T05:46:00Z

☀️ Try build successful - checks-actions
Build commit: 14a7775ca133dba23c6852ebdc8638048b5a57da

rust-timer · 2023-07-12T08:27:21Z

Finished benchmarking commit (14a7775ca133dba23c6852ebdc8638048b5a57da): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.5%	[-3.4%, -0.3%]	34
Improvements ✅ (secondary)	-1.8%	[-5.7%, -0.3%]	33
All ❌✅ (primary)	-1.5%	[-3.4%, -0.3%]	34

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.8%	[3.1%, 5.2%]	4
Improvements ✅ (primary)	-1.0%	[-1.0%, -1.0%]	1
Improvements ✅ (secondary)	-2.4%	[-3.0%, -1.4%]	4
All ❌✅ (primary)	-1.0%	[-1.0%, -1.0%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.5%	[2.5%, 2.5%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 656.821s -> 659.49s (0.41%)

cjgillot

Great results!
IIRC, we always have orig_id.owner equal to self.hir_id.owner, so the cache could be limited to storing hir::ItemLocalId.
Do you have an idea of the sparsity of the cache? I wonder if we could get even better with a bitset instead of a hashset.

cjgillot · 2023-07-12T11:07:19Z

compiler/rustc_mir_build/src/build/mod.rs

@@ -725,6 +733,7 @@ impl<'a, 'tcx> Builder<'a, 'tcx> {
            var_indices: Default::default(),
            unit_temp: None,
            var_debug_info: vec![],
+            lint_level_roots_cache: FxHashSet::default(),


Should we initialize with self.hir_id?

No. Due to the parent_id == self.hir_id test on the second call, self.hir_id is never passed to maybe_lint_level_root_bounded :)

cjgillot · 2023-07-12T11:08:08Z

compiler/rustc_mir_build/src/build/scope.rs

+                    if parent_id == self.hir_id {
+                        parent_id // this is very common
+                    } else {


Should this be made a fast path inside maybe_lint_level_root_bounded?

id == self.hir_id is already the first check within maybe_lint_level_root_bounded. I added this pre-check here because (a) it's useful documentation, and (b) it gave a 0.4% instruction count win for deep-vector, due to avoiding the function call overhead.
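The call-site pre-check described above can be sketched in isolation. This is a toy model, not the compiler's code: `Resolver`, `resolve`, `resolve_full`, and the `full_calls` counter are hypothetical names, and `#[inline(never)]` is used only to make the "avoid the call" point explicit.

```rust
/// Toy resolver (hypothetical): `parents[i]` is the parent of node `i`,
/// and the root is its own parent.
struct Resolver {
    root: u32,
    parents: Vec<u32>,
    /// Counts how often the full walk was actually entered.
    full_calls: u32,
}

impl Resolver {
    /// Full lookup: walks parent links until it reaches `root`.
    /// Marked `inline(never)` so that every use is a real function call.
    #[inline(never)]
    fn resolve_full(&mut self, mut id: u32) -> u32 {
        self.full_calls += 1;
        while id != self.root {
            id = self.parents[id as usize];
        }
        id
    }

    /// Call-site wrapper: answers the very common `id == root` case
    /// directly and only pays for the call otherwise.
    fn resolve(&mut self, id: u32) -> u32 {
        if id == self.root {
            id // this is the common case
        } else {
            self.resolve_full(id)
        }
    }
}
```

The check is duplicated on purpose: the callee still tests for the root first, and the wrapper only exists to skip the call in the common case.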

cjgillot · 2023-07-12T11:12:46Z

compiler/rustc_mir_build/src/build/scope.rs

+            if hir.attrs(id).iter().any(|attr| Level::from_attr(attr).is_some()) {
+                // This is a rare case. It's for a node path that doesn't reach the root due to an
+                // intervening lint level attribute. This result doesn't get cached.
+                return id;


Should we eventually cache this too? If we have a whole HIR subtree that hits the same lint root, different than self.hir_id?

I originally did cache these values, using an FxHashMap<HirId, HirId>. Then I realized that 99% of the values stored were self.hir_id, which seemed wasteful. So I tried changing it to FxHashSet<HirId> and caching only those 99%; the instruction count dropped very slightly, and memory use went down too. And if we want to use a bitset (like you suggested above) then we'll need to keep this design.

nnethercote · 2023-07-12T23:31:12Z

> IIRC, we always have orig_id.owner equal to self.hir_id.owner, so the cache could be limited to storing hir::ItemLocalId. Do you have an idea of the sparsity of the cache? I wonder if we could get even better with a bitset instead of a hashset.

Oh, cool! I wondered about a bitset but I thought it wasn't possible because of HirId having two fields. I didn't realize the owner fields were always the same. The density of the cache values is really high, and I have now switched to a bitset. It was a small additional perf improvement (much smaller than the original improvement) but I'm happy to be using a more efficient data structure.
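To illustrate why a dense set of small integer ids suits a bitset, here is a hand-rolled growable bitset keyed by a plain `u32` that stands in for the local id. `GrowableBits` is a hypothetical type written for this sketch, not the compiler's own bitset, and nothing here is taken from the PR's diff.

```rust
/// Minimal growable bitset (hypothetical type, for illustration only).
/// Bit `i` is set when id `i` is in the set.
#[derive(Default)]
struct GrowableBits {
    words: Vec<u64>,
}

impl GrowableBits {
    /// Inserts `idx`, growing the backing storage if needed.
    /// Returns `true` if the id was not already present.
    fn insert(&mut self, idx: u32) -> bool {
        let (word, bit) = ((idx / 64) as usize, idx % 64);
        if word >= self.words.len() {
            self.words.resize(word + 1, 0);
        }
        let mask = 1u64 << bit;
        let was_new = self.words[word] & mask == 0;
        self.words[word] |= mask;
        was_new
    }

    /// Membership test; ids beyond the current storage are simply absent.
    fn contains(&self, idx: u32) -> bool {
        let (word, bit) = ((idx / 64) as usize, idx % 64);
        self.words.get(word).map_or(false, |w| w & (1u64 << bit) != 0)
    }
}
```

With this layout each possible id costs one bit, so ids 0 through 255 fit in four `u64` words, and lookups need no hashing. That only pays off when the ids are small and densely used, which is what the comment above reports for the cache.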

nnethercote · 2023-07-12T23:51:57Z

The new version uses a bitset instead of a hashset.

@bors try @rust-timer queue

bors · 2023-07-12T23:52:10Z

⌛ Trying commit 667d75e with merge 948981694e24fd6ad761c41a383843fbe8b5dad1...

bors · 2023-07-13T01:09:42Z

☀️ Try build successful - checks-actions
Build commit: 948981694e24fd6ad761c41a383843fbe8b5dad1

rust-timer · 2023-07-13T03:28:15Z

Finished benchmarking commit (948981694e24fd6ad761c41a383843fbe8b5dad1): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.2%	[-3.5%, -0.3%]	57
Improvements ✅ (secondary)	-1.9%	[-6.0%, -0.3%]	35
All ❌✅ (primary)	-1.2%	[-3.5%, -0.3%]	57

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.6%	[3.6%, 3.6%]	1
Regressions ❌ (secondary)	3.4%	[1.3%, 5.2%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.7%	[-3.7%, -3.7%]	1
All ❌✅ (primary)	3.6%	[3.6%, 3.6%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.5%	[-5.3%, -1.2%]	8
Improvements ✅ (secondary)	-3.7%	[-4.6%, -2.5%]	5
All ❌✅ (primary)	-3.5%	[-5.3%, -1.2%]	8

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 658.224s -> 659.878s (0.25%)

nnethercote · 2023-07-13T04:04:51Z

Yay, the new results with the bitset are clearly better: instruction counts improve, and even better, cycles and wall times are seeing some genuine, clear wins.

cjgillot · 2023-07-13T09:38:42Z

Thanks!
@bors r+

bors · 2023-07-13T09:38:44Z

📌 Commit 667d75e has been approved by cjgillot

It is now in the queue for this repository.

bors · 2023-07-14T05:30:57Z

⌛ Testing commit 667d75e with merge fe03b46...

bors · 2023-07-14T07:19:51Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing fe03b46 to master...

rust-timer · 2023-07-14T09:46:34Z

Finished benchmarking commit (fe03b46): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.4%	[-3.4%, -0.5%]	29
Improvements ✅ (secondary)	-1.9%	[-5.9%, -0.2%]	33
All ❌✅ (primary)	-1.4%	[-3.4%, -0.5%]	29

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.1%	[1.3%, 4.8%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.7%	[-5.5%, -2.6%]	6
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 656.192s -> 658.307s (0.32%)

nnethercote added 2 commits July 12, 2023 09:16

Shorten some overlong comment lines.

3645810

It's annoying that these wrap in a 100-char terminal window.

Move maybe_lint_level_root_bounded.

f234dc3

From `TyCtxt` to the MIR `Builder`. This will allow us to add a cache to `Builder` and use it from `maybe_lint_level_root_bounded`.

rustbot assigned cjgillot Jul 12, 2023

rustbot added the S-waiting-on-review and T-compiler labels Jul 12, 2023

rustbot added the S-waiting-on-perf label Jul 12, 2023

rustbot removed the S-waiting-on-perf label Jul 12, 2023

cjgillot reviewed Jul 12, 2023

Add a cache for maybe_lint_level_root_bounded.

667d75e

It's a nice speed win.

nnethercote force-pushed the maybe_lint_level_root_bounded-cache branch from 812a5ee to 667d75e Compare July 12, 2023 23:50

rustbot added the S-waiting-on-perf label Jul 12, 2023

rustbot removed the S-waiting-on-perf label Jul 13, 2023

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Jul 13, 2023

bors added the merged-by-bors label Jul 14, 2023

bors merged commit fe03b46 into rust-lang:master Jul 14, 2023

rustbot added this to the 1.73.0 milestone Jul 14, 2023

nnethercote deleted the maybe_lint_level_root_bounded-cache branch July 14, 2023 09:56
