Improve intrinsic handling in cg_ssa (part 2) #141760

bjorn3 · 2025-05-30T10:08:06Z

* Avoid computing function type and signature for intrinsics where possible
* Nicer handling of bool returning intrinsics

Follow up to #141404

rustbot · 2025-05-30T10:08:11Z

rustbot has assigned @eholk.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-05-30T10:08:13Z

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

eholk · 2025-05-31T01:48:07Z

This seems okay but I don't know this part of the compiler well so someone else should probably review it.

@bors r? compiler

Nadrieril · 2025-05-31T10:22:09Z

Same same

r? compiler

fee1-dead · 2025-05-31T13:15:39Z

@bors try @rust-timer queue

Improve intrinsic handling in cg_ssa (part 2)

* Avoid computing function type and signature for intrinsics where possible
* Nicer handling of bool returning intrinsics

Follow up to #141404

bors · 2025-05-31T13:16:50Z

⌛ Trying commit 284bec5 with merge 97a633f...

bors · 2025-05-31T15:38:49Z

☀️ Try build successful - checks-actions
Build commit: 97a633f (97a633f1cd554a63d2f7fe65fa6a31d52bf6345d)

rust-timer · 2025-05-31T18:48:41Z

Finished benchmarking commit (97a633f): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.5%	[-0.5%, -0.5%]	1
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary -4.2%, secondary -2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.0%	[0.8%, 1.3%]	2
Improvements ✅ (primary)	-4.2%	[-4.3%, -4.2%]	4
Improvements ✅ (secondary)	-2.5%	[-4.4%, -0.5%]	51
All ❌✅ (primary)	-4.2%	[-4.3%, -4.2%]	4

Cycles

Results (primary 2.1%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.1%	[2.1%, 2.1%]	1
Regressions ❌ (secondary)	0.8%	[0.4%, 1.4%]	10
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.7%	[-1.2%, -0.4%]	8
All ❌✅ (primary)	2.1%	[2.1%, 2.1%]	1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 777.196s -> 776.173s (-0.13%)
Artifact size: 370.47 MiB -> 370.46 MiB (-0.00%)

bjorn3 · 2025-06-01T09:09:01Z

@bors r=fee1-dead

bors · 2025-06-01T09:09:03Z

📌 Commit 284bec5 has been approved by fee1-dead

It is now in the queue for this repository.

bors · 2025-06-02T00:57:48Z

⌛ Testing commit 284bec5 with merge 2fc3dee...

bors · 2025-06-02T04:17:08Z

☀️ Test successful - checks-actions
Approved by: fee1-dead
Pushing 2fc3dee to master...

github-actions · 2025-06-02T04:20:24Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 99e7c15 (parent) -> 2fc3dee (this PR)

Test differences

No test diffs found

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 2fc3deed9fcb8762ad57191e0195f06f7543e4a5 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

x86_64-apple-2: 4449.9s -> 6503.0s (46.1%)
x86_64-apple-1: 6953.5s -> 9026.9s (29.8%)
aarch64-apple: 4157.9s -> 5302.8s (27.5%)
dist-apple-various: 5854.3s -> 6971.0s (19.1%)
dist-x86_64-apple: 8488.2s -> 9928.6s (17.0%)
i686-gnu-1: 8440.0s -> 9005.3s (6.7%)
x86_64-msvc-ext1: 7585.2s -> 7158.1s (-5.6%)
x86_64-gnu-llvm-20-1: 4032.8s -> 3806.1s (-5.6%)
x86_64-gnu-llvm-19-3: 7251.9s -> 7655.9s (5.6%)
x86_64-gnu-llvm-19-2: 6284.5s -> 5980.2s (-4.8%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.
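
The percentages in the job duration list are consistent with a plain relative change, (after - before) / before. The sketch below is only a way to check that arithmetic against two rows copied from the list; the function name `percent_change` is made up for illustration and is not part of citool.

```rust
/// Relative change from `before` to `after`, in percent.
/// Hypothetical helper for illustration; not taken from citool.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Durations in seconds, copied from the job duration list above.
    let rows = [
        ("x86_64-apple-2", 4449.9, 6503.0),
        ("x86_64-msvc-ext1", 7585.2, 7158.1),
    ];
    for (job, before, after) in rows {
        let change = percent_change(before, after);
        println!("{job}: {before:.1}s -> {after:.1}s ({change:.1}%)");
    }
}
```

A positive value means the job got slower, a negative value means it got faster. As the note above says, swings of this size can come from runner noise alone.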

rust-timer · 2025-06-02T05:48:30Z

Finished benchmarking commit (2fc3dee): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

If the regression was expected or you think it can be justified,
please write a comment with sufficient written justification, and add
@rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
If you think that you know of a way to resolve the regression, try to create
a new PR with a fix for the regression.
If you do not understand the regression or you think that it is just noise,
you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	2.9%	[2.9%, 2.9%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.3%	[-0.3%, -0.3%]	1
All ❌✅ (primary)	2.9%	[2.9%, 2.9%]	1

Max RSS (memory usage)

Results (secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.9%	[0.5%, 2.0%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.7%	[-1.3%, -0.4%]	6
All ❌✅ (primary)	-	-	0

Cycles

Results (primary 2.0%, secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.0%	[0.6%, 3.4%]	2
Regressions ❌ (secondary)	0.8%	[0.5%, 1.5%]	10
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.9%	[-1.8%, -0.5%]	10
All ❌✅ (primary)	2.0%	[0.6%, 3.4%]	2

Binary size

Results (primary 1.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.1%	[1.1%, 1.1%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.1%	[1.1%, 1.1%]	1

Bootstrap: 774.28s -> 775.84s (0.20%)
Artifact size: 372.24 MiB -> 372.26 MiB (0.00%)

rustbot assigned eholk May 30, 2025

bjorn3 added 4 commits May 30, 2025 10:12

Use layout field of OperandRef and PlaceRef in codegen_intrinsic_call

1f717ae

This avoids having to get the function signature.
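
As a rough illustration of the idea in this commit, here is a toy model, not rustc code. Every type and function in it is a hypothetical stand-in; it only assumes that an operand and a destination place each carry a `layout` field, so the argument and result layouts can be read from them directly instead of being derived from a separately computed signature.

```rust
// Toy model: every type here is a hypothetical stand-in, not rustc's real definition.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Layout {
    size_bytes: u64,
}

/// A value passed to the intrinsic; it carries its own layout.
struct OperandRef {
    layout: Layout,
}

/// The destination the intrinsic writes to; it also carries its own layout.
struct PlaceRef {
    layout: Layout,
}

/// Stand-in for a function signature that would have to be computed first.
struct FnSig {
    inputs: Vec<Layout>,
    output: Layout,
}

/// "Before" shape: take the first argument's layout and the return layout from a signature.
fn layouts_via_sig(sig: &FnSig) -> (Layout, Layout) {
    (sig.inputs[0], sig.output)
}

/// "After" shape: read the same layouts straight off the operand and the place.
fn layouts_via_fields(arg: &OperandRef, result: &PlaceRef) -> (Layout, Layout) {
    (arg.layout, result.layout)
}

fn main() {
    let u32_layout = Layout { size_bytes: 4 };
    let bool_layout = Layout { size_bytes: 1 };

    let sig = FnSig { inputs: vec![u32_layout], output: bool_layout };
    let arg = OperandRef { layout: u32_layout };
    let result = PlaceRef { layout: bool_layout };

    // Both routes yield the same information; the second needs no signature.
    assert_eq!(layouts_via_sig(&sig), layouts_via_fields(&arg, &result));
}
```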

Use layout field of OperandRef in generic_simd_intrinsic

38a6dae

Avoid computing function type for intrinsic instances

0fcea3d

Directly use from_immediate for handling bool

284bec5

bjorn3 force-pushed the intrinsic_rework_part2 branch from 2342362 to 284bec5 Compare May 30, 2025 10:13

rustbot assigned Nadrieril and unassigned eholk May 31, 2025

rustbot assigned fee1-dead and unassigned Nadrieril May 31, 2025

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 31, 2025

fee1-dead approved these changes May 31, 2025

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 31, 2025

bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 1, 2025

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jun 1, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 2, 2025

bors merged commit 2fc3dee into rust-lang:master Jun 2, 2025
10 checks passed

rustbot added this to the 1.89.0 milestone Jun 2, 2025

rustbot added the perf-regression Performance regression. label Jun 2, 2025

bjorn3 deleted the intrinsic_rework_part2 branch June 2, 2025 08:09

Mark-Simulacrum removed the perf-regression Performance regression. label Jun 2, 2025
