Add a short-circuiting path to slice comparison #113576

Closed
wants to merge 1 commit into from

Conversation

krtab
Contributor

@krtab krtab commented Jul 11, 2023

This adds a short-circuiting path to slice comparison using the fact that two slices of the same length and pointing at the same address are equal.

Ideally this would be implemented for both:

  • comparisons where A and B can be compared byte-wise (done)
  • comparisons where A and B are the same type (cannot be done currently due to a limitation in trait specialization). Edit: this is false; see floats, for example.
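
As a rough illustration of the idea (not the PR's actual diff, which is not shown on this page), the fast path for byte-comparable elements could look like the sketch below. The function name `eq_with_fast_path` is hypothetical. The last assertion shows one likely reading of the float remark: since `NaN != NaN`, a float slice containing NaN is not equal to itself, so an address check alone would be wrong for such types.

```rust
/// Hypothetical helper for illustration only, not the code from this PR.
/// Compares two byte slices, returning early when both refer to the same memory.
fn eq_with_fast_path(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Same start address and same length: both slices cover the same bytes.
    if a.as_ptr() == b.as_ptr() {
        return true;
    }
    // Otherwise fall back to the regular element-wise comparison.
    a == b
}

fn main() {
    let data = vec![1u8, 2, 3, 4];
    assert!(eq_with_fast_path(&data, &data)); // same address, same length
    assert!(eq_with_fast_path(&data[..2], &[1, 2])); // different address, same content
    assert!(!eq_with_fast_path(&data[..2], &data[2..])); // same length, different content

    // For floats, identity does not imply equality: NaN is never equal to NaN.
    let floats = [1.0f32, f32::NAN];
    let (x, y) = (&floats[..], &floats[..]);
    assert!(x != y);
}
```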

@rustbot
Collaborator

rustbot commented Jul 11, 2023

r? @m-ou-se

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 11, 2023
@krtab
Contributor Author

krtab commented Jul 11, 2023

Given that this PR is made in the hope of improving performance, I guess performance benchmarks would be appreciated, but I don't think I can start them myself.

@workingjubilee
Member

@bors try
@rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 11, 2023
@bors
Collaborator

bors commented Jul 11, 2023

⌛ Trying commit 1616ffd67536e6201f8cda5fce4da6441fa750d6 with merge 8fcfeecab188ad3fc76fa0e22b6015e3007d8caa...

@bors
Collaborator

bors commented Jul 11, 2023

☀️ Try build successful - checks-actions
Build commit: 8fcfeecab188ad3fc76fa0e22b6015e3007d8caa

@rust-timer
Collaborator

Finished benchmarking commit (8fcfeecab188ad3fc76fa0e22b6015e3007d8caa): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.8% | [0.4%, 1.1%] | 6 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.6% | [-0.6%, -0.6%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.6% | [-0.6%, 1.1%] | 7 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.8% | [3.8%, 3.8%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -3.0% | [-3.0%, -3.0%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [-3.0%, 3.8%] | 2 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.1% | [0.9%, 1.2%] | 4 |
| Regressions ❌ (secondary) | 2.3% | [1.0%, 3.5%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.1% | [0.9%, 1.2%] | 4 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.2%] | 16 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.1%, 0.2%] | 17 |

Bootstrap: 656.645s -> 656.898s (0.04%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 12, 2023
@workingjubilee
Member

It's counterintuitive as hell, but I can't say I'm that surprised.

"You're comparing something with itself, that's gonna be true" is an easy evaluation for a compiler, but harder to optimize out for the "you're not comparing something with itself" case (since then the problem is usually greater uncertainty overall), so it's dead instructions on all the paths that actually need to run the full memcmp.
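
To make the cost argument concrete, here is a small sketch under stated assumptions: `Stats` and `eq_counted` are hypothetical names invented for this illustration, and the workload is a toy one. It counts how often the address check fires versus how often the full comparison still has to run. When the inputs live in separate allocations, the check is paid on every call and never pays off.

```rust
/// Hypothetical counters, used only for this illustration.
#[derive(Default, Debug)]
struct Stats {
    fast_path_hits: usize,
    full_compares: usize,
}

/// Hypothetical instrumented comparison; not code from this PR.
fn eq_counted(a: &[u8], b: &[u8], stats: &mut Stats) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // The address check runs on every call with equal lengths...
    if a.as_ptr() == b.as_ptr() {
        stats.fast_path_hits += 1;
        return true;
    }
    // ...but when the addresses differ, the full comparison is still needed.
    stats.full_compares += 1;
    a == b
}

fn main() {
    // Three separately allocated buffers; two of them hold equal contents.
    let words: Vec<Vec<u8>> = vec![b"alpha".to_vec(), b"gamma".to_vec(), b"alpha".to_vec()];
    let mut stats = Stats::default();
    let mut equal_pairs = 0;
    for i in 0..words.len() {
        for j in (i + 1)..words.len() {
            if eq_counted(&words[i], &words[j], &mut stats) {
                equal_pairs += 1;
            }
        }
    }
    println!("{equal_pairs} equal pair(s), {stats:?}");
}
```

In this toy workload every pair goes through the full comparison, which mirrors the point above: equal contents usually do not mean equal addresses.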

@m-ou-se m-ou-se added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 12, 2023
@krtab krtab force-pushed the addr_slice_comp branch from 1616ffd to b0ca5ff Compare July 12, 2023 14:34
@krtab
Contributor Author

krtab commented Jul 12, 2023

Yeah, this is indeed not so surprising, as comparing the exact same slice is pretty rare.

I've force-pushed a new version that generates branch-less assembly using conditional moves, which may be better, but I have little hope.
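
The force-pushed code is not shown on this page, so the following is only a guess at what a branch-less variant might look like. The assumption here is that the address check selects how many bytes to compare (zero when the slices alias) instead of branching around the comparison. The name `eq_select_len` is hypothetical, and whether the compiler actually lowers the selection to a conditional move depends on the target and optimization level.

```rust
/// Hypothetical sketch of a "select the length" formulation; not the PR's code.
fn eq_select_len(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let same_address = a.as_ptr() == b.as_ptr();
    // Pick the number of bytes to compare: nothing if both slices alias.
    // An optimizer may turn this into a conditional move, but that is not guaranteed.
    let n = if same_address { 0 } else { a.len() };
    // Comparing two empty prefixes is trivially true.
    a[..n] == b[..n]
}

fn main() {
    let v = vec![1u8, 2, 3];
    let w = vec![1u8, 2, 3];
    assert!(eq_select_len(&v, &v)); // aliasing slices: zero bytes compared
    assert!(eq_select_len(&v, &w)); // equal contents, different allocations
    assert!(!eq_select_len(&v, &[1, 2, 4])); // different contents
}
```

Note that the comparison call is still made on every path in this formulation, which fits the author's own low expectations for the change.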

@krtab
Contributor Author

krtab commented Jul 12, 2023

I'll try to see if it lets me do it myself; otherwise I'll need you to do it :)

@bors try
@rust-timer queue

edit: It did not.

@bors
Collaborator

bors commented Jul 12, 2023

@krtab: 🔑 Insufficient privileges: not in try users

@lqd
Member

lqd commented Jul 12, 2023

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 12, 2023
@bors
Collaborator

bors commented Jul 12, 2023

⌛ Trying commit b0ca5ff with merge ef9db8293d0b16a1e6a3d43fbee50ecbeb0b4290...

@bors
Collaborator

bors commented Jul 12, 2023

☀️ Try build successful - checks-actions
Build commit: ef9db8293d0b16a1e6a3d43fbee50ecbeb0b4290

@rust-timer
Collaborator

Finished benchmarking commit (ef9db8293d0b16a1e6a3d43fbee50ecbeb0b4290): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.4%, 0.6%] | 8 |
| Regressions ❌ (secondary) | 0.6% | [0.5%, 0.6%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.5% | [0.4%, 0.6%] | 8 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.2% | [4.2%, 4.2%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.9% | [0.9%, 0.9%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.9% | [0.9%, 0.9%] | 1 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.1%, 0.3%] | 16 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.1%, 0.3%] | 17 |

Bootstrap: 657.774s -> 658.13s (0.05%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 12, 2023
@krtab
Contributor Author

krtab commented Jul 13, 2023

Well, I guess that's all I could try. I'm not sure how representative of "the typical workload" these benchmarks are, but the results are not that surprising either, so feel free to close this if you don't think it is mergeable as is, @workingjubilee. Thanks for the review.

@workingjubilee
Member

rustc's workload is heavily dominated by "branchy" code and it does exercise a lot of Vecs, but it doesn't do as much work on slice comparison I think. On the other hand it doesn't not: this is not like a floating-point benchmark where the compiler is completely useless. So while we may want to reexamine this question when we get a better perf suite which has more actual programs to test, instead of just the compiler's own benchmarks, it's a pretty bad sign without further motivation. So yeah, closing.

Labels
perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.

7 participants