Implement more methods for `vec_deque::IntoIter` #106241

Sp00ph · 2022-12-29T01:48:56Z

This implements a couple of Iterator methods on vec_deque::IntoIter ((try_)fold, (try_)rfold, advance_(back_)by, next_chunk, count and last) so that they can be more efficient than their default implementations. Many other Iterator methods that use these under the hood can then take advantage of the manual implementations too. vec::IntoIter has similar implementations for many of these methods. This PR does not yet implement TrustedRandomAccess and friends, as I'm not very familiar with the required safety guarantees.
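To illustrate why a manual fold can beat the default one: a VecDeque keeps its elements in a ring buffer, which the public `VecDeque::as_slices` API exposes as at most two contiguous slices. The sketch below folds over those two slices in turn. The names `fold_two_slices` and `sample_deque` are hypothetical, and this is not the code from this PR, only a rough picture of the idea.

```rust
use std::collections::VecDeque;

/// Hypothetical helper (not from this PR): folds over a deque by walking its
/// two contiguous halves with slice iterators instead of calling `next()`
/// once per element.
fn fold_two_slices<T: Copy, B>(
    deque: &VecDeque<T>,
    init: B,
    mut f: impl FnMut(B, T) -> B,
) -> B {
    // `as_slices` returns the front and back halves of the ring buffer,
    // in iteration order.
    let (front, back) = deque.as_slices();
    let acc = front.iter().copied().fold(init, &mut f);
    back.iter().copied().fold(acc, &mut f)
}

/// Builds a small deque that has seen pops and pushes, so its contents may
/// be split across both halves of the ring buffer. Contents: 3, 4, 5.
fn sample_deque() -> VecDeque<u32> {
    let mut deque = VecDeque::with_capacity(4);
    deque.extend([1, 2, 3]);
    deque.pop_front();
    deque.pop_front();
    deque.extend([4, 5]);
    deque
}
```

Calling `fold_two_slices(&sample_deque(), 0, |acc, x| acc + x)` visits the elements in the same order as the deque's own iterator, whether or not the buffer has wrapped.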

r? @the8472 (since you also took over my last PR)

rustbot · 2022-12-29T01:49:04Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

Stabilizing library features
Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
Changing public documentation in ways that create new stability guarantees
Changing observable runtime behavior of library APIs

the8472

Have you benchmarked whether they're better than the default impls? If yes please post before/after results.
Vec::IntoIter overrides those methods because it often results in vectorizable code while the default impls don't.
E.g. I suspect next_chunk might benefit less than Vec does because it'll see non-const-length copies even on the Ok branch while Vec only does that for Err.
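The concern about non-const-length copies can be sketched roughly as follows. Assuming the deque's contents are available as two slices, filling a chunk of `N` elements needs a split point that depends on the run-time length of the front slice, even when enough elements are available. `chunk_from_slices` is a hypothetical helper over `u32`, written for illustration only; it is not the PR's implementation.

```rust
/// Hypothetical helper: copies the first `N` elements of the logical
/// sequence `front ++ back` into an array, or returns `None` if fewer than
/// `N` elements are available.
fn chunk_from_slices<const N: usize>(front: &[u32], back: &[u32]) -> Option<[u32; N]> {
    if front.len() + back.len() < N {
        return None;
    }
    let mut out = [0u32; N];
    // The split point is only known at run time, so neither of the two
    // copies below has a compile-time constant length.
    let from_front = front.len().min(N);
    out[..from_front].copy_from_slice(&front[..from_front]);
    out[from_front..].copy_from_slice(&back[..N - from_front]);
    Some(out)
}
```

With a single contiguous slice, as in Vec, the success path is one copy of exactly `N` elements, which is the difference the comment above points at.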

library/alloc/src/collections/vec_deque/into_iter.rs

Sp00ph · 2023-01-03T12:11:31Z

> Have you benchmarked whether they're better than the default impls? If yes please post before/after results.
> Vec::IntoIter overrides those methods because it often results in vectorizable code while the default impls don't.
> E.g. I suspect next_chunk might benefit less than Vec does because it'll see non-const-length copies even on the Ok branch while Vec only does that for Err.

I don't think there are currently any benchmarks for vec_deque::IntoIter. I can try to write a few of those later.

the8472 · 2023-01-14T11:37:01Z

> I can try to write a few of those later.

Yes, please do. If there aren't significant wins we might as well stick with the defaults.

@rustbot author

Sp00ph · 2023-01-17T20:13:39Z

@the8472 I added some benchmarks for vec_deque::IntoIter. These are the numbers:

master:

test vec_deque::bench_into_iter                          ... bench:       2,615 ns/iter (+/- 320)
test vec_deque::bench_into_iter_fold                     ... bench:       1,756 ns/iter (+/- 80)
test vec_deque::bench_into_iter_next_chunk               ... bench:       2,458 ns/iter (+/- 2,441)
test vec_deque::bench_into_iter_try_fold                 ... bench:       1,433 ns/iter (+/- 992)

This PR:

test vec_deque::bench_into_iter                          ... bench:       2,560 ns/iter (+/- 161)
test vec_deque::bench_into_iter_fold                     ... bench:       1,751 ns/iter (+/- 45)
test vec_deque::bench_into_iter_next_chunk               ... bench:         546 ns/iter (+/- 17)
test vec_deque::bench_into_iter_try_fold                 ... bench:         640 ns/iter (+/- 30)

Seems like manually implementing fold didn't do that much in this case, but next_chunk and try_fold are significant.

Sp00ph · 2023-01-17T20:14:23Z

@rustbot ready

the8472 · 2023-01-17T20:33:14Z

Hrrm, I don't think this makes sense. fold shouldn't be slower than try_fold. The issue is that fold takes ownership of the iterator and drops it, so you can't reuse the allocation when benchmarking IntoIter; you can only do that with borrowing iterators. So it must be using the impl Iterator for &mut Iterator indirection, which doesn't optimize fold.
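The indirection described here can be observed with a small probe. Because `fold` consumes `self`, an iterator reached through `&mut I` (for example via `by_ref()`) cannot hand the call over to `I::fold`, so an overridden `fold` is bypassed. `Probe` is a hypothetical type invented for this sketch; it only records whether its own `fold` override ran.

```rust
use std::cell::Cell;

/// Hypothetical iterator that counts down from `remaining` and records
/// whether its own `fold` override was called.
struct Probe<'a> {
    remaining: u32,
    fold_override_used: &'a Cell<bool>,
}

impl Iterator for Probe<'_> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining)
        }
    }

    // Stand-in for an "optimized" fold; here it only sets the flag and then
    // loops over `next()`.
    fn fold<B, F>(mut self, init: B, mut f: F) -> B
    where
        F: FnMut(B, Self::Item) -> B,
    {
        self.fold_override_used.set(true);
        let mut acc = init;
        while let Some(x) = self.next() {
            acc = f(acc, x);
        }
        acc
    }
}
```

Calling `probe.by_ref().fold(..)` leaves the flag unset, while calling `probe.fold(..)` on the owned iterator sets it. A benchmark that borrows the iterator in order to reuse it therefore does not measure the overridden `fold`.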

In that case we just have to bite the bullet and measure the allocation costs together with the iterator performance.

Sp00ph · 2023-01-17T20:43:32Z

Right, that makes sense. Makes me think, would it maybe make sense to implement the default <&mut Iterator>::fold in terms of Iterator::try_fold? That way it couldn't take advantage of all fold optimizations (try_fold is still unstable), but it would at least work for most stdlib iterators that override try_fold.
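The idea of expressing a fold through `try_fold` can be sketched on stable Rust as a free function. Calling `try_fold` is stable; `Result<B, Infallible>` is used so that the closure can never short-circuit. `fold_via_try_fold` is a hypothetical name, and this is only an illustration of the approach, not how the standard library implements it.

```rust
use std::convert::Infallible;

/// Hypothetical helper: folds a borrowed iterator by delegating to its
/// `try_fold`, so iterators that override `try_fold` get their fast path.
fn fold_via_try_fold<I, B, F>(iter: &mut I, init: B, mut f: F) -> B
where
    I: Iterator,
    F: FnMut(B, I::Item) -> B,
{
    // Every step returns `Ok`, so the fold always runs to completion.
    let result: Result<B, Infallible> = iter.try_fold(init, |acc, item| Ok(f(acc, item)));
    match result {
        Ok(acc) => acc,
        // `Infallible` has no values, so this arm can never be reached.
        Err(never) => match never {},
    }
}
```

For example, `fold_via_try_fold(&mut vec![1u32, 2, 3].into_iter(), 0, |a, x| a + x)` sums the elements and leaves the iterator exhausted.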

the8472 · 2023-01-17T20:51:20Z

There have been several attempts to do that. In fact there currently is an open PR to try that again.

Sp00ph · 2023-01-17T23:34:37Z

I changed the fold benchmark to use a new deque on every iteration. Now the numbers look like this:
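A rough sketch of the benchmark shape described here: the deque is rebuilt on every iteration, so the allocation cost is measured together with the fold. The benchmark code added in this PR is not shown in the thread; the function names and the `Instant`-based timing below are assumptions made for illustration.

```rust
use std::collections::VecDeque;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical benchmark body: builds a fresh deque, then consumes it with
/// `fold` on the owned `IntoIter`.
fn fold_fresh_deque(len: u32) -> u32 {
    let deque: VecDeque<u32> = (0..len).collect();
    deque.into_iter().fold(0u32, |acc, x| acc.wrapping_add(x))
}

/// Hypothetical timing loop: runs the body `iterations` times and returns
/// the total elapsed time. `black_box` keeps the compiler from optimizing
/// the work away.
fn time_fold(iterations: u32, len: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(fold_fresh_deque(black_box(len)));
    }
    start.elapsed()
}
```

Since construction and deallocation are inside the timed region, the absolute numbers include allocator overhead; only the difference between two builds of the standard library is meaningful.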

master:

test vec_deque::bench_into_iter_fold                     ... bench:       1,979 ns/iter (+/- 160)

This PR:

test vec_deque::bench_into_iter_fold                     ... bench:         395 ns/iter (+/- 16)

Sp00ph · 2023-02-05T01:17:59Z

@the8472 friendly reminder :)

the8472 · 2023-02-18T18:05:36Z

@bors r+ rollup=never

bors · 2023-02-18T18:05:38Z

📌 Commit ccba6c5 has been approved by the8472

It is now in the queue for this repository.

bors · 2023-02-18T20:12:57Z

⌛ Testing commit ccba6c5 with merge 4507fda...

bors · 2023-02-18T23:31:06Z

☀️ Test successful - checks-actions
Approved by: the8472
Pushing 4507fda to master...

rust-timer · 2023-02-19T00:37:25Z

Finished benchmarking commit (4507fda): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.1% | [-1.1%, -1.1%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.1% | [3.1%, 3.1%] | 1 |
| Regressions ❌ (secondary) | 2.5% | [2.5%, 2.5%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.1% | [-1.1%, -1.1%] | 1 |
| All ❌✅ (primary) | 3.1% | [3.1%, 3.1%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

rustbot assigned the8472 Dec 29, 2022

rustbot added the S-waiting-on-review and T-libs labels Dec 29, 2022

Implement more methods for vec_deque::IntoIter

7c13b14

Sp00ph force-pushed the vec_deque_iter_methods branch from 1d7c644 to 7c13b14 Compare December 29, 2022 02:00

the8472 reviewed Jan 3, 2023

rustbot added S-waiting-on-author and removed S-waiting-on-review labels Jan 14, 2023

rustbot added S-waiting-on-review and removed S-waiting-on-author labels Jan 17, 2023

Add vec_deque::IntoIter benchmarks

ccba6c5

Sp00ph force-pushed the vec_deque_iter_methods branch from 461d0f7 to ccba6c5 Compare January 17, 2023 23:34

bors added S-waiting-on-bors and removed S-waiting-on-review labels Feb 18, 2023

bors added the merged-by-bors label Feb 18, 2023

bors merged commit 4507fda into rust-lang:master Feb 18, 2023

rustbot added this to the 1.69.0 milestone Feb 18, 2023
