Reserve before write_fmt for owned buffers #137762

Draft
wants to merge 1 commit into
base: master
from

Conversation

thaliaarchi
Contributor

@thaliaarchi thaliaarchi commented Feb 28, 2025

`fmt::Arguments::estimated_capacity()` is currently only used by `fmt::format()` to reserve the initial string capacity. Also use it in `fmt::Write::write_fmt` for `String` and `OsString`, and in `io::Write::write_fmt` for `Vec<u8>`, `VecDeque<u8>`, `Cursor<&mut Vec<u8>>`, and `Cursor<Vec<u8>>`.

This may be worth a perf check.
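
For illustration, a minimal sketch of the `io::Write` side of this idea on stable Rust. `estimated_capacity()` is not callable from user code, so the `ReservingVec` newtype, the `rough_estimate` helper, and its fallback guess of 32 bytes are all hypothetical stand-ins, not what this PR implements.

```rust
use std::fmt;
use std::io::{self, Write};

/// Hypothetical newtype over `Vec<u8>`; the PR changes the impl on `Vec<u8>` itself.
pub struct ReservingVec {
    pub buf: Vec<u8>,
}

/// Stand-in for the internal `estimated_capacity()`: exact for literal-only
/// format strings, otherwise an arbitrary guess chosen for this sketch.
fn rough_estimate(args: &fmt::Arguments<'_>) -> usize {
    match args.as_str() {
        Some(literal) => literal.len(),
        None => 32,
    }
}

impl Write for ReservingVec {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.buf.extend_from_slice(data);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }

    fn write_fmt(&mut self, args: fmt::Arguments<'_>) -> io::Result<()> {
        // Reserve once up front, then let the inner Vec do the formatting.
        self.buf.reserve(rough_estimate(&args));
        self.buf.write_fmt(args)
    }
}

/// Small usage helper: formats a greeting through the overridden `write_fmt`.
pub fn render_greeting(name: &str) -> Vec<u8> {
    let mut out = ReservingVec { buf: Vec::new() };
    write!(out, "hello, {}!", name).expect("writing to a Vec cannot fail");
    out.buf
}
```

The `write!` macro expands to a `write_fmt` call, so the reservation happens before any piece of the format string is appended.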

@rustbot
Collaborator

rustbot commented Feb 28, 2025

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 28, 2025
@thaliaarchi
Contributor Author

r? @workingjubilee

@rustbot rustbot assigned workingjubilee and unassigned tgross35 Feb 28, 2025
@workingjubilee
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 28, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 28, 2025
…e-fmt, r=<try>

Reserve before `write_fmt`

`fmt::Arguments::estimated_capacity()` is currently only used for `fmt::format()` to reserve the initial string capacity. Also use it for `impl Write` for `Vec<u8>`, `VecDeque<u8>`, `Cursor<&mut Vec<u8>>`, and `Cursor<Vec<u8>>`.

This may be worth checking perf.
@bors
Collaborator

bors commented Feb 28, 2025

⌛ Trying commit 0e82180 with merge 9cdfd5d...

@bors
Collaborator

bors commented Feb 28, 2025

☀️ Try build successful - checks-actions
Build commit: 9cdfd5d (9cdfd5df41f0c63cf60717083c177a4af5071687)

@rust-timer

This comment has been minimized.

@rust-timer
Collaborator

Finished benchmarking commit (9cdfd5d): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 10.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 10.1% | [10.1%, 10.1%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 10.1% | [10.1%, 10.1%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary 0.0%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.2%] | 10 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.2%] | 37 |
| Improvements ✅ (primary) | -0.1% | [-0.2%, -0.0%] | 7 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.2%, 0.2%] | 17 |

Bootstrap: 770.19s -> 771.304s (0.14%)
Artifact size: 362.00 MiB -> 361.96 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 28, 2025
@thaliaarchi
Contributor Author

thaliaarchi commented Feb 28, 2025

No regressions, so are we good?

Also, it probably shouldn't be rollup=never.

@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from 0e82180 to d8b6fe8 Compare February 28, 2025 09:14
@thaliaarchi
Contributor Author

I had only considered the io::Write::write_fmt side of things, but it would also be useful to extend this to fmt::Write::write_fmt for String and OsString. I've pushed a commit for these.
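
As a rough sketch of what the `fmt::Write` side could look like from user code: the `ReservingString` wrapper and its `hint` field are hypothetical, and `Arguments::as_str()` stands in for the internal `estimated_capacity()`, which it only matches for literal-only format strings.

```rust
use std::fmt::{self, Write};

/// Hypothetical newtype over `String`; the PR changes the impl on `String` itself.
pub struct ReservingString {
    pub buf: String,
    /// Caller-supplied guess used when the format string has runtime arguments.
    pub hint: usize,
}

impl Write for ReservingString {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.buf.push_str(s);
        Ok(())
    }

    fn write_fmt(&mut self, args: fmt::Arguments<'_>) -> fmt::Result {
        // Exact length for literal-only format strings, otherwise the hint.
        let estimate = args.as_str().map_or(self.hint, |s| s.len());
        self.buf.reserve(estimate);
        // `fmt::write` drives the formatting and calls back into `write_str`.
        fmt::write(self, args)
    }
}

/// Small usage helper: formats one record through the overridden `write_fmt`.
pub fn describe(id: u32, label: &str) -> String {
    let mut out = ReservingString { buf: String::new(), hint: 24 };
    write!(out, "item {} ({})", id, label).expect("writing to a String cannot fail");
    out.buf
}
```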

@thaliaarchi
Contributor Author

I think the fmt::Write side is more likely to have an impact. Let’s see.

@bors try @rust-timer queue

@bors
Collaborator

bors commented Feb 28, 2025

@thaliaarchi: 🔑 Insufficient privileges: not in try users

@rust-timer

This comment has been minimized.

@Noratrieb
Member

rustdoc uses a lot of formatting, so this might have a positive impact there
@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 28, 2025
@bors
Collaborator

bors commented Feb 28, 2025

⌛ Trying commit d8b6fe8 with merge ba1e464...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 28, 2025
…e-fmt, r=<try>

Reserve before `write_fmt`

`fmt::Arguments::estimated_capacity()` is currently only used by `fmt::format()` to reserve the initial string capacity. Also use it for `impl Write` for `Vec<u8>`, `VecDeque<u8>`, `Cursor<&mut Vec<u8>>`, and `Cursor<Vec<u8>>`.

This may be worth checking perf.
@bors
Collaborator

bors commented Feb 28, 2025

☀️ Try build successful - checks-actions
Build commit: ba1e464 (ba1e46459f988e4c6b39e6b34e656327b9e9fcd9)

@thaliaarchi thaliaarchi marked this pull request as ready for review March 22, 2025 00:18
Reserve before formatting with `fmt::Arguments::estimated_capacity()` in
`fmt::Write::write_fmt` and `io::Write::write_fmt` implementations for
owned buffer types.

Adding `#[inline]` to `write_fmt` shows minor perf regressions, so leave
it off like the default impl.
@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from 295e573 to b0a8187 Compare March 22, 2025 08:45
@workingjubilee
Member

@workingjubilee you should probably just approve this one yeah? thanks.

@workingjubilee
Member

yes I agree @workingjubilee, we should get this show on the road.

@bors r+ rollup

@bors
Collaborator

bors commented May 1, 2025

📌 Commit b0a8187 has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 1, 2025
@workingjubilee
Member

hmm well

@bors rollup=maybe

@workingjubilee
Member

@bors rollup=never

There, that's what I wanted to put back.

@thaliaarchi
Contributor Author

thaliaarchi commented May 1, 2025

Huh. I never actually finished investigating whether the estimated_capacity heuristic is still good. I was overcomplicating my measuring strategy and lost interest. But, if the benchmarks look fine to you, it probably is; you can judge them better than me.
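
For context, a rough approximation of the kind of heuristic in question. This is reconstructed from memory of the `core::fmt` source around the time of this PR, so treat every branch and the threshold of 16 as an assumption to verify against the real code. The free function `estimate_capacity` and its slice-based signature are hypothetical; the real method reads private fields of `fmt::Arguments`.

```rust
/// Approximate re-creation of the estimate over the literal pieces of a
/// format string. `pieces` are the literal segments between arguments and
/// `arg_count` is the number of runtime arguments.
pub fn estimate_capacity(pieces: &[&str], arg_count: usize) -> usize {
    let pieces_len: usize = pieces.iter().map(|p| p.len()).sum();
    if arg_count == 0 {
        // Literal-only: the output length is known exactly.
        pieces_len
    } else if pieces.first().map_or(false, |p| p.is_empty()) && pieces_len < 16 {
        // Starts with an argument and has little literal text: don't guess.
        0
    } else {
        // Assume the arguments add roughly as much as the literal text.
        pieces_len.checked_mul(2).unwrap_or(0)
    }
}
```

Under these assumptions, `"id={}, name={}"` (10 literal bytes) would be estimated at 20 bytes, while `"{}!"` would reserve nothing.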

Sorry, I was thinking of this as waiting-on-author or draft, but hadn't marked it as such.

bors added a commit to rust-lang-ci/rust that referenced this pull request May 1, 2025
…e-fmt, r=workingjubilee

Reserve before `write_fmt` for owned buffers

`fmt::Arguments::estimated_capacity()` is currently only used by `fmt::format()` to reserve the initial string capacity. Also use it in `fmt::Write::write_fmt` for `String` and `OsString`; and `io::Write::write_fmt` for `Vec<u8>`, `VecDeque<u8>`, `Cursor<&mut Vec<u8>>`, and `Cursor<Vec<u8>>`.

This may be worth checking perf.
@bors
Collaborator

bors commented May 1, 2025

⌛ Testing commit b0a8187 with merge 01d90ca...

@Noratrieb
Member

I checked cachegrind on the biggest doc regression of the perf run and found this:

>  353,086,439  <alloc::string::String as core::fmt::Write>::write_fmt:???

> -190,119,423  <alloc::raw_vec::RawVecInner<_>>::reserve::do_reserve_and_handle::<alloc::alloc::Global>:???

>  152,807,999  alloc::raw_vec::finish_grow:???

>  148,657,316  alloc::raw_vec::RawVecInner<A>::reserve::do_reserve_and_handle:???

> -101,526,039  alloc::raw_vec::finish_grow::<alloc::alloc::Global>:???

>  -71,185,707  __memcpy_avx_unaligned_erms:???

>  -61,348,606  __rustc::__rust_realloc:???

>   51,174,119  alloc::fmt::format::format_inner:???

>   24,400,284  <alloc::string::String as core::fmt::Write>::write_str:???

>  -15,782,380  alloc::string::String::push:???

>   12,910,951  <alloc::string::String as core::fmt::Write>::write_char:???

>   -4,769,649  __rustc::__rust_alloc:???

As usual, inlining noise makes this harder to read, but it does look like it's spending more instructions around the code you changed (formatting and appending things to vecs). More instructions don't necessarily mean it's actually slower, but it's a sign. rustdoc does a lot of formatting into strings.
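
To put a number on the rows quoted above, a small sketch that sums the increases and decreases. It covers only the twelve rows shown in this comment, not the full Cachegrind diff; the names `DELTAS` and `summarize` are made up for this example.

```rust
/// Instruction-count deltas copied from the Cachegrind rows quoted above.
pub const DELTAS: [i64; 12] = [
    353_086_439,
    -190_119_423,
    152_807_999,
    148_657_316,
    -101_526_039,
    -71_185_707,
    -61_348_606,
    51_174_119,
    24_400_284,
    -15_782_380,
    12_910_951,
    -4_769_649,
];

/// Returns (sum of increases, sum of decreases, net change).
pub fn summarize(deltas: &[i64]) -> (i64, i64, i64) {
    let up: i64 = deltas.iter().filter(|d| **d > 0).sum();
    let down: i64 = deltas.iter().filter(|d| **d < 0).sum();
    (up, down, up + down)
}
```

For these rows the increases total 743,037,108 instructions and the decreases 444,731,804, a net increase of 298,305,304.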

@thaliaarchi
Contributor Author

Since this could have a lot of fallout if it's not right, we should pop it from the queue, and I should write proper benchmarks instead of leaning on compile benchmarks. The positive deltas in the Cachegrind results outweigh the negative ones, and it seems like the change moved where the cost of the reserve shows up (hence the different inlining?).
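
One possible starting point for such a benchmark, sketched under loud assumptions: it counts loop iterations in which the buffer's capacity changed, as a cheap proxy for reallocation cost, and does not measure time. The `measure` function and its parameters are hypothetical; a real benchmark would use a proper harness and the actual `estimated_capacity()` values.

```rust
use std::fmt::Write;

/// Appends `lines` formatted records to a `String`, optionally reserving
/// `reserve_per_write` bytes before each one. Returns
/// (iterations where capacity changed, final capacity, final length).
pub fn measure(lines: usize, reserve_per_write: usize) -> (usize, usize, usize) {
    let mut out = String::new();
    let mut changes = 0;
    for i in 0..lines {
        let before = out.capacity();
        if reserve_per_write > 0 {
            // Mimics reserving an estimate ahead of each `write_fmt` call.
            out.reserve(reserve_per_write);
        }
        writeln!(out, "row {}: {}", i, i * i).expect("writing to a String cannot fail");
        if out.capacity() != before {
            changes += 1;
        }
    }
    (changes, out.capacity(), out.len())
}
```

Comparing `measure(n, 0)` with `measure(n, k)` for several values of `k` gives a first impression of whether a given estimate reduces growth events or mostly over-reserves; the final capacity relative to the final length shows the slack left behind.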

@thaliaarchi thaliaarchi marked this pull request as draft May 1, 2025 10:44
@thaliaarchi
Contributor Author

I'm going to close it and reopen to cancel the merge.

@thaliaarchi thaliaarchi closed this May 1, 2025
@thaliaarchi thaliaarchi reopened this May 1, 2025
@Zalathar
Contributor

Zalathar commented May 1, 2025

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 1, 2025
@Zalathar

This comment was marked as resolved.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 1, 2025
@Zalathar

This comment was marked as resolved.

@workingjubilee
Member

It seemed okay. If there was something I missed in my decision-making, it was probably because of the loss of context over time. I definitely don't want to ship something that you aren't happy with yet, obviously.

@thaliaarchi
Contributor Author

This code is fine, yeah; the possible problem is in code it uses, estimated_capacity. I wasn't clear that I didn't consider this ready. Sorry.

Labels
perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.

9 participants