-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Ensure strings created with const_str
get the unnamed_addr
attribute
#117930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(rustbot has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Ensure strings created with `const_str` get the `unnamed_addr` attribute This function (`const_str`) is only used when we need to invent a string during codegen -- for example, for a panic message to pass when codegening some of the assert/panic/etc terminators (for stuff like divide by zero). AFAICT all other consts, such as the user-defined ones from const eval, should already be getting this attribute (things that come from a ConstAllocation do, for example). Which means that the "unnamed" part is even more true than usual here, these aren't strings that even exist as far as the user can tell. Setting this attribute allows LLVM to merge these constants, leading to significant binary size savings (much more than I would expect). On x86_64-unknown-linux-gnu, t takes a build of ripgrep (release without debug info) from 9.7MiB to 6.0MiB (a savings of over 30%!?), and a build of rustc_driver's shared object from ~123MiB to ~112MiB (less drastic, but still over 10% reduced). The effect on ripgrep is substantially reduced on macOS for reasons beyond me (I may have fucked up the test), only saving around 0.2MiB, although rustc_driver is still around 10MB or smaller than it had been previously. This raises some questions, such as "does that mean 1/3 of ripgrep was made of division by zero complaints?" I'm not sure, that may be the case. The output of `strings path/to/rg` is ~2MB smaller, so it seems like a lot of it was. Allowing these to be merged presumably also allow functions that contain them to be merged (if the addresses had semantic meaning, then it stands). I intend to do some more analysis here, but I got this up as soon as I realized that this attribute was only missing for internal const strings, and all other ones already get it.
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (4f2e555): comparison URL. Overall result: no relevant changes - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 674.6s -> 672.542s (-0.31%) |
@rustbot author Somewhat confused by those results, I'll look into it (may not have time until the weekend). Obviously if this doesn't improve things then it's not worth doing. |
I think it's still worth doing regardless size saving, just to get more consistency with all the constants. |
I have limited time for the rest of the week, so I can't investigate too deeply until the weekend. But this seems to only reduce the unstripped size. It looks like they're roughly the same size after No theories on my end, really, although in my estimation that still makes it useful (although I do feel somewhat dumb). |
The use of |
Yeah, I mentioned that. The confusing thing is what actually is being reduced and why it's so much (for unstripped). |
What numbers are you seeing in these cases, on which ripgrep revision, and are they different from https://perf.rust-lang.org/compare.html?start=698fcc8219e6dd690b148a23af10a0e5747621fe&end=4f2e555b4dbf81546d145f9f1bc9e1e99da7acfd&stat=size%3Alinked_artifact&name=ripgrep&showRawData=true&nonRelevant=true? (Note that rustc-perf doesn't test the most recent revision -- but I can reproduce its "size:linked_artifact" results locally on its revision of ripgrep). Locally on the latest ripgrep revision, I'm also not seeing a big difference myself (-15k in debug, +15k in release; with the default ripgrep settings, that is with debuginfo in release). |
Ripgrep revision 7099e174acbcbd940f57e4ab4913fee4040c826e, built from source in the repo, after patching the Cargo.toml to disable
For measuring the size of the rustc_driver library, I was comparing between the size of the versions in stage1 ( |
And your ripgrep numbers on that commit are from "9.7MiB to 6.0MiB", right? I'm locally seeing a <3KB difference with this PR -rwxrwxr-x 1 lqd lqd 10147280 Nov 15 19:30 rg-release-defaultdebuginfo-master*
-rwxrwxr-x 1 lqd lqd 10144328 Nov 15 19:30 rg-release-defaultdebuginfo-pr117930* |
Hmmm... my test methodology is just:
I can check on a machine other than the cloud desktop over the weekend. Edit: Here's the information for the nightly rustc I'm comparing against.
|
Yeah you're risking comparing different configs: you may also have std's debuginfo disabled in |
Ahh, that probably explains it. I'll be more careful about that in the future. I didn't realize that you end up with std's debuginfo for a normal Well, I suppose this PR is considerably less worthwhile then, although I'm somewhat glad there isn't such a low hanging fruit still. That said, given that we do emit this attribute on every other constant, I think the PR is still probably worthwhile. |
@rustbot ready |
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (48d8100): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 675.534s -> 675.809s (0.04%) |
This function (
const_str
) is only used when we need to invent a string during codegen -- for example, for a panic message to pass when codegening some of the assert/panic/etc terminators (for stuff like divide by zero).AFAICT all other consts, such as the user-defined ones from const eval, should already be getting this attribute (things that come from a ConstAllocation do, for example). Which means that the "unnamed" part is even more true than usual here, these aren't strings that even exist as far as the user can tell.
Setting this attribute allows LLVM to merge these constants, leading to significant binary size savings (much more than I would expect). On x86_64-unknown-linux-gnu, t takes a build of ripgrep (release without debug info) from 9.7MiB to 6.0MiB (a savings of over 30%!?), and a build of rustc_driver's shared object from 123MiB to 112MiB (less drastic, but still over 10% reduced).The effect on ripgrep is substantially reduced on macOS for reasons beyond me (I may have fucked up the test), only saving around 0.2MiB, although rustc_driver is still around 10MB or smaller than it had been previously.This raises some questions, such as "does that mean 1/3 of ripgrep was made of division by zero complaints?" I'm not sure, that may be the case. The output ofstrings path/to/rg
is ~2MB smaller, so it seems like a lot of it was. Allowing these to be merged presumably also allow functions that contain them to be merged (if the addresses had semantic meaning, then it stands).I intend to do some more analysis here, but I got this up as soon as I realized that this attribute was only missing for internal const strings, and all other ones already get it.Edit: The wins are much more marginal, but there's some argument to do this for the sake of consistency.