
Optimize the size of a statically linked binary and library #10740

Closed
@alexcrichton


Once #10528 lands, we'll be able to create static libraries and static binaries. These are very useful, but the binaries we currently produce are massive. There are a few opportunities for improvement that I can see here:

  • Static executables and static libraries contain the metadata sections of their dependent libraries. These are certainly not needed, and these sections should be removed (or possibly this is a good argument for putting the metadata in a separate file?). This would in theory be solved with objcopy -R, but objcopy doesn't exist by default on OSX, and the objcopy I found ended up producing a corrupted executable that didn't run.
  • We don't necessarily want to pull in all of libstd. There are likely vast portions of libstd that a given crate never uses, and all of them could be removed. This means eliminating unused functions and data. C/C++ solve this with -ffunction-sections and -fdata-sections, which place each function and static global in its own section. The linker is then passed --gc-sections and magically removes everything that's unused.
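For reference, the C/C++ approach from the second bullet can be sketched as a command line. This is a minimal sketch, not what rustc does today: it assumes a `cc`-style compiler driver, the `-Wl,` prefix for forwarding `--gc-sections` to the linker, and the hypothetical file names `main.c` and `main`. The command is only built and printed, never run.

```rust
use std::process::Command;

/// Build (but do not run) a compiler-driver invocation that places every
/// function and static in its own section and asks the linker to discard
/// sections that nothing references.
/// Assumptions: a `cc`-style driver and the `-Wl,` forwarding prefix.
fn gc_sections_command(source: &str, output: &str) -> Command {
    let mut cmd = Command::new("cc");
    cmd.arg("-ffunction-sections") // one section per function
        .arg("-fdata-sections") // one section per static/global
        .arg("-Wl,--gc-sections") // linker drops unreferenced sections
        .arg(source)
        .arg("-o")
        .arg(output);
    cmd
}

/// Render a command as a single space-separated line for display.
fn command_line(cmd: &Command) -> String {
    let mut parts = vec![cmd.get_program().to_string_lossy().into_owned()];
    parts.extend(cmd.get_args().map(|a| a.to_string_lossy().into_owned()));
    parts.join(" ")
}

fn main() {
    // `main.c` and `main` are placeholder names for illustration only.
    let cmd = gc_sections_command("main.c", "main");
    println!("{}", command_line(&cmd));
}
```

The per-function sections are what make the linker's garbage collection effective: without them, one used function keeps its whole section alive.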

Both of these optimizations are a little dubious, and this is why I chose dynamic libraries as the compiler's default library output. These optimizations can reduce the size of an executable, but I've seen the compile time of fn main() {} increase by 5-10x when implementing them (even in the common no-opt compile case).

Additionally, these optimizations are going to be difficult to implement across platforms. Most of what I've described is Linux-specific. There is a -dead_strip option on the OSX linker, but that's the only relevant size optimization flag I can find. I have not checked to see what the mingw linker provides in terms of size optimizations.
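The platform split described above could be captured in a small lookup like the following. This is only a sketch: the function name `dead_code_link_flags` is hypothetical, the `-Wl,` prefix assumes the linker is invoked through a cc-style driver, and any platform not covered in this discussion (including mingw) deliberately gets no flags.

```rust
/// Linker flags requesting dead-code removal for a given target OS.
/// Only the two linkers discussed in this issue are covered; everything
/// else returns an empty slice rather than a guess.
/// Assumption: flags are forwarded through a cc-style driver via `-Wl,`.
fn dead_code_link_flags(target_os: &str) -> &'static [&'static str] {
    match target_os {
        // GNU-style linker: drop unreferenced sections.
        "linux" => &["-Wl,--gc-sections"],
        // OSX linker: strip unreachable functions and data.
        "macos" => &["-Wl,-dead_strip"],
        // Not investigated here (e.g. the mingw linker).
        _ => &[],
    }
}

fn main() {
    // `std::env::consts::OS` is the OS this program was compiled for.
    let os = std::env::consts::OS;
    println!("{}: {:?}", os, dead_code_link_flags(os));
}
```

Returning an empty slice for unknown platforms keeps the default behavior unchanged wherever the right flag has not been verified.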

Empirical data

All of the data here is collected from a 32-bit Ubuntu VM, but I imagine the numbers are very similar on other platforms. The program in question is simply fn main() {}.

  • Statically linked executable - 6.9MB
  • Removing metadata - 2.7MB
  • -ffunction-sections + --gc-sections - 1.6MB
  • -ffunction-sections + --gc-sections + #[no_uv] - 730K

Note that --gc-sections always removes the metadata. I'm unsure of whether --gc-sections corrupts our exception-handling sections.
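To put the figures above on a common scale, here is a small sketch that expresses each step as a percentage of the original 6.9MB executable. It assumes "MB" means 1024 KB and "K" means KB, which the measurements do not state explicitly; the function names are made up for illustration.

```rust
/// Assumption: 1 MB = 1024 KB (the issue does not say which convention it uses).
const MB: f64 = 1024.0;

/// The four measurements listed in the issue, converted to KB.
fn measurements() -> Vec<(&'static str, f64)> {
    vec![
        ("statically linked executable", 6.9 * MB),
        ("metadata removed", 2.7 * MB),
        ("-ffunction-sections + --gc-sections", 1.6 * MB),
        ("-ffunction-sections + --gc-sections + #[no_uv]", 730.0),
    ]
}

/// Size of one build as a percentage of the baseline build.
fn percent_of_baseline(size_kb: f64, baseline_kb: f64) -> f64 {
    size_kb / baseline_kb * 100.0
}

fn main() {
    let data = measurements();
    let baseline = data[0].1;
    for (label, size_kb) in &data {
        println!(
            "{:<50} {:>8.1} KB  {:>5.1}% of baseline",
            label,
            size_kb,
            percent_of_baseline(*size_kb, baseline)
        );
    }
}
```

Under that unit assumption, the three reduced builds come out at roughly 39%, 23%, and 10% of the baseline size.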

From this, the "most optimized normal case" that I can get to is 1.6MB, which is still very large. As a comparison, the "hello world" Go executable is 400K. A no_uv 730K executable is pretty reasonable, so it could just be that having M:N/uv means that you're pulling in larger portions of libstd. I believe that a size of 1.6MB warrants further investigation to figure out where all of it is coming from.

Nominating for discussion. I don't think that this should block 1.0, but this is certainly a concern that we should prioritize.

Labels: A-codegen (Area: Code generation), A-linkage (Area: linking into static, shared libraries and binaries), E-hard (Call for participation: Hard difficulty. Experience needed to fix: A lot.)
