I was excited to see #[used] stabilized (yay!) as one of the issues we suffer from in wasm-bindgen is related to symbols being removed. Unfortunately though #[used] doesn't solve our use case!
First I'll try to explain our issue a bit. The #[wasm_bindgen] attribute allows you to import JS functionality into a Rust program. This doesn't work, however, when you import JS functions into a private Rust submodule (aka mod foo { ... }). When importing a function we also generate an internal exported function which the CLI wasm-bindgen tool uses (and then removes). The details aren't important here; it suffices to say that we're generating code that looks like:
mod private {
    #[no_mangle]
    pub extern fn foo() { /* ... */ }
}
Today the symbol foo is not considered alive by rustc itself as it's not reachable. As a result, it's not even translated into the object file. Suppose we instead change it to this:
#![feature(used)]

mod private {
    #[no_mangle]
    pub extern fn foo() {}

    #[used]
    static F: extern fn() = foo;
}
This still doesn't work! Unfortunately for us, #[used] works as intended but doesn't affect symbol visibility. The above program generates this IR:
; ModuleID = 'playground.7pbp0xok-cgu.0'
source_filename = "playground.7pbp0xok-cgu.0"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
@_ZN10playground7private1F17hb0dc3802d85fadd7E = internal constant <{ i8*, [0 x i8] }> <{ i8* bitcast (void ()* @foo to i8*), [0 x i8] zeroinitializer }>, align 8
@llvm.used = appending global [1 x i8*] [i8* bitcast (<{ i8*, [0 x i8] }>* @_ZN10playground7private1F17hb0dc3802d85fadd7E to i8*)], section "llvm.metadata"
; Function Attrs: norecurse nounwind readnone uwtable
define internal void @foo() unnamed_addr #0 {
start:
ret void
}
attributes #0 = { norecurse nounwind readnone uwtable "probe-stack"="__rust_probestack" }
The problem here is that the symbol foo, while not mangled, is still marked as internal. This in turn means that it does indeed reach the linker, but for our purposes in wasm-bindgen we need it to survive the linker, not just reach the linker.
Ok so that's the problem statement for wasm-bindgen, but you can generalize it today for rustc by asking: what does #[used] do to symbol visibility? The overall story for symbol visibility in rustc is a little muddied and not always great (especially on ABI-particulars like #[no_mangle] things).
What should the symbol visibility of foo be here?
mod private {
    #[no_mangle]
    pub extern fn foo() {}

    #[used]
    static F: extern fn() = foo;
}
We've always had a basic rule of thumb in Rust that "reachable symbols" have non-internal visibility, but it's not clear what to do here. foo is indeed a reachable symbol because of #[used], but it's in a private module. Does that mean because of pub and #[no_mangle] it shouldn't have internal visibility? Should only #[no_mangle] imply that? It's unclear to me!
I'd naively like to send a patch that makes foo not-internal because it has #[no_mangle] and pub (not that it's "publicly reachable"). I think though that this may be deeper in the compiler. I just took a look at how #[used] works, and it's actually a little surprising!
In src/librustc_mir/monomorphize/collector.rs we attempt to not translate anything not reachable in a crate as a form of DCE. I didn't find any handling of #[used], though, and it turns out we unconditionally translate all statics all the time! Then, because we put the static in llvm.used, it ends up not getting gc'd by LLVM.
I think that we may want to future-proof this by updating the src/librustc/middle/reachable.rs collection step to basically push #[used] statics onto the worklist to process. The initial worklist is seeded with all items that are public by visibility, and I think we could change it to also be seeded with any #[used] statics. This means that anything referenced by a #[used] static will be pulled in as a result.
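To make the seeding idea concrete, here is a toy sketch of a worklist-based reachability pass. It is not rustc's actual code: Item, reachable, and demo_items are made-up names for illustration, and the real pass in reachable.rs works on compiler-internal data structures and handles many more cases. The only point is to show how adding #[used] statics to the initial worklist pulls in whatever they reference.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy model of a crate item (hypothetical; not a rustc type).
#[derive(Debug, Clone)]
struct Item {
    name: &'static str,
    /// Public by visibility, i.e. nameable from outside the crate.
    is_public: bool,
    /// A static carrying the #[used] attribute.
    is_used_static: bool,
    /// Names of the items this item refers to.
    references: Vec<&'static str>,
}

/// Compute the set of reachable item names with a worklist.
/// `seed_used` toggles the proposed change: when true, #[used]
/// statics are added to the initial worklist alongside public items.
fn reachable(items: &[Item], seed_used: bool) -> HashSet<&'static str> {
    let by_name: HashMap<&str, &Item> =
        items.iter().map(|item| (item.name, item)).collect();

    // Seed the worklist.
    let mut worklist: VecDeque<&'static str> = items
        .iter()
        .filter(|item| item.is_public || (seed_used && item.is_used_static))
        .map(|item| item.name)
        .collect();

    // Follow references until no new items are discovered.
    let mut seen = HashSet::new();
    while let Some(name) = worklist.pop_front() {
        if !seen.insert(name) {
            continue; // already processed
        }
        if let Some(item) = by_name.get(name) {
            for &referenced in &item.references {
                worklist.push_back(referenced);
            }
        }
    }
    seen
}

/// Models the example from this issue: `foo` and `F` live in a private
/// module, so neither is public by visibility; `F` is #[used] and
/// refers to `foo`. `main` stands in for some unrelated public item.
fn demo_items() -> Vec<Item> {
    vec![
        Item { name: "main", is_public: true, is_used_static: false, references: vec![] },
        Item { name: "private::F", is_public: false, is_used_static: true, references: vec!["foo"] },
        Item { name: "foo", is_public: false, is_used_static: false, references: vec![] },
    ]
}
```

With this model, reachable(&demo_items(), false) contains only main, while reachable(&demo_items(), true) also contains private::F and foo, which is the behavior we'd want for the wasm-bindgen case.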
Do others think this is a reasonable strategy for having #[used] affect symbol visibility?
cc @michaelwoerister
cc @fitzgen
cc @japaric