Description
LLVM's code coverage instrumentation has a mode called "runtime counter relocation" (see https://clang.llvm.org/docs/SourceBasedCodeCoverage.html). The details are not important, but one effect is that it inserts a weak definition of a symbol called __llvm_profile_counter_bias into each instrumented module.
We can see it in action with this sample code (foo.rs):
#[no_mangle]
pub extern "C" fn foo() { bar::bar() }
mod bar {
#[inline(never)]
pub fn bar() {}
}
$ rustc foo.rs -Ccodegen-units=2 -Copt-level=0 --crate-type=rlib -Cembed-bitcode=no -Cinstrument-coverage -Cllvm-args=-runtime-counter-relocation && ar x libfoo.rlib && nm foo*.rcgu.o
foo.foo.730f9a7e513a85b2-cgu.0.rcgu.o:
U _RNvNtCs9SsW9UZxs52_3foo3bar3bar
0000000000000000 V __covrec_5CF8C24CDB18BDACu
0000000000000000 V __llvm_profile_counter_bias
0000000000000000 R __llvm_profile_filename
0000000000000000 n __profc_foo
0000000000000000 T foo
foo.foo.730f9a7e513a85b2-cgu.1.rcgu.o:
0000000000000000 T _RNvNtCs9SsW9UZxs52_3foo3bar3bar
0000000000000000 V __covrec_BC9F6BE8A43E337Cu
0000000000000000 V __llvm_profile_counter_bias
0000000000000000 R __llvm_profile_filename
0000000000000000 n __profc__RNvNtCs9SsW9UZxs52_3foo3bar3bar
Note the weak definitions (V) of __llvm_profile_counter_bias in both CGUs.
Now, if we increase the optimization level:
$ rm -f libfoo.rlib foo*.rcgu.o
$ rustc foo.rs -Ccodegen-units=2 -Copt-level=1 --crate-type=rlib -Cembed-bitcode=no -Cinstrument-coverage -Cllvm-args=-runtime-counter-relocation && ar x libfoo.rlib && nm foo*.rcgu.o
foo.foo.730f9a7e513a85b2-cgu.0.rcgu.o:
U _RNvNtCs9SsW9UZxs52_3foo3bar3bar
0000000000000000 V __covrec_5CF8C24CDB18BDACu
U __llvm_profile_counter_bias
0000000000000000 n __profc_foo
0000000000000000 T foo
foo.foo.730f9a7e513a85b2-cgu.1.rcgu.o:
0000000000000000 T _RNvNtCs9SsW9UZxs52_3foo3bar3bar
0000000000000000 V __covrec_BC9F6BE8A43E337Cu
0000000000000000 V __llvm_profile_counter_bias
0000000000000000 R __llvm_profile_filename
0000000000000000 n __profc__RNvNtCs9SsW9UZxs52_3foo3bar3bar
Note that __llvm_profile_counter_bias is now an undefined (U) symbol in the first CGU, and defined in the other.
Which CGU ends up with the definition is not deterministic. Running the command repeatedly will show it in either 0.rcgu.o or 1.rcgu.o.
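To check many runs without reading the nm output by eye, something like the following could help. It is only a rough sketch: the helper name symbol_kinds is made up for illustration, and it assumes the default nm layout shown above (a "file:" header line, then "address type name" lines, with undefined symbols having no address column).

```rust
use std::collections::BTreeMap;

/// Maps each object file in `nm` output to the type letter it reports for `symbol`.
/// Hypothetical helper; assumes the default `nm` text layout shown in this report.
fn symbol_kinds(nm_output: &str, symbol: &str) -> BTreeMap<String, char> {
    let mut kinds = BTreeMap::new();
    let mut current = String::new();
    for line in nm_output.lines().map(str::trim) {
        // A line ending in ':' starts the listing for a new object file.
        if let Some(file) = line.strip_suffix(':') {
            current = file.to_string();
            continue;
        }
        let fields: Vec<&str> = line.split_whitespace().collect();
        // Undefined symbols look like "U name"; the rest look like "addr T name".
        let (kind, name) = match fields.as_slice() {
            [kind, name] => (*kind, *name),
            [_addr, kind, name] => (*kind, *name),
            _ => continue,
        };
        if name == symbol {
            if let Some(letter) = kind.chars().next() {
                kinds.insert(current.clone(), letter);
            }
        }
    }
    kinds
}

// Abbreviated copy of the -Copt-level=1 output above.
const SAMPLE: &str = "\
foo.foo.730f9a7e513a85b2-cgu.0.rcgu.o:
                 U __llvm_profile_counter_bias
0000000000000000 T foo
foo.foo.730f9a7e513a85b2-cgu.1.rcgu.o:
0000000000000000 V __llvm_profile_counter_bias
";

fn main() {
    for (file, kind) in symbol_kinds(SAMPLE, "__llvm_profile_counter_bias") {
        println!("{}: {}", file, kind);
    }
}
```

Feeding it the real nm output from each run makes it easy to see which CGU reports U and which reports V.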
This seems to be due to an optimization figuring that since the weak definitions are all the same and will get linked together, just keeping one across the CGUs is enough. I haven't been able to find the code that does this. Does anyone know where this transformation is implemented?
That optimization seems reasonable. However, the introduction of the undefined symbol can affect how a program gets linked. Consider:
$ echo 'extern void foo(); int main() { foo(); }' > main1.c
$ echo 'int main() {}' > main2.c
$ clang -c main1.c
$ clang -c main2.c -fprofile-instr-generate -fcoverage-mapping -mllvm -runtime-counter-relocation && ar r main2.a main2.o
$ clang -fuse-ld=lld main1.o main2.a libfoo.rlib
ld.lld: error: duplicate symbol: main
>>> defined at main1.c
>>> main1.o:(main)
>>> defined at main2.c
>>> main2.o:(.text+0x0) in archive main2.a
clang: error: linker command failed with exit code 1 (use -v to see invocation)
What's happening here is that because main1.o references foo, 0.rcgu.o gets linked in. And because that has an undefined reference to __llvm_profile_counter_bias, main2.o, which has the first definition of that symbol that the linker saw, gets linked in too. It wouldn't have been otherwise, because it's in a static library.
In other words, because rustc turned __llvm_profile_counter_bias into an undefined symbol in the first CGU, the linker linked different code than it otherwise would have. If rustc hadn't done that (e.g. with -Copt-level=0), main2.o would not have been linked in, and the link would have succeeded. If the undefined symbol had gone in the second CGU (the one which doesn't define foo), which it does sometimes, the link would also have succeeded.
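To make the three cases concrete, here is a toy model of on-demand archive extraction. It is a sketch under simplifying assumptions, not lld's actual implementation: plain objects are always linked, an archive member is linked only when it defines a symbol that is currently undefined, the first archive member seen offering a symbol wins, and weak definitions never conflict. The names Linker, obj and link are invented for illustration, "cgu.0" and "cgu.1" stand in for the two rcgu.o files, and the rlib's metadata member is ignored.

```rust
use std::collections::{HashMap, HashSet};

const BIAS: &str = "__llvm_profile_counter_bias";

#[derive(Clone)]
struct Object {
    name: &'static str,
    defs: Vec<(&'static str, bool)>, // (symbol, is_weak)
    undefs: Vec<&'static str>,
}

#[derive(Default)]
struct Linker {
    objects: Vec<Object>,
    loaded: HashSet<usize>,
    defined: HashSet<&'static str>,
    strong: HashMap<&'static str, &'static str>, // symbol -> object holding the strong definition
    lazy: HashMap<&'static str, usize>,          // symbol -> first archive member offering it
    undefined: HashSet<&'static str>,
    linked: Vec<&'static str>,
    errors: Vec<String>,
}

impl Linker {
    // Links object `idx`, then resolves its undefined references, which may pull in
    // further archive members.
    fn load(&mut self, idx: usize) {
        if !self.loaded.insert(idx) { return; }
        let obj = self.objects[idx].clone();
        self.linked.push(obj.name);
        for &(sym, weak) in &obj.defs {
            if !weak {
                // Two strong definitions of one symbol are a link error.
                if let Some(prev) = self.strong.insert(sym, obj.name) {
                    self.errors.push(format!("duplicate symbol: {} ({}, {})", sym, prev, obj.name));
                }
            }
            self.defined.insert(sym);
            self.undefined.remove(sym);
        }
        for &sym in &obj.undefs {
            if self.defined.contains(sym) { continue; }
            match self.lazy.get(sym).copied() {
                Some(member) => self.load(member), // extract the archive member on demand
                None => { self.undefined.insert(sym); }
            }
        }
    }

    // Adds one command-line input: a plain object (`archive == false`) or an archive.
    fn add(&mut self, members: Vec<Object>, archive: bool) {
        for member in members {
            let syms: Vec<&'static str> = member.defs.iter().map(|d| d.0).collect();
            self.objects.push(member);
            let idx = self.objects.len() - 1;
            if !archive {
                self.load(idx);
            }
            for sym in syms {
                if self.undefined.contains(sym) {
                    self.load(idx);
                } else if !self.defined.contains(sym) {
                    self.lazy.entry(sym).or_insert(idx);
                }
            }
        }
    }
}

fn obj(name: &'static str, defs: &[(&'static str, bool)], undefs: &[&'static str]) -> Object {
    Object { name, defs: defs.to_vec(), undefs: undefs.to_vec() }
}

// Models `main1.o main2.a libfoo.rlib`. Each flag says whether that CGU keeps its weak
// definition of BIAS (true) or only has an undefined reference to it (false).
fn link(cgu0_defines_bias: bool, cgu1_defines_bias: bool) -> Linker {
    let with_bias = |mut o: Object, defines: bool| {
        if defines { o.defs.push((BIAS, true)) } else { o.undefs.push(BIAS) }
        o
    };
    let mut l = Linker::default();
    l.add(vec![obj("main1.o", &[("main", false)], &["foo"])], false);
    l.add(vec![obj("main2.o", &[("main", false), (BIAS, true)], &[])], true);
    let cgu0 = with_bias(obj("cgu.0", &[("foo", false)], &["bar"]), cgu0_defines_bias);
    let cgu1 = with_bias(obj("cgu.1", &[("bar", false)], &[]), cgu1_defines_bias);
    l.add(vec![cgu0, cgu1], true);
    l
}

fn main() {
    let cases = [("opt-level=0", true, true), ("undefined in cgu.0", false, true), ("undefined in cgu.1", true, false)];
    for (label, c0, c1) in cases {
        let l = link(c0, c1);
        println!("{}: linked {:?}, errors {:?}", label, l.linked, l.errors);
    }
}
```

In this model only the "undefined in cgu.0" case pulls in main2.o and reports the duplicate main; the other two cases link main1.o, cgu.0 and cgu.1 without errors, which matches what I saw with the real linker.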
In this case, the transformation caused the link to fail, but it seems possible to construct a case where it changes program behavior.
This makes me think that the transformation, while it seems valid locally when considering just the CGUs, is not valid in the wider context of linking a whole program or library.
I was using:
$ rustc --version --verbose
rustc 1.76.0 (07dca489a 2024-02-04)
binary: rustc
commit-hash: 07dca489ac2d933c78d3c5158e3f43beefeb02ce
commit-date: 2024-02-04
host: x86_64-unknown-linux-gnu
release: 1.76.0
LLVM version: 17.0.6
but I believe this also happens with the latest versions (of both rustc and LLVM). We first hit this in Chromium: https://crbug.com/324126269