Description
Originally reported at https://lists.llvm.org/pipermail/llvm-dev/2019-December/137403.html
After https://reviews.llvm.org/D34195, we retain `to` unconditionally for a symbol assignment `--defsym from=to`. (The expectation is that there may be some code referencing `to`.) This is overly conservative: if `from` is not retained, we don't actually have to retain `to`.
% cat a.s
.globl _start, foo, bar
.text; _start: movabs $d, %rax
.section .text_foo,"ax"; foo: ret
.section .text_bar,"ax"; bar: nop
% as a.s -o a.o
% ld.lld a.o --defsym c=foo --defsym d=1 --gc-sections -o a
# .text_foo is retained, though `c` is not actually used.
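To make the difference concrete, here is a minimal toy model of the two policies. It is a sketch only, not LLD code: `Model`, `computeLiveSections` and `reportExample` are hypothetical names, and the model only tracks which section defines a symbol and which symbols a section's relocations reference. With `conservative` set, every `to` of a `--defsym from=to` is a GC root (the current behavior); otherwise `to` is reached only when `from` itself is referenced from something live.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model for illustration; these are NOT LLD's data structures.
struct Model {
  // Symbol name -> section defining it (absolute symbols have no entry).
  std::map<std::string, std::string> definedIn;
  // Section name -> symbols referenced by its relocations.
  std::map<std::string, std::vector<std::string>> relocs;
  // --defsym from=to pairs whose right-hand side names a symbol.
  std::map<std::string, std::string> defsym;
  // GC roots such as the entry symbol.
  std::vector<std::string> roots;
};

std::set<std::string> computeLiveSections(const Model &m, bool conservative) {
  std::set<std::string> live;
  std::set<std::string> visited;
  std::vector<std::string> worklist(m.roots.begin(), m.roots.end());
  // Conservative policy: every `to` is retained unconditionally.
  if (conservative)
    for (const auto &kv : m.defsym)
      worklist.push_back(kv.second);
  while (!worklist.empty()) {
    std::string sym = worklist.back();
    worklist.pop_back();
    if (!visited.insert(sym).second)
      continue;
    // A reached `from` keeps its `to` alive.
    auto alias = m.defsym.find(sym);
    if (alias != m.defsym.end())
      worklist.push_back(alias->second);
    auto def = m.definedIn.find(sym);
    if (def == m.definedIn.end())
      continue; // absolute or undefined symbol: no section to retain
    if (!live.insert(def->second).second)
      continue; // section already scanned
    auto r = m.relocs.find(def->second);
    if (r != m.relocs.end())
      for (const std::string &target : r->second)
        worklist.push_back(target);
  }
  return live;
}

// The a.s example above, linked with --defsym c=foo --defsym d=1.
Model reportExample() {
  Model m;
  m.definedIn["_start"] = ".text";
  m.definedIn["foo"] = ".text_foo";
  m.definedIn["bar"] = ".text_bar";
  m.relocs[".text"].push_back("d"); // movabs $d, %rax
  m.defsym["c"] = "foo";            // d=1 is absolute, so it has no target
  m.roots.push_back("_start");
  return m;
}
```

In this model, `computeLiveSections(reportExample(), true)` keeps `.text` and `.text_foo`, while passing `false` keeps only `.text`, because nothing live references `c`.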
It is difficult to drop the following rule in `MarkLive.cpp`:
for (StringRef s : script->referencedSymbols)
  markSymbol(symtab->find(s));
The issue can be demonstrated by the following call tree:
LinkerDriver::link
  markLive
    ...
      resolveReloc
        // Defined::section is nullptr for `d` because the assignment d=foo hasn't been evaluated yet.
  writeResult
    Writer<ELFT>::run
      Writer<ELFT>::finalizeSections
        LinkerScript::processSymbolAssignments
          // Symbol section+offset are evaluated here.
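The ordering problem can be sketched with a small stand-alone model. This is an assumption-laden illustration, not LLD's implementation: `Section`, `Symbol`, `Assignment`, `evaluateAssignments`, `markTarget` and `fooIsRetained` are hypothetical stand-ins for the real classes and passes. It models `--defsym d=foo` with `.text` referencing `d`: if marking runs before the assignment is evaluated, `d` has no section yet and `.text_foo` is never marked.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins; LLD's real classes are far richer.
struct Section {
  std::string name;
  bool live;
};

struct Symbol {
  std::string name;
  Section *section; // nullptr until the defining assignment is evaluated
};

struct Assignment {
  Symbol *from; // left-hand side of --defsym from=to
  Symbol *to;   // right-hand side
};

// Stand-in for the pass that evaluates symbol assignments:
// `from` inherits the section of `to`.
void evaluateAssignments(const std::vector<Assignment> &assignments) {
  for (const Assignment &a : assignments)
    a.from->section = a.to->section;
}

// Stand-in for the marking step: retain the section the target lives in.
void markTarget(const Symbol &target) {
  if (target.section)
    target.section->live = true;
}

// Returns whether .text_foo ends up live for `--defsym d=foo`
// when a live section references `d`.
bool fooIsRetained(bool evaluateBeforeMarking) {
  Section textFoo{".text_foo", false};
  Symbol foo{"foo", &textFoo};
  Symbol d{"d", nullptr};
  std::vector<Assignment> assignments;
  assignments.push_back(Assignment{&d, &foo});

  if (evaluateBeforeMarking)
    evaluateAssignments(assignments);
  markTarget(d); // the relocation in .text refers to `d`
  if (!evaluateBeforeMarking)
    evaluateAssignments(assignments); // too late: marking already ran
  return textFoo.live;
}
```

In this sketch `fooIsRetained(false)` mirrors the current pass order and loses `.text_foo`, which is why the `referencedSymbols` rule is needed today; `fooIsRetained(true)` corresponds to evaluating assignments before marking.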
The placement of `processSymbolAssignments` is somewhat flexible. It can be reordered to run just after `processSectionCommands`. Adding another `processSymbolAssignments` call can solve the problem, but we need to be very careful about adding more passes: there are lots of intricate dependencies, and we need to find a balance between benefits and costs.
This enhancement is probably of low priority. It may not be worth a fix on its own, but if reordering passes can address other problems, it would be nice to improve `--defsym` as well.
I have also studied GNU ld's behavior. It turns out to be unreliable in some cases:
ld.bfd a.o --defsym 'd=foo' --gc-sections -o a => GNU ld retains .text_foo
ld.bfd a.o --defsym 'd=foo+3' --gc-sections -o a => GNU ld drops .text_foo
ld.bfd a.o --defsym 'd=bar-bar+foo' --gc-sections -o a => GNU ld drops .text_foo