Description
Bugzilla Link | 41722 |
Version | 8.0 |
OS | MacOS X |
Reporter | LLVM Bugzilla Contributor |
CC | @TNorthover |
Extended Description
On OSX, multiple calls to tlv_get_addr are often generated for a single use of a thread-local variable. This issue was discovered by looking at the assembly generated by rustc, and is discussed in more detail here:
rust-lang/rust#60341 (comment)
I know very little about LLVM, so hopefully this all makes sense. The linked IR [1] demonstrates the issue. Often, the optimizer spits out IR which references thread_local globals multiple times when the unoptimized IR only references them once. This is often associated with getelementptr.
The final assembly then does the tlv_get_addr dance once for each of those references, so twice here, each time as:
movq _foo@TLVP(%rip), %rdi
callq *(%rdi)
For larger structures with multiple members, the problem gets worse, resulting in many redundant calls to tlv_get_addr. In contrast, when targeting Linux, __tls_get_addr@PLT is only invoked once.
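To make the multi-member case concrete, here is a minimal Rust sketch of the kind of code that could be used to look for this pattern. It is not the code from the linked discussion: the `Counters` struct, its fields, and the `record` function are hypothetical names made up for illustration, and it uses the stable `thread_local!` macro, which may lower differently from whatever produced the linked IR.

```rust
use std::cell::Cell;

// Hypothetical struct with several members, standing in for the
// "larger structures" described above.
struct Counters {
    hits: Cell<u64>,
    misses: Cell<u64>,
    total: Cell<u64>,
}

thread_local! {
    // One thread-local value holding all three members.
    static COUNTERS: Counters = Counters {
        hits: Cell::new(0),
        misses: Cell::new(0),
        total: Cell::new(0),
    };
}

// Touches several members of the same thread-local in one function.
// Ideally the thread-local's address is resolved once and each member
// is then reached by a fixed offset from that address.
#[inline(never)]
pub fn record(hit: bool) -> u64 {
    COUNTERS.with(|c| {
        if hit {
            c.hits.set(c.hits.get() + 1);
        } else {
            c.misses.set(c.misses.get() + 1);
        }
        c.total.set(c.total.get() + 1);
        c.total.get()
    })
}

fn main() {
    record(true);
    record(false);
    println!("total = {}", record(true));
}
```

Compiling something like this with `rustc -O --emit asm` for an x86_64 macOS target and for a Linux target, then counting the TLS address lookups in `record`, is one way to compare the two platforms. Whether this exact sketch reproduces the redundant calls is an assumption and will depend on the compiler version.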
Maybe there's a good reason the address isn't cached on OSX, but I'm hoping there isn't :).