Description
Consider this code (Godbolt link):
pub struct S([i32; 16]);

#[inline(never)]
fn g(x: S) {
    println!("{}", x.0[3]);
}

#[inline(never)]
fn f(x: S) {
    g(x)
}

pub fn main() {
    f(S([0; 16]))
}
LLVM can't eliminate the memcpy between f and g, even though it should legally be able to do so. This is because memcpyopt can only forward to parameters marked byval. We don't use the byval attribute, and this seems to be by design. But we lose this optimization, which I've observed hurting codegen in several places even in hello world (core::fmt::Write::write_fmt, core::panicking::assert_failed(), <core::fmt::Arguments as core::fmt::Display>::fmt). I suspect losing this optimization hurts us all over the place.
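For illustration, here is roughly what the forwarded call amounts to if written by hand at the source level: f passes along the address it was given instead of making a fresh copy for g. This is only a sketch. f_ref and g_ref are hypothetical names that are not part of the code above, and returning the element instead of printing it is just a convenience so the result is easy to check.

```rust
pub struct S([i32; 16]);

// Hypothetical by-reference variant of `g`: reads one element through the pointer.
#[inline(never)]
fn g_ref(x: &S) -> i32 {
    x.0[3]
}

// Hypothetical by-reference variant of `f`: only the reference is forwarded,
// so the 64-byte array is not copied again on the way to `g_ref`.
#[inline(never)]
fn f_ref(x: &S) -> i32 {
    g_ref(x)
}

pub fn main() {
    let s = S([7; 16]);
    println!("{}", f_ref(&s));
}
```

The by-value version in the original code has the same observable behavior, since f never touches x again after handing it to g, which is why the copy should be removable.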
There are two solutions I can see:
(1) Use byval for indirect arguments in the Rust ABI.
(2) Change LLVM to allow the optimization to happen for at least nocapture noalias readonly parameters. Since nocapture implies that the behavior of the callee doesn't depend on the address and noalias readonly implies that the memory is strongly immutable, this should work. We mark all indirect by-value arguments as nocapture noalias already.
I'm working on a patch for (2), but I was wondering why we can't just do (1).