Closed
Description
Code taken from #79451.
#![feature(min_const_generics, array_value_iter)]

use std::array::IntoIter;
use std::mem::MaybeUninit;

pub fn zip<T, U, const N: usize>(lhs: [T; N], rhs: [U; N]) -> [(T, U); N] {
    let mut dst = MaybeUninit::<[(T, U); N]>::uninit();
    let ptr = dst.as_mut_ptr() as *mut (T, U);
    for (idx, (lhs, rhs)) in IntoIter::new(lhs).zip(IntoIter::new(rhs)).enumerate() {
        unsafe { ptr.add(idx).write((lhs, rhs)) }
    }
    unsafe { dst.assume_init() }
}

pub fn zip_8xu64(lhs: [u64; 8], rhs: [u64; 8]) -> [(u64, u64); 8] {
    zip(lhs, rhs)
}
Godbolt (llvm-ir / asm): https://godbolt.org/z/Yq7W98
It seems that LLVM is unable to eliminate the memcpys, which results in suboptimal code.
There are also dead stores that have not been eliminated:
store i64 8, i64* %_7.sroa.0.sroa.0.i.sroa.5.0..sroa_idx33, align 8
store i64 8, i64* %_7.sroa.0.sroa.5.0._7.sroa.0.0..sroa_cast.sroa_idx106.i, align 8
store i64 8, i64* %_7.sroa.0.sroa.0.i.sroa.4.0..sroa_idx31, align 8
store i64 8, i64* %_7.sroa.0.sroa.4.0._7.sroa.0.0..sroa_cast.sroa_idx104.i, align 8
A not-quite-equivalent C++ example produces "optimal" code, with no memcpys or dead stores: https://godbolt.org/z/sdfa13
EDIT:
On second thought, I would have expected LLVM's GVN pass to eliminate the memcpys, but it seems that this isn't supported?
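For anyone who wants to compare codegen on a current stable toolchain, here is a minimal sketch of the same zip written without feature gates or `unsafe`. It assumes Rust 1.63 or later (for `std::array::from_fn`). The names `zip_safe` and `zip_safe_8xu64` are hypothetical and not part of the original report, and I make no claim about whether this variant avoids the memcpys or dead stores described above.

```rust
use std::array;

/// Hypothetical safe counterpart to `zip` from the report (the name is an assumption).
pub fn zip_safe<T, U, const N: usize>(lhs: [T; N], rhs: [U; N]) -> [(T, U); N] {
    // Call `IntoIterator::into_iter` explicitly so the arrays are consumed
    // by value regardless of the crate's edition.
    let mut lhs = IntoIterator::into_iter(lhs);
    let mut rhs = IntoIterator::into_iter(rhs);
    // `from_fn` invokes the closure once per index, in order; both iterators
    // hold exactly N items, so the unwraps cannot fail.
    array::from_fn(|_| (lhs.next().unwrap(), rhs.next().unwrap()))
}

/// Monomorphic wrapper mirroring `zip_8xu64`, convenient for inspecting asm.
pub fn zip_safe_8xu64(lhs: [u64; 8], rhs: [u64; 8]) -> [(u64, u64); 8] {
    zip_safe(lhs, rhs)
}
```

Pasting `zip_safe_8xu64` next to `zip_8xu64` in a compiler explorer should make it easy to see whether the two versions differ in the emitted memcpys and stores.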