Suboptimal codegen for potential [T; N]::zip() #79754

Code taken from https://github.com/rust-lang/rust/pull/79451.

```rust
#![feature(min_const_generics, array_value_iter)]

use std::array::IntoIter;
use std::mem::MaybeUninit;

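pub fn zip<T, U, const N: usize>(lhs: [T; N], rhs: [U; N]) -> [(T, U); N] {
    // Uninitialized output buffer, filled one element at a time below.
    let mut dst = MaybeUninit::<[(T, U); N]>::uninit();
    let ptr = dst.as_mut_ptr() as *mut (T, U);
    for (idx, (lhs, rhs)) in IntoIter::new(lhs).zip(IntoIter::new(rhs)).enumerate() {
        // SAFETY: `idx < N`, so the write stays inside `dst`.
        unsafe { ptr.add(idx).write((lhs, rhs)) }
    }
    // SAFETY: both iterators yield exactly `N` items, so every slot was written.
    unsafe { dst.assume_init() }
}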

pub fn zip_8xu64(lhs: [u64; 8], rhs: [u64; 8]) -> [(u64, u64); 8] {
    zip(lhs, rhs)
}

```

Godbolt (llvm-ir / asm): https://godbolt.org/z/Yq7W98

LLVM appears unable to eliminate the memcpys, which results in suboptimal code.

There are also dead stores that have not been eliminated:

```llvm
store i64 8, i64* %_7.sroa.0.sroa.0.i.sroa.5.0..sroa_idx33, align 8
store i64 8, i64* %_7.sroa.0.sroa.5.0._7.sroa.0.0..sroa_cast.sroa_idx106.i, align 8
store i64 8, i64* %_7.sroa.0.sroa.0.i.sroa.4.0..sroa_idx31, align 8
store i64 8, i64* %_7.sroa.0.sroa.4.0._7.sroa.0.0..sroa_cast.sroa_idx104.i, align 8
```

A not-quite-equivalent C++ example produces "optimal" code, with no memcpys or dead stores: https://godbolt.org/z/sdfa13

EDIT:

On second thought, I would have expected LLVM's GVN pass to eliminate the memcpys, but it seems this is not supported?
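
For anyone who wants to re-check this on a current toolchain: the reproducer at the top depends on nightly feature gates from the time of writing. Below is a minimal sketch of the same function ported to stable Rust, assuming a toolchain where const generics and by-value array iteration are stable (Rust 1.53 or later). The only change is that `IntoIterator::into_iter` replaces `std::array::IntoIter::new`; this is my own port, not code from the linked PR, and I make no claim about whether it still shows the memcpys or dead stores on newer LLVM versions.

```rust
use std::mem::MaybeUninit;

// Stable-Rust port of the reproducer (assumption: Rust 1.53+).
pub fn zip<T, U, const N: usize>(lhs: [T; N], rhs: [U; N]) -> [(T, U); N] {
    // Uninitialized output buffer, filled one element at a time below.
    let mut dst = MaybeUninit::<[(T, U); N]>::uninit();
    let ptr = dst.as_mut_ptr() as *mut (T, U);
    // `IntoIterator::into_iter` on an array iterates by value in every edition.
    let pairs = IntoIterator::into_iter(lhs).zip(IntoIterator::into_iter(rhs));
    for (idx, pair) in pairs.enumerate() {
        // SAFETY: `idx < N`, so the write stays inside `dst`.
        unsafe { ptr.add(idx).write(pair) }
    }
    // SAFETY: both iterators yield exactly `N` items, so every slot was written.
    unsafe { dst.assume_init() }
}

// Monomorphic entry point, so the generated IR/asm is easy to find.
pub fn zip_8xu64(lhs: [u64; 8], rhs: [u64; 8]) -> [(u64, u64); 8] {
    zip(lhs, rhs)
}
```

Compiling this with optimizations and inspecting `zip_8xu64` should allow a direct comparison with the output linked above.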
