Description
Today for no reason in particular I was curious about the behaviour of arrays when zipped together and then mapped into a single array, roughly:
```rust
fn zip_with<T, U, V, F: FnMut(T, U) -> V, const N: usize>(a: [T; N], b: [U; N], mut f: F) -> [V; N] {
    a.zip(b).map(|(a, b)| f(a, b))
}
```
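(As a point of comparison — not code from my experiments — for `Copy` element types the same operation can be written on stable Rust with `std::array::from_fn`, which sidesteps the by-value-move problem entirely; the helper name `zip_with_copy` is just for illustration:)

```rust
// Illustrative stable-Rust alternative: only works for `Copy` element
// types, because it reads `a[i]` and `b[i]` by value out of borrowed
// arrays. The general (non-Copy) case is what forces the unsafe code
// later in this issue.
fn zip_with_copy<T: Copy, U: Copy, V, F: FnMut(T, U) -> V, const N: usize>(
    a: [T; N],
    b: [U; N],
    mut f: F,
) -> [V; N] {
    std::array::from_fn(|i| f(a[i], b[i]))
}

fn main() {
    let sums = zip_with_copy([1, 2, 3], [10, 20, 30], |x, y| x + y);
    assert_eq!(sums, [11, 22, 33]);
}
```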
I wrote some tests for this and disassembled them in Godbolt to see what was going on behind the scenes, and wow, there was a whole lot going on that seemed like it didn't need to be. So I wrote something kinda like what's in the stdlib for mapping and zipping arrays, just without the iterators, to see if I could do any better. Turns out I absolutely can:
```rust
use std::mem::{ManuallyDrop, MaybeUninit};
use std::ptr;

/// Drops the not-yet-consumed tails of both input arrays if `f` panics.
struct ZipWithGuard<T, U, const N: usize> {
    a: *mut [T; N],
    b: *mut [U; N],
    moved: usize,
}

impl<T, U, const N: usize> Drop for ZipWithGuard<T, U, N> {
    fn drop(&mut self) {
        // The elements at index `moved` have already been read out and
        // handed to `f` (its unwind drops them), so only the tail at
        // `moved + 1..N` still needs dropping here.
        for i in self.moved + 1..N {
            unsafe {
                drop(self.a.cast::<T>().add(i).read());
                drop(self.b.cast::<U>().add(i).read());
            }
        }
    }
}

pub fn zip_with_guarded<T, U, V, F: FnMut(T, U) -> V, const N: usize>(
    a: [T; N],
    b: [U; N],
    mut f: F,
) -> [V; N] {
    // Inhibit the inputs' destructors; from here the guard is
    // responsible for dropping whatever `f` hasn't consumed yet.
    let mut a = ManuallyDrop::new(a);
    let mut b = ManuallyDrop::new(b);
    let mut guard = ZipWithGuard {
        a: &mut *a as *mut [T; N],
        b: &mut *b as *mut [U; N],
        moved: 0,
    };
    let mut out: MaybeUninit<[V; N]> = MaybeUninit::uninit();
    let out_ptr = out.as_mut_ptr().cast::<V>();
    while guard.moved < N {
        let i = guard.moved;
        unsafe {
            out_ptr.add(i).write(f(
                ptr::read(guard.a.cast::<T>().add(i)),
                ptr::read(guard.b.cast::<U>().add(i)),
            ));
        }
        guard.moved += 1;
    }
    // `guard.moved == N`, so the guard's Drop is a no-op from here.
    unsafe { out.assume_init() }
}
```
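(To convince myself the guard logic is right, here's a self-contained miniature of the same pattern — a single-array `map_guarded` of my own, not code from the benchmarks above — with a drop-counting test checking that a panicking closure still gets the unconsumed tail dropped:)

```rust
use std::mem::{ManuallyDrop, MaybeUninit};
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

struct MapGuard<T, const N: usize> {
    src: *mut [T; N],
    moved: usize,
}

impl<T, const N: usize> Drop for MapGuard<T, N> {
    fn drop(&mut self) {
        // Element `moved` was already read out; drop only the tail.
        for i in self.moved + 1..N {
            unsafe { ptr::drop_in_place(self.src.cast::<T>().add(i)) }
        }
    }
}

fn map_guarded<T, V, const N: usize>(src: [T; N], mut f: impl FnMut(T) -> V) -> [V; N] {
    let mut src = ManuallyDrop::new(src);
    let mut guard = MapGuard { src: &mut *src as *mut [T; N], moved: 0 };
    let mut out: MaybeUninit<[V; N]> = MaybeUninit::uninit();
    let out_ptr = out.as_mut_ptr().cast::<V>();
    while guard.moved < N {
        let i = guard.moved;
        unsafe { out_ptr.add(i).write(f(ptr::read(guard.src.cast::<T>().add(i)))) };
        guard.moved += 1;
    }
    unsafe { out.assume_init() }
}

static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Counted(u32);
impl Drop for Counted {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let arr = [Counted(0), Counted(1), Counted(2), Counted(3)];
    let res = catch_unwind(AssertUnwindSafe(|| {
        map_guarded(arr, |c| if c.0 == 1 { panic!("boom") } else { c.0 })
    }));
    assert!(res.is_err());
    // All four elements were dropped: index 0 by the closure, index 1
    // by the closure's unwind, indices 2 and 3 by the guard. (The u32
    // output produced before the panic has no destructor to run.)
    assert_eq!(DROPS.load(Ordering::SeqCst), 4);
}
```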
Maybe the design isn't as optimal (or as safe) as it could be, but it performs perfectly well in benchmarks (read: as fast as an unguarded version, and anywhere between 3-25* times faster than the `zip_with` definition shown at the top of this issue). Here's a rough table of execution times averaged over the primitive ops (`+`, `-`, `*`, `&`, `|`, `^`):
Not really liking how things are looking for `.zip().map()`. So, the question then is: is there something in Rust that needs to change to let these optimisations happen, or do I need to look at putting this in a library somewhere (or maybe the stdlib, wink wink nudge nudge)?
*The 25x comes from the `[u8; 8]` benchmarks. `zip_with` benches at 11.230ns and `zip_with_guarded` at 452.593ps, which seems suspicious, but I can't seem to find anything wrong with the result.