Closed
Description
The implementation in core::arch for _mm512_set4_epi64 is
pub unsafe fn _mm512_set4_epi64(d: i64, c: i64, b: i64, a: i64) -> __m512i {
    let r = i64x8::new(d, c, b, a, d, c, b, a);
    transmute(r)
}
so the first argument provided (d) becomes the first lane, i.e. the lowest 64 bits.
However, the Intel Intrinsics Guide defines it as
__m512i _mm512_set4_epi64 (__int64 d, __int64 c, __int64 b, __int64 a)
dst[63:0] := a
dst[127:64] := b
dst[191:128] := c
dst[255:192] := d
dst[319:256] := a
dst[383:320] := b
dst[447:384] := c
dst[511:448] := d
dst[MAX:512] := 0
which means that the last argument provided becomes the first lane.
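To make the two orderings concrete, here is a minimal sketch that models the 512-bit vector as a plain `[i64; 8]` array, with index 0 standing for `dst[63:0]`. It does not call the real intrinsics, so it runs without AVX-512 hardware. The function names `set4_epi64_reported` and `set4_epi64_intel` are hypothetical and exist only for this illustration.

```rust
/// Lane layout produced by the implementation quoted in this report.
/// Index 0 models the lowest 64 bits (dst[63:0]).
fn set4_epi64_reported(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    [d, c, b, a, d, c, b, a]
}

/// Lane layout described by the Intel Intrinsics Guide pseudocode:
/// dst[63:0] := a, dst[127:64] := b, and so on.
fn set4_epi64_intel(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    [a, b, c, d, a, b, c, d]
}

fn main() {
    // Same call, two different lane layouts.
    let reported = set4_epi64_reported(4, 3, 2, 1);
    let intel = set4_epi64_intel(4, 3, 2, 1);
    println!("reported: {:?}", reported); // [4, 3, 2, 1, 4, 3, 2, 1]
    println!("intel:    {:?}", intel);    // [1, 2, 3, 4, 1, 2, 3, 4]
}
```

With the arguments `(4, 3, 2, 1)`, lane 0 holds `4` in the reported layout and `1` in the layout from the guide.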
The implementation for _mm512_set_epi64 is correct though, which leads to a disparity between _mm512_set4_epi64 and _mm512_set_epi64 that doesn't exist in C. I've created this gist to show this difference between C and Rust.
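The disparity can be sketched the same way, again with plain arrays instead of the real intrinsics. This assumes that `_mm512_set_epi64` takes its arguments from the highest lane (e7) down to the lowest (e0), as the Intel Intrinsics Guide describes. The names `set_epi64_model`, `set4_expected`, and `set4_reported` are hypothetical helpers for this illustration.

```rust
/// Model of `_mm512_set_epi64`: arguments run from the highest lane (e7)
/// to the lowest (e0). Index 0 of the result models dst[63:0].
#[allow(clippy::too_many_arguments)]
fn set_epi64_model(
    e7: i64, e6: i64, e5: i64, e4: i64,
    e3: i64, e2: i64, e1: i64, e0: i64,
) -> [i64; 8] {
    [e0, e1, e2, e3, e4, e5, e6, e7]
}

/// Expected behaviour: set4 is set with the four arguments repeated twice.
fn set4_expected(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    set_epi64_model(d, c, b, a, d, c, b, a)
}

/// Lane layout of the implementation quoted in this report.
fn set4_reported(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    [d, c, b, a, d, c, b, a]
}

fn main() {
    let expected = set4_expected(4, 3, 2, 1);
    let reported = set4_reported(4, 3, 2, 1);
    println!("expected: {:?}", expected);
    println!("reported: {:?}", reported);
    println!("consistent with set_epi64: {}", expected == reported); // false
}
```

In this model the expected variant agrees with `set_epi64_model(d, c, b, a, d, c, b, a)` by construction, while the reported variant has every group of four lanes reversed.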