Description
I tried this code:
#![feature(stdarch_x86_avx512)]
use std::arch::x86_64::*;

fn main() {
    unsafe {
        let a = _mm512_set1_epi32(0xffff);
        let b = _mm512_setzero_epi32();
        let c = _mm512_set1_epi32(1);
        let dst = _mm512_shrdv_epi32(a, b, c);
        println!("{}", _mm512_cvtsi512_si32(dst));
    }
}
I expected to see this happen:
The code produces the same output as the equivalent C program:
#include <immintrin.h>
#include <stdio.h>

int main() {
    __m512i a = _mm512_set1_epi32(0xffff);
    __m512i b = _mm512_setzero_epi32();
    __m512i c = _mm512_set1_epi32(1);
    __m512i dst = _mm512_shrdv_epi32(a, b, c);
    printf("%u\n", _mm512_cvtsi512_si32(dst));
}
The C program outputs 32767.
Instead, the Rust program outputs -2147483648.
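Working through one lane by hand (an illustrative scalar sketch, not part of the original reproduction) shows where both numbers come from: with b as the upper half of the concatenation the result is 0x7fff (32767), while with a as the upper half the low 32 bits are 0x80000000 (-2147483648 as an i32):

fn main() {
    let a: u32 = 0xffff; // per-lane value of a
    let b: u32 = 0x0000; // per-lane value of b
    let s: u32 = 1;      // per-lane shift count from c

    // Intel semantics: b supplies the upper 32 bits, a the lower 32 bits.
    let correct = ((((b as u64) << 32) | a as u64) >> (s & 31)) as u32;
    println!("{}", correct as i32); // 32767

    // Operands swapped (a as the upper bits), reproducing the observed output.
    let swapped = ((((a as u64) << 32) | b as u64) >> (s & 31)) as u32;
    println!("{}", swapped as i32); // -2147483648
}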
Intel's documentation for _mm512_shrdv_epi32 (as linked in the rustdoc for the function) states:

Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst.

FOR j := 0 to 15
    i := j*32
    dst[i+31:i] := ((b[i+31:i] << 32)[63:0] | a[i+31:i]) >> (c[i+31:i] & 31)
ENDFOR
dst[MAX:512] := 0
That is, argument b supplies the upper bits and a supplies the lower bits. However, llvm.fshr.* uses the opposite operand order: its first operand supplies the upper bits of the concatenation. It appears Rust passes the arguments a, b, and c in that order to llvm.fshr, so a and b end up swapped relative to Intel's definition.

This likely also applies to all similar intrinsics that call llvm.fshr.
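A scalar model of both definitions (a sketch for illustration only, not the stdarch or LLVM source) makes the required mapping explicit: matching Intel's definition requires calling fshr(b, a, c) rather than fshr(a, b, c).

// Scalar model of llvm.fshr.i32: the first operand is the most significant half.
fn fshr32(hi: u32, lo: u32, s: u32) -> u32 {
    ((((hi as u64) << 32) | lo as u64) >> (s & 31)) as u32
}

// Scalar model of one lane of _mm512_shrdv_epi32, following Intel's pseudocode:
// b supplies the upper bits and a the lower bits.
fn shrdv_epi32_lane(a: u32, b: u32, c: u32) -> u32 {
    ((((b as u64) << 32) | a as u64) >> (c & 31)) as u32
}

fn main() {
    let (a, b, c) = (0xffff_u32, 0_u32, 1_u32);
    // Intel-compatible lowering: b must be passed as the high operand.
    assert_eq!(shrdv_epi32_lane(a, b, c), fshr32(b, a, c));
    // Passing the operands in intrinsic order (a, b, c) gives a different result.
    assert_ne!(shrdv_epi32_lane(a, b, c), fshr32(a, b, c));
}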
Meta
rustc --version --verbose:
rustc 1.83.0-nightly (0609062a9 2024-09-13)
binary: rustc
commit-hash: 0609062a91c8f445c3e9a0de57e402f9b1b8b0a7
commit-date: 2024-09-13
host: x86_64-unknown-linux-gnu
release: 1.83.0-nightly
LLVM version: 19.1.0