_mm512_shrdv_* intrinsics have incorrect argument order #130365

Closed
@as-com

I tried this code:

#![feature(stdarch_x86_avx512)]

use std::arch::x86_64::*;

fn main() {
    unsafe {
        let a = _mm512_set1_epi32(0xffff);
        let b = _mm512_setzero_epi32();
        let c = _mm512_set1_epi32(1);
    
        let dst = _mm512_shrdv_epi32(a, b, c);
        println!("{}", _mm512_cvtsi512_si32(dst));    
    }
}

I expected to see this happen:

The code produces the same output as the equivalent C program:

#include <immintrin.h>
#include <stdio.h>

int main() {
    __m512i a = _mm512_set1_epi32(0xffff);
    __m512i b = _mm512_setzero_epi32();
    __m512i c = _mm512_set1_epi32(1);

    __m512i dst = _mm512_shrdv_epi32(a, b, c);
    printf("%u\n", _mm512_cvtsi512_si32(dst));
}

The C program outputs 32767.

Instead, the Rust program outputs -2147483648 (bit pattern 0x80000000).


Intel's documentation (as linked in the rustdoc for the function) for _mm512_shrdv_epi32 states:

Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst.

FOR j := 0 to 15
	i := j*32
	dst[i+31:i] := ((b[i+31:i] << 32)[63:0] | a[i+31:i]) >> (c[i+31:i] & 31)
ENDFOR
dst[MAX:512] := 0

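meaning that argument b supplies the upper 32 bits of the 64-bit intermediate value and a supplies the lower 32 bits. However, llvm.fshr.* takes its operands in the opposite order: its first operand is the most significant half of the concatenation. Rust appears to pass a, b, and c to llvm.fshr in that order: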

https://github.com/rust-lang/stdarch/blob/b1edbf90955cb9b057a323f761e2c19edb591e6f/crates/core_arch/src/x86/avx512vbmi2.rs#L997-L999
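To make the mismatch concrete without needing AVX-512 hardware, here is a minimal scalar sketch of a single 32-bit lane. The function names `intel_shrdv_lane` and `llvm_fshr_lane` are hypothetical helpers written for this illustration, not part of stdarch. They model Intel's pseudocode above and the documented semantics of `llvm.fshr.i32` respectively.

```rust
/// Scalar model of one 32-bit lane of Intel's pseudocode for
/// `_mm512_shrdv_epi32`: `b` is the upper half of the 64-bit
/// intermediate value and `a` is the lower half.
/// (Hypothetical helper, for illustration only.)
fn intel_shrdv_lane(a: u32, b: u32, c: u32) -> u32 {
    let wide = ((b as u64) << 32) | (a as u64);
    (wide >> (c & 31)) as u32
}

/// Scalar model of `llvm.fshr.i32(hi, lo, shift)`: the FIRST operand
/// is the most significant half of the concatenation.
/// (Hypothetical helper, for illustration only.)
fn llvm_fshr_lane(hi: u32, lo: u32, shift: u32) -> u32 {
    let wide = ((hi as u64) << 32) | (lo as u64);
    (wide >> (shift % 32)) as u32
}

fn main() {
    // Same inputs as the reproduction above.
    let (a, b, c) = (0xffff_u32, 0_u32, 1_u32);

    // What Intel's pseudocode specifies.
    println!("{}", intel_shrdv_lane(a, b, c) as i32); // prints 32767

    // Passing (a, b, c) straight through to fshr.
    println!("{}", llvm_fshr_lane(a, b, c) as i32); // prints -2147483648

    // Swapping the first two operands matches Intel's result.
    println!("{}", llvm_fshr_lane(b, a, c) as i32); // prints 32767
}
```

Under this model, passing the operands through unchanged reproduces the value the Rust program prints, and swapping the first two operands reproduces the value the C program prints.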

This likely also applies to all similar intrinsics that call llvm.fshr.
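As a rough check of that guess, the same scalar model can be repeated for other lane widths. This sketch assumes the 16-bit and 64-bit shrdv variants follow the same pseudocode pattern as the 32-bit one, with only the lane width changed. All function names here are hypothetical helpers for illustration.

```rust
/// Intel-style model for a 64-bit lane: `b` is the upper half, `a` the lower.
/// (Assumes the 64-bit variant mirrors the 32-bit pseudocode.)
fn intel_shrdv_lane64(a: u64, b: u64, c: u64) -> u64 {
    let wide = ((b as u128) << 64) | (a as u128);
    (wide >> (c & 63)) as u64
}

/// Model of `llvm.fshr.i64(hi, lo, shift)`: first operand is the upper half.
fn llvm_fshr_lane64(hi: u64, lo: u64, shift: u64) -> u64 {
    let wide = ((hi as u128) << 64) | (lo as u128);
    (wide >> (shift % 64)) as u64
}

/// Intel-style model for a 16-bit lane: `b` is the upper half, `a` the lower.
/// (Assumes the 16-bit variant mirrors the 32-bit pseudocode.)
fn intel_shrdv_lane16(a: u16, b: u16, c: u16) -> u16 {
    let wide = ((b as u32) << 16) | (a as u32);
    (wide >> (c & 15)) as u16
}

/// Model of `llvm.fshr.i16(hi, lo, shift)`: first operand is the upper half.
fn llvm_fshr_lane16(hi: u16, lo: u16, shift: u16) -> u16 {
    let wide = ((hi as u32) << 16) | (lo as u32);
    (wide >> (shift % 16)) as u16
}

fn main() {
    // 64-bit lanes: unswapped operands disagree, swapped operands agree.
    let (a, b, c) = (0xffff_u64, 0_u64, 1_u64);
    assert_ne!(llvm_fshr_lane64(a, b, c), intel_shrdv_lane64(a, b, c));
    assert_eq!(llvm_fshr_lane64(b, a, c), intel_shrdv_lane64(a, b, c));

    // 16-bit lanes: same pattern.
    let (a, b, c) = (0xffff_u16, 0_u16, 1_u16);
    assert_ne!(llvm_fshr_lane16(a, b, c), intel_shrdv_lane16(a, b, c));
    assert_eq!(llvm_fshr_lane16(b, a, c), intel_shrdv_lane16(a, b, c));

    println!("all widths show the same operand-order mismatch");
}
```

This only exercises the scalar models, so it says nothing about which intrinsics in stdarch are actually affected; each one would still need to be checked against its own documentation.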

Meta

rustc --version --verbose:

rustc 1.83.0-nightly (0609062a9 2024-09-13)
binary: rustc
commit-hash: 0609062a91c8f445c3e9a0de57e402f9b1b8b0a7
commit-date: 2024-09-13
host: x86_64-unknown-linux-gnu
release: 1.83.0-nightly
LLVM version: 19.1.0

Labels: A-SIMD, A-intrinsics, C-bug, O-x86_64