
AVX-512 intrinsics #146

Closed
@hdevalence


I tried getting an AVX-512 intrinsic to work and ran into a bunch of difficulties. Some points:

  • It looks like the combination of AVX512's mask registers and AVX512VL (which lets AVX512 instructions operate on 128- and 256-bit vectors) means that most instructions have one C intrinsic for each combination of {no mask, write mask, zero mask} × {xmm, ymm, zmm}.

  • These would probably be good to generate with a macro?

  • Because AVX512 uses mask registers, the constify! macro hacks are probably not needed for mask instructions.

  • The list of intrinsics linked in the readme doesn't seem to have non-masked versions; I don't know if this is just an accident of how it was made.

  • Trying to use the int_x86_avx512_mask_pmul_dq_512 intrinsic from that list using

#[link_name = "llvm.x86.avx512.mask.pmul.dq.512"]
fn mask_pmul_dq_512(a: i32x16, b: i32x16, src: i64x8, k: i8) -> i64x8;

didn't work, failing with

rustc: /checkout/src/llvm/include/llvm/Support/Casting.h:236: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = llvm::VectorType; Y = llvm::Type; typename llvm::cast_retty<X, Y*>::ret_type = llvm::VectorType*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

which I guess means I was linking to the intrinsic incorrectly?
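For reference, here is a scalar sketch of what I understand the write-masked vpmuldq to compute, which is why the declaration above takes two i32x16 inputs, an i64x8 source, and an 8-bit mask. Plain arrays stand in for the vector types, the name mask_pmul_dq_512_model is made up for illustration, and the mask is a u8 rather than an i8 to keep the bit shifts simple. It models the semantics only and says nothing about how the LLVM intrinsic has to be linked.

```rust
/// Scalar reference model of a write-masked `vpmuldq` on 512-bit vectors.
/// Hypothetical helper: arrays stand in for `i32x16` / `i64x8`.
fn mask_pmul_dq_512_model(a: [i32; 16], b: [i32; 16], src: [i64; 8], k: u8) -> [i64; 8] {
    // Lanes whose mask bit is clear keep the value from `src` (write masking).
    let mut out = src;
    for lane in 0..8 {
        if (k >> lane) & 1 == 1 {
            // Each 64-bit result lane multiplies the low (even-indexed) 32-bit
            // elements of the inputs, sign-extended to 64 bits.
            out[lane] = i64::from(a[2 * lane]) * i64::from(b[2 * lane]);
        }
    }
    out
}

fn main() {
    let mut a = [0i32; 16];
    let mut b = [0i32; 16];
    for i in 0..16 {
        a[i] = i as i32 - 4;
        b[i] = 3;
    }
    // Only lanes 0 and 2 are selected by the mask; the rest come from `src`.
    let r = mask_pmul_dq_512_model(a, b, [-1i64; 8], 0b0000_0101);
    println!("{:?}", r);
}
```

A zero-masked variant would be the same loop starting from an all-zero `out` instead of `src`.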

@alexcrichton reduced this to a minimal example for the vpmuldq instruction: https://godbolt.org/g/VMCtYy and found https://github.com/rust-lang/rust/blob/4c053db233d69519b548e5b8ed7192d0783e582a/src/librustc_trans/cabi_x86_64.rs#L30-L31, which hardcodes the largest vector size as 256 bits (the size of a ymm register).
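Coming back to the macro idea from the list above, here is a rough sketch of how one macro invocation per vector width could emit all three mask flavours. Everything in it is an assumption for illustration: the macro name masked_variants!, the generated function names, and the use of plain arrays with a scalar per-lane operation instead of real SIMD types and LLVM intrinsics. The argument order (src, k, a, b for write masking; k, a, b for zero masking) follows the C intrinsics.

```rust
/// Hypothetical macro: emits the unmasked, write-masked, and zero-masked
/// variants of a lane-wise binary operation for one vector width.
macro_rules! masked_variants {
    ($plain:ident, $mask:ident, $maskz:ident, $elem:ty, $lanes:expr, $op:expr) => {
        #[allow(dead_code)]
        fn $plain(a: [$elem; $lanes], b: [$elem; $lanes]) -> [$elem; $lanes] {
            let mut out = a;
            for i in 0..$lanes {
                out[i] = ($op)(a[i], b[i]);
            }
            out
        }

        /// Write mask: lanes with a clear mask bit are copied from `src`.
        #[allow(dead_code)]
        fn $mask(
            src: [$elem; $lanes],
            k: u16,
            a: [$elem; $lanes],
            b: [$elem; $lanes],
        ) -> [$elem; $lanes] {
            let full = $plain(a, b);
            let mut out = src;
            for i in 0..$lanes {
                if (k >> i) & 1 == 1 {
                    out[i] = full[i];
                }
            }
            out
        }

        /// Zero mask: lanes with a clear mask bit are set to zero.
        #[allow(dead_code)]
        fn $maskz(k: u16, a: [$elem; $lanes], b: [$elem; $lanes]) -> [$elem; $lanes] {
            $mask([<$elem as Default>::default(); $lanes], k, a, b)
        }
    };
}

// One invocation per register width: xmm (4 x i32), ymm (8 x i32), zmm (16 x i32).
masked_variants!(add_epi32_x4, mask_add_epi32_x4, maskz_add_epi32_x4, i32, 4, i32::wrapping_add);
masked_variants!(add_epi32_x8, mask_add_epi32_x8, maskz_add_epi32_x8, i32, 8, i32::wrapping_add);
masked_variants!(add_epi32_x16, mask_add_epi32_x16, maskz_add_epi32_x16, i32, 16, i32::wrapping_add);

fn main() {
    let a = [1, 2, 3, 4];
    let b = [10, 20, 30, 40];
    println!("{:?}", add_epi32_x4(a, b));
    println!("{:?}", mask_add_epi32_x4([-1; 4], 0b0101, a, b));
    println!("{:?}", maskz_add_epi32_x4(0b0101, a, b));
}
```

Plain macro_rules! cannot build identifiers by concatenation, so the three names are passed in explicitly here; a real version would probably want something smarter for naming.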
