Complete avx512vbmi2 #1279

Merged
merged 1 commit into from
Feb 6, 2022

minybot
Contributor

@minybot minybot commented Jan 30, 2022

expandloadu_epi16,epi8
mask_compressstoreu_epi16,epi8
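
For readers unfamiliar with these operations, here is a minimal scalar sketch of what a masked expand load and a masked compress store do on 16-bit lanes. The function names `expand_load_epi16` and `compress_store_epi16` are hypothetical and are not the intrinsics added in this PR; the sketch assumes that mask bit i controls lane i and that unselected lanes keep the value from `src`.

```rust
/// Scalar model of a masked expand load over 32 lanes of i16 (hypothetical helper).
/// For each lane whose mask bit is set, the next contiguous element of `mem`
/// is placed in that lane; lanes with a clear bit keep the value from `src`.
fn expand_load_epi16(src: [i16; 32], mask: u32, mem: &[i16]) -> [i16; 32] {
    let mut out = src;
    let mut next = 0; // index of the next unread element in `mem`
    for lane in 0..32 {
        if (mask >> lane) & 1 == 1 {
            out[lane] = mem[next];
            next += 1;
        }
    }
    out
}

/// Scalar model of a masked compress store (hypothetical helper).
/// Lanes whose mask bit is set are written contiguously to `mem`;
/// the return value is the number of elements written.
fn compress_store_epi16(mem: &mut [i16], mask: u32, a: [i16; 32]) -> usize {
    let mut next = 0; // index of the next free slot in `mem`
    for lane in 0..32 {
        if (mask >> lane) & 1 == 1 {
            mem[next] = a[lane];
            next += 1;
        }
    }
    next
}
```

In this model, only as many memory elements as there are set mask bits are read or written, which is why the operations are useful for packing and unpacking sparse data.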

@rust-highfive

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

@jhorstmann
Contributor

Nice, I also had those on my todo list but did not get around to implementing them yet. Note that there are also llvm intrinsics for expand load (examples 1, 2) which might lead to a bit simpler code and automatically support 32-bit pointers.

@minybot
Contributor Author

minybot commented Feb 3, 2022

> Nice, I also had those on my todo list but did not get around to implementing them yet. Note that there are also llvm intrinsics for expand load (examples 1, 2) which might lead to a bit simpler code and automatically support 32-bit pointers.

Thanks. I will check those links.
After this, I will try to finish the remaining avx512bw ones if I have time over the weekend.
Thanks for your help with avx512f.

@Amanieu Amanieu merged commit 75e32ca into rust-lang:master Feb 6, 2022
@minybot
Contributor Author

minybot commented Feb 6, 2022

> Nice, I also had those on my todo list but did not get around to implementing them yet. Note that there are also llvm intrinsics for expand load (examples 1, 2) which might lead to a bit simpler code and automatically support 32-bit pointers.

I checked the LLVM code. We need to use asm! for expandload because the intrinsic's mask is a vector of "i1":
llvm.masked.expandload.v32i16(i16* %1, <32 x i1> %2, <32 x i16> %0)

@minybot minybot deleted the avx512vbmi2 branch February 6, 2022 17:02
@jhorstmann
Contributor

There are two variants: the architecture-independent ones with i1 masks, and the avx512-specific ones, like llvm.x86.avx512.mask.expand.load.w.512(i8* %addr, <32 x i16> %data, i32 %mask). I don't know whether the avx512-specific ones are deprecated; if they are, asm would probably be the better option.
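
The two mask forms carry the same information: a `<32 x i1>` vector holds one boolean per lane, while an `i32` mask packs those booleans into bits. A minimal sketch of the correspondence, assuming bit i of the integer maps to lane i (the helper names `mask_to_lanes` and `lanes_to_mask` are hypothetical and exist only for illustration):

```rust
/// Unpack a 32-bit mask into one boolean per lane (hypothetical helper).
/// Assumes bit i of `mask` corresponds to lane i.
fn mask_to_lanes(mask: u32) -> [bool; 32] {
    let mut lanes = [false; 32];
    for (i, lane) in lanes.iter_mut().enumerate() {
        *lane = (mask >> i) & 1 == 1;
    }
    lanes
}

/// Pack one boolean per lane back into a 32-bit mask (hypothetical helper).
fn lanes_to_mask(lanes: [bool; 32]) -> u32 {
    lanes
        .iter()
        .enumerate()
        .fold(0u32, |m, (i, &b)| m | ((b as u32) << i))
}
```

The conversion round-trips for every `u32`, so the choice between the two intrinsic families is about how the mask is passed, not about what it can express.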

@minybot
Contributor Author

minybot commented Feb 6, 2022

> There are two variants, the architecture independent ones with i1 masks and the avx512 specific ones, like llvm.x86.avx512.mask.expand.load.w.512(i8* %addr, <32 x i16> %data, i32 %mask). I don't know whether the avx512 specific ones are deprecated, that would probably make asm the better option.

In this situation, what should we do? If there is an LLVM intrinsic, I would prefer to use it.
