Skip to content

Add checks for void pointer types to ensure consistency #1775

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 17, 2025

Conversation

sayantn
Copy link
Contributor

@sayantn sayantn commented Apr 13, 2025

For some intrinsics Intel specifies the pointer type as void* or void const*, but as Rust doesn't have void pointers, we need to choose a convention. This PR adds checks to ensure that convention.

This also fixes quite a few intrinsics (mainly in avx512vbmi2 and avx512f) that were not adhering to the conventions. The convention was chosen to maintain compatibility with SSE and AVX

@rustbot
Copy link
Collaborator

rustbot commented Apr 13, 2025

r? @Amanieu

rustbot has assigned @Amanieu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@jhorstmann
Copy link
Contributor

Thank you, I've been meaning to double check this too since stabilization of avx512 is getting closer. I probably added some inconsistency when contributing the masked load/store and compressstore intrinsics.

I remember a discussion that void* should map to *i8, but I think having this mapping depend on the name of the intrinsic would be more useful. Otherwise there would be little reason to have separate intrinsics for _mm_loadu_epi8/_mm_loadu_epi16/_mm_loadu_epi32/_mm_loadu_epi64 (all take a void const* parameter). I would be in favor of removing some of the special cases in the validation and using the suffix for all avx512 intrinsics (if there are inconsistencies with that rule for sse or avx intrinsics then we can't really fix them anymore).

@sayantn
Copy link
Contributor Author

sayantn commented Apr 14, 2025

@jhorstmann Currently there are 4 special cases

  • gather, scatter and compressstore take *u8
  • cvt{,s,us}epi{8,16,32,64}_store intrinsics take *i8 (this should be changed for epi{16,32,64} version)
  • mask{,z}_{load,store} and expandload with ps, pd and ph take *f32, *f64, *f16 respectively, which is consistent with the epi{8,16,32,64} versions
  • stream and stream_load intrinsics take with ps, pd, i32 and i64 take *f32, *f64, *i32 and *i64 respectively.

Among these, we can't change the convention of stream intrinsics, as some of them have already been stabilized. Same with gather and scatter. We can change the cvt_store and compressstore intrinsics, as they are avx512-exclusive.

My only concern with all these changes is that we might introduce a lot of compile failures, as loadu/storeu are very widely used. But again, these are nightly-only, with no guarantees

@sayantn sayantn force-pushed the voidptr-checks branch 2 times, most recently from 627a877 to c654682 Compare April 14, 2025 21:40
@sayantn
Copy link
Contributor Author

sayantn commented Apr 14, 2025

I have simplified the logic quite a bit, and made it more consistent. Now this PR only changes compressstore, cvt{,s,us}epixx_storeu, and _mm512_load{,u}_si512 intrinsics.

cc @jhorstmann @Amanieu

@jhorstmann
Copy link
Contributor

I like this much simpler and consistent logic! Would certainly welcome more users of these intrinsics to have a look though.

Among these, we can't change the convention of stream intrinsics, as some of them have already been stabilized. Same with gather and scatter.

The avx2 gather intrinsics look like they are typed according to the name suffix, it should only be the not yet stabilized avx512 gather/scatter instructions that are using *u8 and could still be changed.

It would cause compile errors in software I work on, but that's what I expect for using unstable features.

Too bad about the sse2 _mm_storeu_si/_mm_loadu_si intrinsics.

@sayantn
Copy link
Contributor Author

sayantn commented Apr 14, 2025

I believe the gather/scatter intrinsics should take *u8, because that conveys that the intrinsic can load from any address in vindex, without any regard to word boundaries or alignment. Having any other type would give the impression that it respects the type's alignment/word-boundary constraints.

One silver lining of the _mm_{load,store}u_si thing is that although they are not consistent with other intrinsics, they themselves are consistent with each other, all of them take *u8.

@jhorstmann
Copy link
Contributor

without any regard to word boundaries or alignment

I believe the same is true for the avx512 loadu/storeu intrinsics, their documentation states:

mem_addr does not need to be aligned on any particular boundary.

I agree that taking u8* for gather/scatter might make it more obvious that vindex is initially in units of bytes, but then that index is also multiplied by the scale parameter, which would usually match the size of the element type.

@sayantn
Copy link
Contributor Author

sayantn commented Apr 16, 2025

I feel like there is a fundamental difference between how gather/scatter uses the pointer vs how loadu/storeu does it. loadu/storeu say that they will read 512 bits from that address (or maybe less, due to masks, but they will read from that address only). That assumption doesn't work for gather/scatter - they might read/write anywhere, the pointer is just a convenient basepoint. But yeah, views differ, and I believe we should take opinions of more people who actually use avx512

@Amanieu
Copy link
Member

Amanieu commented Apr 16, 2025

My preference is towards having gather/scatter use the pointer type to indicate the type of data that is accessed for each element. The way the address is calculated can be explained in the documentation. There is precedent for this since we now have byte_add methods on pointers.

@sayantn
Copy link
Contributor Author

sayantn commented Apr 17, 2025

Ok I looked into this, and all stabilized gather/scatter intrinsics take pointer of element type (because Intel actually specified this, the xml file doesn't use void* for those). So ig we have to use pointer of element type in place of u8*

@Amanieu Amanieu added this pull request to the merge queue Apr 17, 2025
Merged via the queue into rust-lang:master with commit 7ae34fa Apr 17, 2025
60 checks passed
@sayantn sayantn deleted the voidptr-checks branch April 18, 2025 06:44
@bjorn3 bjorn3 mentioned this pull request Apr 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants