-
Notifications
You must be signed in to change notification settings - Fork 288
Remove requirement on neon feature for arm #893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Ya know that's a good point, we don't gate (but I don't have a deep knowledge, so there may be an explanation) |
The removed comment indicates a codegen issue, which I have no idea how to validate (I couldn't quite figure out what the tests do). I just removed the |
It appears the |
I've been playing around with this on godbolt, comparing with clang's output. C++ (optimized): https://godbolt.org/z/babdbh I'm not sure, but my guess is that rustc is generating slightly different IR and LLVM is failing to optimize it. Both produce sequences of |
Rust is passing the vector by reference instead of by value because the NEON target feature is not enabled globally. Specifically this paragraph in the RFC:
|
Thanks, that completely clears that up! Testing a somewhat contrived example (https://rust.godbolt.org/z/f4acxb) My uneducated guess is that this is appearing here because the If this is all expected, is it appropriate to expand the 22 instruction limit (or some other solution)? |
Yes, enabling the Rather than increasing the instruction limit I think you can just restore |
392bb96
to
6a57efd
Compare
Ok, that seems reasonable, I removed that change. |
As I noted in #889, the
neon
target feature shouldn't be required to use NEON on ARM (it should be possible to detect it at runtime).