Description
Some stdarch Neon tests have started failing, I think due to a code generation error. An example is test_vld2q_dup_f32
, and this is the first one hit with an unfiltered cargo test
.
I've not yet been able to reduce the stdarch test down to a standalone example. (At least, not one that also triggers this bug.) Inside the Docker container that stdarch uses for its tests (ci/run-docker.sh armv7-unknown-linux-gnueabihf
):
$ export RUSTFLAGS="-D warnings -Z merge-functions=disabled -Ctarget-feature=+neon"
$ cargo test --manifest-path=crates/core_arch/Cargo.toml --target=armv7-unknown-linux-gnueabihf --release test_vld2q_dup_f32
...
failures:
---- core_arch::arm_shared::neon::generated::test::test_vld2q_dup_f32 stdout ----
thread 'core_arch::arm_shared::neon::generated::test::test_vld2q_dup_f32' panicked at 'assertion failed: `(left == right)`
left: `[f32x4(1.9991541, 1.9991536, 1.0, 1.0), f32x4(0.0, 0.0, 1.0, 1.0)]`,
right: `[f32x4(1.0, 1.0, 1.0, 1.0), f32x4(1.0, 1.0, 1.0, 1.0)]`', crates/core_arch/src/arm_shared/neon/generated.rs:34910:9
...
The first two lanes of each result vector actually seem to be uninitialised; in my brief investigation I've seen different values there.
One curiosity is that the the symbol (test_vld2q_dup_f32
) doesn't actually exist in the generated ELF, but I think I found the code using GDB (named core::ops::function::FnOnce::call_once::h81d7cb0a06a6ffb8
). I assume this is a consequence of the optimisation path, or some kind of inlining. In any case, the code does look incorrect, at a glance:
a737c: f4e30daf vld2.32 {d16[],d18[]}, [r3]
a7380: f4a33daf vld2.32 {d3[],d5[]}, [r3]
a7384: eeb42a40 vcmp.f32 s4, s0
...
(then chained comparisons of `s5, s0`, `s6, s0` etc)
That vcmp.f32
compares s4
(which is the lower 32 bits of d2
), but we never loaded that. (It isn't written earlier in the function either.)
A good compilation stores the results back to an array an calls core::array::equality::<impl core::cmp::PartialEq<[B; N]> for [A; N]>::eq
, so perhaps this is an inlining error.
Meta
The first failing nightly is nightly-2023-06-02
(d59363ad0
), but I bisected the problem down to 89bf30e73d97. However, that commit simply enabled an optimisation pass. Manually enabling the pass in older nightlies shows exactly the same fault.
The following reproduces the bug on older nightlies (at least back to nightly-2023-03-25
):
$ export RUSTFLAGS="-D warnings -Z merge-functions=disabled -Ctarget-feature=+neon -Zmir-opt-level=4"
$ cargo test --manifest-path=crates/core_arch/Cargo.toml --target=armv7-unknown-linux-gnueabihf --release test_vld2q_dup_f32
...
I think this is probably a rustc bug, but I have not ruled out a bug in stdarch or some SIMD infrastructure it uses.