
Miscompilation of AVX2 code under --release #79865

Closed
@tarcieri

Description


Apologies for not having a minimal reproduction: this bug occurs inside some complicated AVX2 code and was extremely difficult to even isolate.

The bug is causing the wrong values to be computed. Whether or not it occurs depends on the following conditions:

  • With target-cpu unset, the bug does NOT occur in debug builds, but DOES occur with --release
  • With target-cpu=haswell, the bug does NOT occur: both debug and --release builds are OK
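
One way to see what differs between the two configurations is to check which target features are enabled for the whole crate at compile time. This is only an illustrative sketch and is not taken from the poly1305 crate; the helper name avx2_statically_enabled is hypothetical. With target-cpu unset on a typical x86_64 target it should return false, and with -Ctarget-cpu=haswell it should return true, since Haswell includes AVX2.

```rust
/// Hypothetical helper (not part of poly1305): reports whether AVX2 was
/// enabled for the whole crate at compile time, for example via
/// RUSTFLAGS="-Ctarget-cpu=haswell".
pub fn avx2_statically_enabled() -> bool {
    // cfg! is resolved by the compiler, so this is a constant in the binary
    // and says nothing about the CPU the binary later runs on.
    cfg!(target_feature = "avx2")
}
```

Printing the result of avx2_statically_enabled() from a test in each configuration would confirm which build actually has AVX2 enabled globally.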

I can attempt to further isolate and reduce the problem, but there's a lot of spooky action at a distance happening, which makes that rather difficult.

For now, here is the best reproduction I can provide:

EDIT: I've deleted the poly1305/avx2-bug branch as there is now a much smaller repro, but so long as GitHub hasn't GC'd it here's the original commit:

RustCrypto/universal-hashes@7485010

git clone https://github.com/RustCrypto/universal-hashes
cd universal-hashes/poly1305
git checkout poly1305/avx2-bug

NOTE: if you git show from here, I've included lots of notes in the latest commit about the bug in the commit message. The commit also contains comments indicating lines you can comment or uncomment to make the tests succeed or fail.

Commands to run which DON'T trigger the bug

  • cargo test donna_self_test1 -- --nocapture
  • RUSTFLAGS="-Ctarget-cpu=haswell" cargo test donna_self_test1 --release -- --nocapture

Commands to run which DO trigger the bug

NOTE: as this is a bug in the AVX2 backend, you'll need to run it on an AVX2-capable host to trigger the bug.

  • cargo test donna_self_test1 --release -- --nocapture

This test fails with a miscomputed result (as do all of the tests across the board if you run the whole suite):

thread 'donna_self_test1' panicked at 'assertion failed: `(left == right)`
  left: `[3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]`,
 right: `[254, 255, 255, 255, 255, 255, 239, 255, 255, 63, 0, 0, 0, 254, 255, 255]`', poly1305/tests/lib.rs:47:5
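
As noted above, the failure only shows up on an AVX2-capable host. A quick way to check the machine you are testing on is the standard library's runtime feature detection. The sketch below is an assumption about how one might do that; the function name host_has_avx2 is hypothetical and does not come from the repository.

```rust
/// Hypothetical helper: reports whether the CPU running this code supports AVX2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn host_has_avx2() -> bool {
    // Runtime check provided by the standard library on x86/x86_64.
    std::is_x86_feature_detected!("avx2")
}

/// On non-x86 targets AVX2 can never be available.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn host_has_avx2() -> bool {
    false
}
```

If host_has_avx2() returns false, the --release test run is not expected to exercise the AVX2 backend at all.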

Things which mysteriously make the tests pass

The aforementioned cargo test ... --release ... will pass if any of the following things are changed. They are documented in the 74850109 commit message (git show) and in comments introduced in that commit:

  • A commented out dbg! statement near the first observation of the miscompilation is uncommented (heisenbug!)
  • The #[target_feature(enable = "avx2")] attribute on the finalize function is commented out. This function is in a completely different module, hence my descriptions of "spooky action at a distance" (the function in which the bug is occurring is annotated #[inline(always)], but the bug still occurs if that attribute is commented out)
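
For readers unfamiliar with the attribute mentioned above, here is a minimal sketch of the general #[target_feature(enable = "avx2")] pattern: one function is compiled with AVX2 enabled, and a safe wrapper only calls it after a runtime check. This is not the poly1305 code and is not a reproduction of the bug; the avx2 module and the add_lanes functions are hypothetical names chosen for illustration.

```rust
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
mod avx2 {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    /// Hypothetical example: AVX2 is enabled for this function only,
    /// regardless of the crate-wide target-cpu setting.
    ///
    /// Safety: the caller must ensure the running CPU supports AVX2.
    #[target_feature(enable = "avx2")]
    pub unsafe fn add_lanes(a: &[i32; 8], b: &[i32; 8]) -> [i32; 8] {
        let mut out = [0i32; 8];
        // Unaligned loads and stores, so plain arrays are fine here.
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        let sum = _mm256_add_epi32(va, vb); // lane-wise wrapping add
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
        out
    }
}

/// Safe wrapper: uses the AVX2 path when the host supports it,
/// otherwise falls back to scalar code with the same wrapping semantics.
pub fn add_lanes(a: &[i32; 8], b: &[i32; 8]) -> [i32; 8] {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at run time.
            return unsafe { avx2::add_lanes(a, b) };
        }
    }
    let mut out = [0i32; 8];
    for i in 0..8 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
```

Both paths should return the same result for any input, which is the property the failing release build in this issue violates for the real code.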

Meta

This bug is easily reproducible and occurs on all versions of the Rust compiler and all operating systems I've tried. I've reproduced it locally on macOS and it also occurred on Linux/Ubuntu via GitHub Actions.

Here are some compiler versions I've tried:

rustc 1.48.0 (7eac88abb 2020-11-16)

Latest nightly as of opening this ticket:

rustc 1.50.0-nightly (1700ca07c 2020-12-08)

It also fails in CI, which tests against the MSRV of 1.41.0.

Metadata

    Labels

      • A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
      • A-SIMD (Area: SIMD (Single Instruction Multiple Data))
      • C-bug (Category: This is a bug.)
      • E-needs-test (Call for participation: An issue has been fixed and does not reproduce, but no test has been added.)
      • I-unsound (Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness)
      • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
