Description
I've observed a fairly nasty compression performance regression when switching to 1.82, which appears to boil down to some very weird/suboptimal machine code being generated.
Digging in, it looks like what was previously reasonable generated code is now turning into... well, something far less reasonable, and this results in executing a vastly higher number of instructions for the same task, in turn slowing everything down. The MVCE (linked below) reliably reproduces the regression on at least two instruction sets (x86-64 and ARM64) and across several microarchitectures (AMD Zen 2, Zen 4, and Apple M2).
Some other folks who were helping me debug this will be posting more information as well, but the biggest initial suspect is the upgrade to LLVM 19 in 1.82.
Code
The MVCE is here: https://github.com/tobz/miniz-oxide-slowdown-repro.
It's a simple program that generates a deterministic input corpus, compresses it multiple times (using flate2/miniz_oxide), and then exits. Effort was made to ensure the workload is deterministic and mostly representative of the observed change between Rust 1.81 and 1.82.
In my environment, this all boils down to each compress operation taking ~750ms when compiled on 1.81.0 or earlier, jumping to ~1.25s per operation on 1.82.0 and later, a slowdown of over 60%.
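For a sense of the shape of the workload, here's a minimal sketch of the kind of loop being timed. To be clear, this is not the code from the linked repository: the corpus size, iteration count, and the LCG-based corpus generator are placeholders chosen purely for illustration. flate2's default pure-Rust backend is miniz_oxide, so this exercises the same code path.

```rust
use std::io::Write;
use std::time::Instant;

use flate2::write::ZlibEncoder;
use flate2::Compression;

fn main() {
    // Deterministic, low-entropy corpus from a fixed-seed LCG, so every run
    // (and every toolchain) compresses byte-identical input. Masking to six
    // bits keeps the data compressible enough that the deflate match finder
    // actually does meaningful work. Seed and size are arbitrary.
    let mut state: u64 = 0x5DEECE66D;
    let corpus: Vec<u8> = (0..64 * 1024 * 1024)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            ((state >> 33) as u8) & 0x3F
        })
        .collect();

    // Compress the same corpus several times, timing each pass.
    for i in 0..5 {
        let start = Instant::now();
        let mut encoder = ZlibEncoder::new(Vec::new(), Compression::default());
        encoder.write_all(&corpus).expect("compression write failed");
        let compressed = encoder.finish().expect("compression finish failed");
        println!(
            "pass {i}: {:?} ({} -> {} bytes)",
            start.elapsed(),
            corpus.len(),
            compressed.len()
        );
    }
}
```

Building a sketch like this with `cargo +1.81.0 build --release` and `cargo +1.82.0 build --release` (rustup's toolchain override syntax) is a quick way to compare the two compilers side by side on the same source.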
Version it worked on
This "worked" (ran well) on 1.81.0, 1.80.0, and 1.79.0 (as far back as I checked).
Version with regression
`rustc --version --verbose`:

```
rustc 1.82.0 (f6e511eec 2024-10-15)
binary: rustc
commit-hash: f6e511eec7342f59a25f7c0534f1dbea00d01b14
commit-date: 2024-10-15
host: x86_64-unknown-linux-gnu
release: 1.82.0
LLVM version: 19.1.1
```