Description
I have been seeing a huge performance regression on all my projects since #8328 landed (revision a8c3fe4). Things are 50 to 75% slower. I’m pretty sure #8328 is the cause, since performance goes back to normal when I revert the compiler to the revision right before it (67c954e).
For what it’s worth, the affected projects are 100% generic and rely heavily on cross-crate inlining. They do a lot of numeric computation and array indexing. Sorry for being a bit vague, but I cannot run valgrind on my projects because my valgrind started to segfault a few days ago (perhaps since jemalloc was re-enabled)…
I tried to come up with a small benchmark exhibiting the problem. The effect is less pronounced than in my projects, but the following already shows a noticeable performance regression:
extern mod extra;

use std::hashmap;
use extra::test::BenchHarness;

#[bench]
fn bench_insert_std(bh: &mut BenchHarness) {
    let mut m = hashmap::HashMap::with_capacity(32);

    do bh.iter {
        for i in range(0u, 500) {
            m.insert((i, i), i);
        }
    }
}

#[bench]
fn bench_insert_find_remove_std(bh: &mut BenchHarness) {
    let mut m = hashmap::HashMap::with_capacity(32);

    do bh.iter {
        for i in range(0u, 200) {
            m.insert((i, i), i);
        }

        for i in range(0u, 200) {
            assert!(*m.find(&(i, i)).unwrap() == i)
        }

        for i in range(100u, 200) {
            m.remove(&(i, i));
        }

        for i in range(100u, 200) {
            assert!(m.find(&(i, i)).is_none())
        }

        for i in range(0u, 100) {
            m.insert((i, i), i * 2);
        }

        for i in range(0u, 100) {
            assert!(*m.find(&(i, i)).unwrap() == i * 2)
        }

        for i in range(0u, 100) {
            m.remove(&(i, i));
        }

        for i in range(0u, 100) {
            assert!(m.find(&(i, i)).is_none())
        }
    }
}

fn main() {
}
Compiled with --opt-level=3.
With the (new) compiler a8c3fe4, I get:
test bench_insert_find_remove_std ... bench: 89242 ns/iter (+/- 3605)
test bench_insert_std ... bench: 46177 ns/iter (+/- 1555)
With the (old) compiler 67c954e, the asm dump is smaller and both benchmarks are more than 10% faster (about 17% less time per iteration):
test bench_insert_find_remove_std ... bench: 73939 ns/iter (+/- 2872)
test bench_insert_std ... bench: 38482 ns/iter (+/- 1005)
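To put a number on the gap, the relative slowdown can be computed from the ns/iter figures above. This is only a small sketch: it is written in present-day Rust rather than the dialect used by the benchmark, and `slowdown_percent` is a hypothetical helper introduced for illustration. It does not take the reported +/- noise into account.

```rust
/// Percentage by which `new_ns` exceeds `old_ns`.
/// Hypothetical helper, not part of any library.
fn slowdown_percent(old_ns: f64, new_ns: f64) -> f64 {
    (new_ns - old_ns) / old_ns * 100.0
}

fn main() {
    // (benchmark, ns/iter with 67c954e, ns/iter with a8c3fe4),
    // copied from the bench output above.
    let results = [
        ("bench_insert_find_remove_std", 73939.0, 89242.0),
        ("bench_insert_std", 38482.0, 46177.0),
    ];

    for &(name, old_ns, new_ns) in results.iter() {
        println!("{}: {:.1}% slower", name, slowdown_percent(old_ns, new_ns));
    }
    // prints:
    // bench_insert_find_remove_std: 20.7% slower
    // bench_insert_std: 20.0% slower
}
```

So on this small benchmark the new compiler produces code that is about 20% slower, which is well outside the reported noise but still far from the 50 to 75% I see on my projects.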