Noticeable performance regression since the last LLVM update. #8665

Closed
@sebcrozet

I have been seeing a huge performance regression on all my projects since #8328 landed (revision a8c3fe4): things are 50 to 75% slower. I’m pretty sure #8328 is the cause, because performance goes back to normal when I revert the compiler to the revision right before it (67c954e).

For what it’s worth, the affected projects are 100% generic and rely heavily on cross-crate inlining. They do a lot of numeric computation and array indexing. Sorry for being a bit vague, but I cannot run my projects under valgrind because my valgrind started to segfault a few days ago (perhaps since jemalloc was re-enabled)…

I tried to come up with a small benchmark exhibiting the problem. The difference is less dramatic than in my projects, but the following already shows a noticeable performance regression:

extern mod extra;

use std::hashmap;
use extra::test::BenchHarness;

#[bench]
fn bench_insert_std(bh: &mut BenchHarness) {
    let mut m = hashmap::HashMap::with_capacity(32);

    do bh.iter {
        for i in range(0u, 500) {
            m.insert((i, i), i);
        }
    }
}

#[bench]
fn bench_insert_find_remove_std(bh: &mut BenchHarness) {
    let mut m = hashmap::HashMap::with_capacity(32);

    do bh.iter {
        for i in range(0u, 200) {
            m.insert((i, i), i);
        }

        for i in range(0u, 200) {
            assert!(*m.find(&(i, i)).unwrap() == i)
        }

        for i in range(100u, 200) {
            m.remove(&(i, i));
        }

        for i in range(100u, 200) {
            assert!(m.find(&(i, i)).is_none())
        }

        for i in range(0u, 100) {
            m.insert((i, i), i * 2);
        }

        for i in range(0u, 100) {
            assert!(*m.find(&(i, i)).unwrap() == i * 2)
        }

        for i in range(0u, 100) {
            m.remove(&(i, i));
        }

        for i in range(0u, 100) {
            assert!(m.find(&(i, i)).is_none())
        }
    }
}

fn main() {
} 

Compiled with --opt-level=3.
With the (new) compiler a8c3fe4, I get:

test bench_insert_find_remove_std ... bench: 89242 ns/iter (+/- 3605)
test bench_insert_std ... bench: 46177 ns/iter (+/- 1555)

With the (old) compiler 67c954e, both benchmarks take about 17% less time per iteration. The asm dump is smaller too:

test bench_insert_find_remove_std ... bench: 73939 ns/iter (+/- 2872)
test bench_insert_std ... bench: 38482 ns/iter (+/- 1005)
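
To put a number on the gap between the two runs, here is a minimal sketch that computes the relative difference from the ns/iter figures reported above. It is written in current Rust rather than the dialect of the benchmark, and the helper names `slowdown_percent` and `time_saved_percent` are hypothetical, introduced only for illustration.

```rust
/// Percentage by which `new_ns` exceeds `old_ns` (positive means the new build is slower).
fn slowdown_percent(old_ns: f64, new_ns: f64) -> f64 {
    (new_ns - old_ns) / old_ns * 100.0
}

/// Percentage of time per iteration saved by the old build relative to the new one.
fn time_saved_percent(old_ns: f64, new_ns: f64) -> f64 {
    (new_ns - old_ns) / new_ns * 100.0
}

fn main() {
    // (benchmark, ns/iter on 67c954e, ns/iter on a8c3fe4), copied from the results above.
    let results = [
        ("bench_insert_find_remove_std", 73939.0, 89242.0),
        ("bench_insert_std", 38482.0, 46177.0),
    ];

    for &(name, old_ns, new_ns) in results.iter() {
        println!(
            "{}: {:.1}% slower on a8c3fe4, {:.1}% less time on 67c954e",
            name,
            slowdown_percent(old_ns, new_ns),
            time_saved_percent(old_ns, new_ns)
        );
    }
}
```

With these figures, the new compiler comes out roughly 20% slower on both benchmarks. That is well outside the reported +/- spread of a few thousand ns/iter.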
