Skip to content

Performance of BinaryHeap::into_sorted_vec vs. Vec::sort_unstable #115357

Open
@ghost

Description

I think the implementation of BinaryHeap::into_sorted_vec should just use Vec::sort_unstable instead of sorting by repeated Heap operations. For very small N (N < 1000), the current implementation is maybe 5% faster some times, but for bigger N Vec::sort_unstable will be more than 10 to 30 times faster.

Also, I assume sorting by Heap operations requires 2*N*Log2(N) key comparisons in average and worst case when using the default Heapsort like in the standard library, while Vec::sort_unstable probably requires less than 1.5*N*Log2(N) key comparisons in average case and also a lower number of moves in average case.

I don't have good statistics, maybe others could add some.

Here is an example, to show the performance difference:

/*
[dependencies]
fastrand = "2.0.0"
*/

use std::{collections::BinaryHeap, time::Instant};

fn main() {
    let N: usize = 10_000_000;
    let mut a_elements_in_random_order: Vec<usize> = Vec::from_iter(0..N);
    fastrand::shuffle(&mut a_elements_in_random_order);
    let mut heap: BinaryHeap<usize> = BinaryHeap::with_capacity(N);
    for i in 0..N {
        heap.push(a_elements_in_random_order[i]);
    }
    // both Vecs are now identical
    let mut a_sorted_by_sort_unstable = heap.clone().into_vec();
    let now = Instant::now();
    a_sorted_by_sort_unstable.sort_unstable();
    let runtime_sort_unstable = now.elapsed();
    let now = Instant::now();
    let a_sorted_by_heap = heap.into_sorted_vec();
    let runtime_sorted_by_heap = now.elapsed();
    assert!(a_sorted_by_sort_unstable == a_sorted_by_heap);
    println!("runtime_sort_unstable={:?}", runtime_sort_unstable);
    println!("runtime_sorted_by_heap={:?}", runtime_sorted_by_heap);
}
N=10_000_000
runtime_sort_unstable=403.8169ms
runtime_sorted_by_heap=4.6621243s
N=100_000_000
runtime_sort_unstable=3.6869678s
runtime_sorted_by_heap=95.9239879s

Metadata

Metadata

Assignees

No one assigned

    Labels

    C-enhancementCategory: An issue proposing an enhancement or a PR with one.C-optimizationCategory: An issue highlighting optimization opportunities or PRs implementing suchI-slowIssue: Problems and improvements with respect to performance of generated code.T-libsRelevant to the library team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions