Skip to content

Benchmarks

Morwenn edited this page Dec 21, 2015 · 25 revisions

Benchmarking is hard and I might not be doing it right. Moreover, benchmarking sorting algorithms highlights that the time needed to sort a collection of elements depends on several things: the type to sort, the size of collection, the cost of comparing two values, the cost of moving two values, the distribution of the values in the collection to sort, the collection itself, etc... The aim of this page is to help you choose a sorting algorithm depending on your needs. There is no clear structure to this page: we will mostly see different benchmarks and try to analyze them a bit.

All of the graphs on this page have been generated with slightly modified versions of the scripts found in the project's benchmarks folder. There are just too many things to check; if you ever want a specific benchmark, don't hesitate to ask for it.

Random-access iterables

Most sorting algorithms are designed to work with random-access iterators, so this section is deemed to be bigger than the other ones. Note that many of the algorithms work better with contiguous iterators; the resuts for std::vector and std::deque are probably different.

Unstable sort for std::vector with 10⁶ int

As we can see, even though its algorithmic complexity is excellent, smooth_sorter is simply too slow on average, as is heap_sorter has the same problems. std_sorter actually performs worse than expected (the libstdc++ one). The best sorting algorithms for this configuration are pdq_sorter, quick_sorter, spread_sorter and verge_sorter.

spread_sorter is by far the best sorter when it comes to sorting truly random collections and is excellent when the number of values in the collection is low (16 values in the examples). When there are pipe-organ patterns, verge_sorter seems to be the best match. That said, both of them use additional memory; pqd_sorter is the best sorter if a small memory footprint is needed.

Unstable sort for std::vector with 10⁷ int

I removed smooth_sorter and heap_sorter from this benchmark since their performance wasn't better than in the previous and made the graph harder to read. Compared to the previous graph, a few things have changed: for example, while it's still better than anything else for random distrubutions, spread_sorter has bad performance for the Alternating distribution. My guess is that it has to do with the fact that it's a radix-sort kind of algorithm and the fact that the Alternating case has both negative values and the widest range of values among the distributions doesn't help.

Both quick_sorter and std_sorter seem to reach some kind of worst behaviour for some distributions, making them less usable than the other unstable sorters in this test case.

Unstable sort for std::deque with 10⁶ int

The results for std::deque are close to those for std::vector. The most notable difference was that heap_sorter and smooth_sorter were even slower, so I didn't include them for readability. Apparently, random-access has a noticeable cost when the iterators are not contiguous. Otherwise, the conclusions are pretty mmuch the same than with std::vector: when the distribution is truly random, spread_sorter is still the clear winner, while verge_sorter beats distrubtions where big chunks are already sorted. pdq_sorter is still the best match when a small memory footprint is required.

Bidirectional iterables

TODO

Forward iterables

TODO

Clone this wiki locally