Skip to content

bench: Improve the spectralnorm shootout benchmark #17989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 17, 2014

Conversation

alexcrichton
Copy link
Member

This improves the spectralnorm shootout benchmark through a few vectors after
looking at the leading C implementation:

  • The simd-based f64x2 is now used to parallelize a few computations
  • RWLock usage has been removed. A custom parallel function was added as a
    form of stack-based fork-join parallelism. I found that the contention on the
    locks was high as well as hindering other optimizations.

This does, however, introduce one unsafe block into the benchmarks, which
previously had none.

In terms of timings, the before and after numbers are:

$ time ./shootout-spectralnorm-before
./shootout-spectralnorm-before  2.07s user 0.71s system 324% cpu 0.857 total
$ time ./shootout-spectralnorm-before 5500
./shootout-spectralnorm-before 5500  11.88s user 1.13s system 459% cpu 2.830 total
$ time ./shootout-spectralnorm-after
./shootout-spectralnorm-after  0.58s user 0.01s system 280% cpu 0.210 tota
$ time ./shootout-spectralnorm-after 5500
./shootout-spectralnorm-after 5500  3.55s user 0.01s system 455% cpu 0.783 total

@rust-highfive
Copy link
Contributor

warning Warning warning

  • These commits modify unsafe code. Please review it carefully!

// Executes a closure in parallel over the given mutable slice. The closure `f`
// is run in parallel and yielded the starting index within `v` as well as a
// sub-slice of `v`.
fn parallel<T: Send + Sync>(v: &mut [T], mut f: |uint, &mut [T]|: Sync) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this function is safe in general, because f could mutate something it captured. However, it should be possible to encode the correct bound with unboxed closures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I experimented with a similar rewrite that used unboxed closures, and they worked well enough, i.e. no ICEs etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear, you're right! I thought the Sync found was enough, but I think this needs to take an instance of Fn to ensure there's no &mut captures. I've updated with the unboxed closure equivalent of Fn, thanks!

This improves the spectralnorm shootout benchmark through a few vectors after
looking at the leading C implementation:

* The simd-based f64x2 is now used to parallelize a few computations
* RWLock usage has been removed. A custom `parallel` function was added as a
  form of stack-based fork-join parallelism. I found that the contention on the
  locks was high as well as hindering other optimizations.

This does, however, introduce one `unsafe` block into the benchmarks, which
previously had none.

In terms of timings, the before and after numbers are:

```
$ time ./shootout-spectralnorm-before
./shootout-spectralnorm-before  2.07s user 0.71s system 324% cpu 0.857 total
$ time ./shootout-spectralnorm-before 5500
./shootout-spectralnorm-before 5500  11.88s user 1.13s system 459% cpu 2.830 total
$ time ./shootout-spectralnorm-after
./shootout-spectralnorm-after  0.58s user 0.01s system 280% cpu 0.210 tota
$ time ./shootout-spectralnorm-after 5500
./shootout-spectralnorm-after 5500  3.55s user 0.01s system 455% cpu 0.783 total
```
@TeXitoi
Copy link
Contributor

TeXitoi commented Oct 16, 2014

Great! I'd r+ if I can ;)

@kud1ing
Copy link

kud1ing commented Oct 16, 2014

See also #18085

bors added a commit that referenced this pull request Oct 16, 2014
This improves the spectralnorm shootout benchmark through a few vectors after
looking at the leading C implementation:

* The simd-based f64x2 is now used to parallelize a few computations
* RWLock usage has been removed. A custom `parallel` function was added as a
  form of stack-based fork-join parallelism. I found that the contention on the
  locks was high as well as hindering other optimizations.

This does, however, introduce one `unsafe` block into the benchmarks, which
previously had none.

In terms of timings, the before and after numbers are:

```
$ time ./shootout-spectralnorm-before
./shootout-spectralnorm-before  2.07s user 0.71s system 324% cpu 0.857 total
$ time ./shootout-spectralnorm-before 5500
./shootout-spectralnorm-before 5500  11.88s user 1.13s system 459% cpu 2.830 total
$ time ./shootout-spectralnorm-after
./shootout-spectralnorm-after  0.58s user 0.01s system 280% cpu 0.210 tota
$ time ./shootout-spectralnorm-after 5500
./shootout-spectralnorm-after 5500  3.55s user 0.01s system 455% cpu 0.783 total
```
@bors bors closed this Oct 17, 2014
@bors bors merged commit f7b5470 into rust-lang:master Oct 17, 2014
@alexcrichton alexcrichton deleted the spectralnorm branch November 20, 2014 03:19
lnicola pushed a commit to lnicola/rust that referenced this pull request Sep 25, 2024
…ykril

Provide an option to hide deprecated items from completion

Fixes rust-lang#17989.

I wonder if this should be instead done in the editor, that will do it in a language-agnostic way. Can't hurt to do it in rust-analyzer, I guess.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants