Rollup of 10 pull requests #28476

Merged: 70 commits into rust-lang:master on Sep 17, 2015

id4ho and others added 30 commits September 6, 2015 16:23
I'd love to have any tips about highlights and lang stuff I missed. Sadly, this needs to be merged *tomorrow*.

[Rendered](https://github.com/brson/rust/blob/relnotes/RELEASES.md)
This could be a [breaking-change] if your lint or syntax extension (is that even possible?) uses HIR attributes or literals.
Knowing the result of equality comparison can enable additional
optimizations in LLVM.

Additionally, this makes it obvious that `partial_cmp` on totally
ordered types cannot return `None`.
Reusing the same idea as in rust-lang#26884, we can exploit the fact that the
length of slices is known, hence we can use a counted loop instead of
iterators, which means that we only need a single counter, instead of
having to increment and check one pointer for each iterator.

Using the generic implementation of the boolean comparison operators
(`lt`, `le`, `gt`, `ge`) provides further speedup for simple
types. This happens because the loop scans elements checking for
equality and dispatches to element comparison or length comparison
depending on the result of the prefix comparison.

Before the change:

```
test u8_cmp          ... bench:      14,043 ns/iter (+/- 1,732)
test u8_lt           ... bench:      16,156 ns/iter (+/- 1,864)
test u8_partial_cmp  ... bench:      16,250 ns/iter (+/- 2,608)
test u16_cmp         ... bench:      15,764 ns/iter (+/- 1,420)
test u16_lt          ... bench:      19,833 ns/iter (+/- 2,826)
test u16_partial_cmp ... bench:      19,811 ns/iter (+/- 2,240)
test u32_cmp         ... bench:      15,792 ns/iter (+/- 3,409)
test u32_lt          ... bench:      18,577 ns/iter (+/- 2,075)
test u32_partial_cmp ... bench:      18,603 ns/iter (+/- 5,666)
test u64_cmp         ... bench:      16,337 ns/iter (+/- 2,511)
test u64_lt          ... bench:      18,074 ns/iter (+/- 7,914)
test u64_partial_cmp ... bench:      17,909 ns/iter (+/- 1,105)
```

After the change:

```
test u8_cmp          ... bench:       6,511 ns/iter (+/- 982)
test u8_lt           ... bench:       6,671 ns/iter (+/- 919)
test u8_partial_cmp  ... bench:       7,118 ns/iter (+/- 1,623)
test u16_cmp         ... bench:       6,689 ns/iter (+/- 921)
test u16_lt          ... bench:       6,712 ns/iter (+/- 947)
test u16_partial_cmp ... bench:       6,725 ns/iter (+/- 780)
test u32_cmp         ... bench:       7,704 ns/iter (+/- 1,294)
test u32_lt          ... bench:       7,611 ns/iter (+/- 3,062)
test u32_partial_cmp ... bench:       7,640 ns/iter (+/- 1,149)
test u64_cmp         ... bench:       7,517 ns/iter (+/- 2,164)
test u64_lt          ... bench:       7,579 ns/iter (+/- 1,048)
test u64_partial_cmp ... bench:       7,629 ns/iter (+/- 1,195)
```
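
The counted-loop idea in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: `compare_counted` is a hypothetical name, and it assumes elements that implement `Ord`.

```rust
use std::cmp::{self, Ordering};

/// Lexicographic comparison of two slices driven by one counted loop.
/// Hypothetical helper for illustration; not the standard library code.
fn compare_counted<T: Ord>(lhs: &[T], rhs: &[T]) -> Ordering {
    // Only the common prefix can contain a deciding element.
    let len = cmp::min(lhs.len(), rhs.len());

    // A single index walks both slices, instead of two iterators that
    // each advance and check their own pointer.
    for i in 0..len {
        match lhs[i].cmp(&rhs[i]) {
            Ordering::Equal => {}
            non_eq => return non_eq,
        }
    }

    // The common prefix is equal, so the lengths decide the result.
    lhs.len().cmp(&rhs.len())
}
```

The loop only checks for the first non-equal pair; the final length comparison handles the case where one slice is a prefix of the other.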
Instead of manually defining it, `partial_cmp` can simply wrap the
result of `cmp` for totally ordered types.
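
As a minimal sketch of that pattern, a totally ordered type can implement `Ord` once and let `partial_cmp` delegate to it. The `Version` type and its fields are hypothetical and exist only for illustration.

```rust
use std::cmp::Ordering;

/// Hypothetical totally ordered type, used only for illustration.
#[derive(Debug, PartialEq, Eq)]
struct Version {
    major: u32,
    minor: u32,
}

impl Ord for Version {
    fn cmp(&self, other: &Version) -> Ordering {
        // Compare `major` first and fall back to `minor` on a tie.
        match self.major.cmp(&other.major) {
            Ordering::Equal => self.minor.cmp(&other.minor),
            non_eq => non_eq,
        }
    }
}

impl PartialOrd for Version {
    fn partial_cmp(&self, other: &Version) -> Option<Ordering> {
        // A total order always has an answer, so this is never `None`.
        Some(self.cmp(other))
    }
}
```

Written this way, the two implementations cannot disagree, and the `Some(..)` wrapper makes it visible that `None` is unreachable.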
In order to get rid of all range checks, the compiler needs to
explicitly see that the slices it iterates over are as long as the
loop variable upper bound.

This further improves the performance of slice comparison:

```
test u8_cmp          ... bench:       4,761 ns/iter (+/- 1,203)
test u8_lt           ... bench:       4,579 ns/iter (+/- 649)
test u8_partial_cmp  ... bench:       4,768 ns/iter (+/- 761)
test u16_cmp         ... bench:       4,607 ns/iter (+/- 580)
test u16_lt          ... bench:       4,681 ns/iter (+/- 567)
test u16_partial_cmp ... bench:       4,607 ns/iter (+/- 967)
test u32_cmp         ... bench:       4,448 ns/iter (+/- 891)
test u32_lt          ... bench:       4,546 ns/iter (+/- 992)
test u32_partial_cmp ... bench:       4,415 ns/iter (+/- 646)
test u64_cmp         ... bench:       4,380 ns/iter (+/- 1,184)
test u64_lt          ... bench:       5,684 ns/iter (+/- 602)
test u64_partial_cmp ... bench:       4,663 ns/iter (+/- 1,158)
```
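
One way to make the slice lengths explicit to the compiler, as the message above describes, is to re-slice both inputs to the common length before the loop. The sketch below assumes that approach; `compare_resliced` is a hypothetical name, and whether the range checks actually disappear depends on the optimizer.

```rust
use std::cmp::{self, Ordering};

/// Lexicographic comparison that re-slices both inputs to the common length.
/// Hypothetical helper for illustration; not the standard library code.
fn compare_resliced<T: Ord>(lhs: &[T], rhs: &[T]) -> Ordering {
    let len = cmp::min(lhs.len(), rhs.len());

    // After re-slicing, both `l` and `r` have exactly `len` elements, so
    // every index in `0..len` is provably in bounds for both of them.
    let l = &lhs[..len];
    let r = &rhs[..len];

    for i in 0..len {
        match l[i].cmp(&r[i]) {
            Ordering::Equal => {}
            non_eq => return non_eq,
        }
    }

    // Equal common prefix: the original lengths decide the result.
    lhs.len().cmp(&rhs.len())
}
```

The two slicing operations each carry one bounds check up front, in exchange for giving the optimizer enough information to drop the per-element checks inside the loop.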
Be more conservative with inlining.
bors and others added 21 commits September 17, 2015 10:11
Currently, we generate adjustments even when they are no-ops, for example
to get from `&[u8]` to `&[u8]`. This is unneeded and kicks us out of
`trans_into()` into `trans()`, which means an additional stack slot and
copy in the unoptimized code.

The type `HANDLE` is defined on Windows as `PVOID`. The test `run-pass/x86stdcall2` defined it as `u32`, which caused an access violation in the `catch_panic` routine at the line:

```
try!(unwind::try(move || *result = Some(f())))
```

The original failure is as follows:

```
---- [run-pass] run-pass/x86stdcall2.rs stdout ----

error: test run failed!
status: exit code: -1073741819
command: PATH="x86_64-pc-windows-msvc/stage2/bin/rustlib/x86_64-pc-windows-msvc/lib;D:\Sources\Rust\x86_64-pc-windows-msvc\stage2\bin;C:\MSYS2\mingw64\bin;C:\MSYS2\usr\local\bin;C:\MSYS2\usr\bin;C:\MSYS2\usr\bin;C:\Program Files\Python 3;C:\Program Files\Python 3\Scripts;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0;C:\Program Files (x86)\Windows Kits\8.1\Windows Performance Toolkit;C:\Program Files\SlikSvn\bin;C:\Program Files\System Tools;C:\Program Files (x86)\System Tools;C:\Program Files\Vim\vim74;C:\Program Files\Rust\bin;C:\Program Files\Microsoft\Web Platform Installer;C:\Program Files\MiKTeX\miktex\bin\x64;C:\Program Files (x86)\Pandoc;C:\Program Files\LLVM\bin;C:\Program Files\KDiff3;C:\Program Files\Git\cmd;C:\Users\Vitali\AppData\Local\atom\bin;C:\MSYS2\usr\bin\site_perl;C:\MSYS2\usr\bin\vendor_perl;C:\MSYS2\usr\bin\core_perl" x86_64-pc-windows-msvc/test/run-pass\x86stdcall2.stage2-x86_64-pc-windows-msvc.exe
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

------------------------------------------

thread '[run-pass] run-pass/x86stdcall2.rs' panicked at 'explicit panic', D:/Sources/Rust/src/compiletest\runtest.rs:1501
```

P.S. I compiled Rust for `x86_64-pc-windows-msvc`.
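
To illustrate the mismatch described above, the sketch below contrasts a pointer-sized `HANDLE` alias with the `u32` definition, and decodes the exit code from the log. The helper names are hypothetical and this is not the test's actual code. Reinterpreted as an unsigned value, -1073741819 is 0xC0000005, the NTSTATUS code for an access violation.

```rust
use std::mem::size_of;
use std::os::raw::c_void;

/// Mirrors the Windows definition: `HANDLE` is `PVOID`, a pointer-sized value.
#[allow(non_camel_case_types)]
type HANDLE = *mut c_void;

/// Reinterprets a signed process exit code as an unsigned NTSTATUS value.
/// Hypothetical helper, for illustration only.
fn exit_code_as_ntstatus(code: i32) -> u32 {
    code as u32
}

/// True when storing a handle in a `u32` would drop bits on this target,
/// which is the case on 64-bit targets such as `x86_64-pc-windows-msvc`.
fn u32_truncates_handle() -> bool {
    size_of::<u32>() < size_of::<HANDLE>()
}
```

On a 64-bit target `u32_truncates_handle()` returns `true`, which is why an FFI declaration using `u32` for a handle does not match the real signature there.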
It's clear it's the one being documented
Using "later" in this context makes more sense than "greater", so it has been changed to match the Linux requirement above it rather than the other way around.
@steveklabnik

@bors: r+ p=1

@bors

bors commented Sep 17, 2015

📌 Commit 5faff5d has been approved by steveklabnik

@rust-highfive

r? @huonw

(rust_highfive has picked a reviewer for you, use r? to override)

@bors

bors commented Sep 17, 2015

⌛ Testing commit 5faff5d with merge cff0411...

bors added a commit that referenced this pull request Sep 17, 2015
@bors bors merged commit 5faff5d into rust-lang:master Sep 17, 2015
@Centril Centril added the rollup A PR which is a rollup label Oct 2, 2019