Skip to content

Only convert chars when needed in str case conversion methods. #52061

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

Pazzaz
Copy link
Contributor

@Pazzaz Pazzaz commented Jul 4, 2018

Optimizes str.to_lowercase() and str.to_uppercase() by appending slices of correct characters to the new string instead of inserting them individually.

I benched these changes (file) with these simple test cases:


ALL_LOWER:       "loremipsumdolorsitametduosensibusmnesarchumabcdefgh"
ALL_UPPER:       "LOREMIPSUMDOLORSITAMETDUOSENSIBUSMNESARCHUMABCDEFGH"
REALISTIC_UPPER: "LOREM IPSUM DOLOR SIT AMET, DUO SENSIBUS MNESARCHUM"
SIGMAS:          "ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣ ΣΣΣΣ, ΣΣΣ ΣΣΣΣΣΣΣΣ ΣΣΣΣΣΣΣΣΣΣ" // to_lowercase() special case
WORD_UPPER:      "Lorem Ipsum Dolor Sit Amet, Duo Sensibus Mnesarchum";

These were the results of the change to to_uppercase():

running 10 tests
test tests::all_lower_uppercase           ... bench:       1,707 ns/iter (+/- 24)
test tests::all_lower_uppercase_new       ... bench:       1,753 ns/iter (+/- 35)
test tests::all_upper_uppercase           ... bench:       1,631 ns/iter (+/- 26)
test tests::all_upper_uppercase_new       ... bench:         145 ns/iter (+/- 10)
test tests::realistic_upper_uppercase     ... bench:       1,629 ns/iter (+/- 34)
test tests::realistic_upper_uppercase_new ... bench:         145 ns/iter (+/- 1)
test tests::sigmas_uppercase              ... bench:       1,876 ns/iter (+/- 57)
test tests::sigmas_uppercase_new          ... bench:         322 ns/iter (+/- 8)
test tests::word_upper_uppercase          ... bench:       1,868 ns/iter (+/- 34)
test tests::word_upper_uppercase_new      ... bench:       1,332 ns/iter (+/- 34)

And these were the results of the change to to_lowercase():

running 10 tests
test tests::all_lower_lowercase           ... bench:       1,591 ns/iter (+/- 53)
test tests::all_lower_lowercase_new       ... bench:         143 ns/iter (+/- 4)
test tests::all_upper_lowercase           ... bench:       1,688 ns/iter (+/- 18)
test tests::all_upper_lowercase_new       ... bench:       1,760 ns/iter (+/- 34)
test tests::realistic_upper_lowercase     ... bench:       1,858 ns/iter (+/- 45)
test tests::realistic_upper_lowercase_new ... bench:       1,585 ns/iter (+/- 22)
test tests::sigmas_lowercase              ... bench:       1,196 ns/iter (+/- 33)
test tests::sigmas_lowercase_new          ... bench:       1,095 ns/iter (+/- 45)
test tests::word_upper_lowercase          ... bench:       1,756 ns/iter (+/- 49)
test tests::word_upper_lowercase_new      ... bench:         519 ns/iter (+/- 21)

So an improvement across the board, except for, all_upper_lowercase and all_lower_uppercase, where every single character had to be converted. I think that's a fair trade considering the gain in performance is pretty large in many other cases.

@rust-highfive
Copy link
Contributor

r? @cramertj

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 4, 2018
@rust-highfive
Copy link
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[00:03:35]    Compiling panic_abort v0.0.0 (file:///checkout/src/libpanic_abort)
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:371:47
[00:03:36]     |
[00:03:36] 371 |                         s.push_str((lower..i).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:390:44
[00:03:36]     |
[00:03:36]     |
[00:03:36] 390 |             s.push_str((lower..self.len()).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:451:47
[00:03:36]     |
[00:03:36]     |
[00:03:36] 451 |                         s.push_str((lower..i).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:461:44
[00:03:36]     |
[00:03:36]     |
[00:03:36] 461 |             s.push_str((lower..self.len()).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:37] error: aborting due to 4 previous errors
[00:03:37] 
[00:03:37] For more information about this error, try `rustc --explain E0658`.
[00:03:37] error: Could not compile `alloc`.
[00:03:37] error: Could not compile `alloc`.
[00:03:37] 
[00:03:37] Caused by:
[00:03:37]   process didn't exit successfully: `/checkout/obj/build/bootstrap/debug/rustc --crate-name alloc liballoc/lib.rs --color always --error-format json --crate-type lib --emit=dep-info,link -C opt-level=2 -C metadata=dab08365760adda8 -C extra-filename=-dab08365760adda8 --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps --target x86_64-unknown-linux-gnu -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/release/deps --extern compiler_builtins=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/libcompiler_builtins-8c383f2d2bbbcb1d.rlib --extern core=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/libcore-c8f7a210a5d40dd6.rlib -L native=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/build/compiler_builtins-c2bb2884be27c6a0/out` (exit code: 101)
[00:03:37] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "--message-format" "json"
[00:03:37] expected success, got: exit code: 101
[00:03:37] thread 'main' panicked at 'cargo must succeed', bootstrap/compile.rs:1114:9
[00:03:37] travis_fold:end:stage0-std

[00:03:37] travis_time:end:stage0-std:start=1530734845838641966,finish=1530734877917793046,duration=32079151080


[00:03:37] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:03:37] Build completed unsuccessfully in 0:00:33
[00:03:37] Makefile:79: recipe for target 'tidy' failed
[00:03:37] make: *** [tidy] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:2c69bc28
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
---
travis_time:end:103b1194:start=1530734878481067013,finish=1530734878490020612,duration=8953599
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:138573e4
$ head -30 ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
head: cannot open ‘./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers’ for reading: No such file or directory
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:21845564
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive
Copy link
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

[00:04:27] travis_fold:start:tidy
travis_time:start:tidy
tidy check
[00:04:27] tidy error: /checkout/src/liballoc/str.rs:384: trailing whitespace
[00:04:29] some tidy checks failed
[00:04:29] 
[00:04:29] 
[00:04:29] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor" "--quiet"
[00:04:29] 
[00:04:29] 
[00:04:29] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:04:29] Build completed unsuccessfully in 0:01:37
[00:04:29] Build completed unsuccessfully in 0:01:37
[00:04:29] make: *** [tidy] Error 1
[00:04:29] Makefile:79: recipe for target 'tidy' failed

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:2da68b40
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
---
travis_time:end:071d5a86:start=1530735627083918356,finish=1530735627092132728,duration=8214372
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:18913710
$ head -30 ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
head: cannot open ‘./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers’ for reading: No such file or directory
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:05591e62
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive
Copy link
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[01:07:17] ....................................................................................................
[01:07:17] thread '<unnamed>' panicked at 'explicit panic', liballoc/../liballoc/tests/slice.rs:1315:17
[01:07:17] ....................................................................................................
[01:07:17] ....................................................................................................
[01:07:17] .....................................................................FF.............................
[01:07:20] ..................................
[01:07:20] failures:
[01:07:20] 
[01:07:20] ---- str::to_lowercase stdout ----
[01:07:20] ---- str::to_lowercase stdout ----
[01:07:20] thread 'str::to_lowercase' panicked at 'assertion failed: `(left == right)`
[01:07:20]   left: `"aéDžaé "`,
[01:07:20]  right: `"aédžaé "`', liballoc/../liballoc/tests/str.rs:1534:5
[01:07:20] ---- str::to_uppercase stdout ----
[01:07:20] thread 'str::to_uppercase' panicked at 'assertion failed: `(left == right)`
[01:07:20] thread 'str::to_uppercase' panicked at 'assertion failed: `(left == right)`
[01:07:20]   left: `"AÉDžSSFIἈΙ"`,
[01:07:20]  right: `"AÉDŽSSFIἈΙ"`', liballoc/../liballoc/tests/str.rs:1567:5
[01:07:20] 
[01:07:20] failures:
[01:07:20]     str::to_lowercase
[01:07:20]     str::to_uppercase
[01:07:20]     str::to_uppercase
[01:07:20] 
[01:07:20] test result: FAILED. 532 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out
[01:07:20] 
[01:07:20] error: test failed, to rerun pass '--test collectionstests'
[01:07:20] 
[01:07:20] 
[01:07:20] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "alloc" "--" "--quiet"
[01:07:20] 
[01:07:20] 
[01:07:20] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:07:20] Build completed unsuccessfully in 0:25:18
[01:07:20] Build completed unsuccessfully in 0:25:18
[01:07:20] Makefile:58: recipe for target 'check' failed
[01:07:20] make: *** [check] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:314f20b4
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Pazzaz
Copy link
Contributor Author

Pazzaz commented Jul 4, 2018

Some of my assumptions seem to have been wrong. I really should have run these tests locally first 😓
EDIT: I'll close this for now, may come back to it later.

@Pazzaz Pazzaz closed this Jul 5, 2018
kennytm added a commit to kennytm/rust that referenced this pull request Jul 18, 2018
Handle array manually in str case conversion methods

Avoiding the overhead incurred from `String.extend(char.to_lowercase())` showed a notable performance improvement when I benchmarked it.

I tested on these strings:
```rust
ALL_LOWER:       "loremipsumdolorsitametduosensibusmnesarchumabcdefgh"
ALL_UPPER:       "LOREMIPSUMDOLORSITAMETDUOSENSIBUSMNESARCHUMABCDEFGH"
REALISTIC_UPPER: "LOREM IPSUM DOLOR SIT AMET, DUO SENSIBUS MNESARCHUM"
SIGMAS:          "ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣ ΣΣΣΣ, ΣΣΣ ΣΣΣΣΣΣΣΣ ΣΣΣΣΣΣΣΣΣΣ"
WORD_UPPER:      "Lorem Ipsum Dolor Sit Amet, Duo Sensibus Mnesarchum"
```
the performance improvements of `to_lowercase()` were
```
running 10 tests
test tests::all_lower           ... bench:       1,752 ns/iter (+/- 49)
test tests::all_lower_new       ... bench:       1,266 ns/iter (+/- 15)   -28%
test tests::all_upper           ... bench:       1,832 ns/iter (+/- 39)
test tests::all_upper_new       ... bench:       1,337 ns/iter (+/- 18)   -27%
test tests::realistic_upper     ... bench:       1,993 ns/iter (+/- 14)
test tests::realistic_upper_new ... bench:       1,445 ns/iter (+/- 22)   -27%
test tests::sigmas              ... bench:       1,342 ns/iter (+/- 39)
test tests::sigmas_new          ... bench:       1,226 ns/iter (+/- 16)    -9%
test tests::word_upper          ... bench:       1,899 ns/iter (+/- 12)
test tests::word_upper_new      ... bench:       1,381 ns/iter (+/- 26)   -27%
```
and of `to_uppercase()`
```
running 10 tests
test tests::all_lower           ... bench:       1,813 ns/iter (+/- 20)
test tests::all_lower_new       ... bench:       1,321 ns/iter (+/- 16)   -27%
test tests::all_upper           ... bench:       1,629 ns/iter (+/- 22)
test tests::all_upper_new       ... bench:       1,241 ns/iter (+/- 9)    -24%
test tests::realistic_upper     ... bench:       1,670 ns/iter (+/- 24)
test tests::realistic_upper_new ... bench:       1,241 ns/iter (+/- 17)   -26%
test tests::sigmas              ... bench:       2,053 ns/iter (+/- 20)
test tests::sigmas_new          ... bench:       1,753 ns/iter (+/- 23)   -15%
test tests::word_upper          ... bench:       1,873 ns/iter (+/- 30)
test tests::word_upper_new      ... bench:       1,412 ns/iter (+/- 25)   -25%
```
I gave up on the more advanced method from rust-lang#52061 as it wasn't always a clear improvement and would help in even less cases if this PR was merged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants