Only convert chars when needed in str case conversion methods. #52061

Pazzaz · 2018-07-04T20:02:36Z

Optimizes str.to_lowercase() and str.to_uppercase() by appending slices of correct characters to the new string instead of inserting them individually.

I benched these changes (file) with these simple test cases:


ALL_LOWER:       "loremipsumdolorsitametduosensibusmnesarchumabcdefgh"
ALL_UPPER:       "LOREMIPSUMDOLORSITAMETDUOSENSIBUSMNESARCHUMABCDEFGH"
REALISTIC_UPPER: "LOREM IPSUM DOLOR SIT AMET, DUO SENSIBUS MNESARCHUM"
SIGMAS:          "ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣ ΣΣΣΣ, ΣΣΣ ΣΣΣΣΣΣΣΣ ΣΣΣΣΣΣΣΣΣΣ" // to_lowercase() special case
WORD_UPPER:      "Lorem Ipsum Dolor Sit Amet, Duo Sensibus Mnesarchum";

These were the results of the change to to_uppercase():

running 10 tests
test tests::all_lower_uppercase           ... bench:       1,707 ns/iter (+/- 24)
test tests::all_lower_uppercase_new       ... bench:       1,753 ns/iter (+/- 35)
test tests::all_upper_uppercase           ... bench:       1,631 ns/iter (+/- 26)
test tests::all_upper_uppercase_new       ... bench:         145 ns/iter (+/- 10)
test tests::realistic_upper_uppercase     ... bench:       1,629 ns/iter (+/- 34)
test tests::realistic_upper_uppercase_new ... bench:         145 ns/iter (+/- 1)
test tests::sigmas_uppercase              ... bench:       1,876 ns/iter (+/- 57)
test tests::sigmas_uppercase_new          ... bench:         322 ns/iter (+/- 8)
test tests::word_upper_uppercase          ... bench:       1,868 ns/iter (+/- 34)
test tests::word_upper_uppercase_new      ... bench:       1,332 ns/iter (+/- 34)

And these were the results of the change to to_lowercase():

running 10 tests
test tests::all_lower_lowercase           ... bench:       1,591 ns/iter (+/- 53)
test tests::all_lower_lowercase_new       ... bench:         143 ns/iter (+/- 4)
test tests::all_upper_lowercase           ... bench:       1,688 ns/iter (+/- 18)
test tests::all_upper_lowercase_new       ... bench:       1,760 ns/iter (+/- 34)
test tests::realistic_upper_lowercase     ... bench:       1,858 ns/iter (+/- 45)
test tests::realistic_upper_lowercase_new ... bench:       1,585 ns/iter (+/- 22)
test tests::sigmas_lowercase              ... bench:       1,196 ns/iter (+/- 33)
test tests::sigmas_lowercase_new          ... bench:       1,095 ns/iter (+/- 45)
test tests::word_upper_lowercase          ... bench:       1,756 ns/iter (+/- 49)
test tests::word_upper_lowercase_new      ... bench:         519 ns/iter (+/- 21)

So an improvement across the board, except for, all_upper_lowercase and all_lower_uppercase, where every single character had to be converted. I think that's a fair trade considering the gain in performance is pretty large in many other cases.

rust-highfive · 2018-07-04T20:02:39Z

r? @cramertj

(rust_highfive has picked a reviewer for you, use r? to override)

rust-highfive · 2018-07-04T20:08:36Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

[00:03:35]    Compiling panic_abort v0.0.0 (file:///checkout/src/libpanic_abort)
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:371:47
[00:03:36]     |
[00:03:36] 371 |                         s.push_str((lower..i).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:390:44
[00:03:36]     |
[00:03:36]     |
[00:03:36] 390 |             s.push_str((lower..self.len()).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:451:47
[00:03:36]     |
[00:03:36]     |
[00:03:36] 451 |                         s.push_str((lower..i).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:36] error[E0658]: use of unstable library feature 'slice_index_methods'
[00:03:36]    --> liballoc/str.rs:461:44
[00:03:36]     |
[00:03:36]     |
[00:03:36] 461 |             s.push_str((lower..self.len()).get_unchecked(self));
[00:03:36]     |
[00:03:36]     |
[00:03:36]     = help: add #![feature(slice_index_methods)] to the crate attributes to enable
[00:03:37] error: aborting due to 4 previous errors
[00:03:37] 
[00:03:37] For more information about this error, try `rustc --explain E0658`.
[00:03:37] error: Could not compile `alloc`.
[00:03:37] error: Could not compile `alloc`.
[00:03:37] 
[00:03:37] Caused by:
[00:03:37]   process didn't exit successfully: `/checkout/obj/build/bootstrap/debug/rustc --crate-name alloc liballoc/lib.rs --color always --error-format json --crate-type lib --emit=dep-info,link -C opt-level=2 -C metadata=dab08365760adda8 -C extra-filename=-dab08365760adda8 --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps --target x86_64-unknown-linux-gnu -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/release/deps --extern compiler_builtins=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/libcompiler_builtins-8c383f2d2bbbcb1d.rlib --extern core=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/libcore-c8f7a210a5d40dd6.rlib -L native=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/build/compiler_builtins-c2bb2884be27c6a0/out` (exit code: 101)
[00:03:37] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "--message-format" "json"
[00:03:37] expected success, got: exit code: 101
[00:03:37] thread 'main' panicked at 'cargo must succeed', bootstrap/compile.rs:1114:9
[00:03:37] travis_fold:end:stage0-std

[00:03:37] travis_time:end:stage0-std:start=1530734845838641966,finish=1530734877917793046,duration=32079151080


[00:03:37] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:03:37] Build completed unsuccessfully in 0:00:33
[00:03:37] Makefile:79: recipe for target 'tidy' failed
[00:03:37] make: *** [tidy] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:2c69bc28
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
---
travis_time:end:103b1194:start=1530734878481067013,finish=1530734878490020612,duration=8953599
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:138573e4
$ head -30 ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
head: cannot open ‘./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers’ for reading: No such file or directory
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:21845564
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

rust-highfive · 2018-07-04T20:28:44Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.


[00:04:27] travis_fold:start:tidy
travis_time:start:tidy
tidy check
[00:04:27] tidy error: /checkout/src/liballoc/str.rs:384: trailing whitespace
[00:04:29] some tidy checks failed
[00:04:29] 
[00:04:29] 
[00:04:29] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor" "--quiet"
[00:04:29] 
[00:04:29] 
[00:04:29] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:04:29] Build completed unsuccessfully in 0:01:37
[00:04:29] Build completed unsuccessfully in 0:01:37
[00:04:29] make: *** [tidy] Error 1
[00:04:29] Makefile:79: recipe for target 'tidy' failed

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:2da68b40
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
---
travis_time:end:071d5a86:start=1530735627083918356,finish=1530735627092132728,duration=8214372
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:18913710
$ head -30 ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
head: cannot open ‘./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers’ for reading: No such file or directory
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:05591e62
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

rust-highfive · 2018-07-04T21:39:15Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

[01:07:17] ....................................................................................................
[01:07:17] thread '<unnamed>' panicked at 'explicit panic', liballoc/../liballoc/tests/slice.rs:1315:17
[01:07:17] ....................................................................................................
[01:07:17] ....................................................................................................
[01:07:17] .....................................................................FF.............................
[01:07:20] ..................................
[01:07:20] failures:
[01:07:20] 
[01:07:20] ---- str::to_lowercase stdout ----
[01:07:20] ---- str::to_lowercase stdout ----
[01:07:20] thread 'str::to_lowercase' panicked at 'assertion failed: `(left == right)`
[01:07:20]   left: `"aéǅaé "`,
[01:07:20]  right: `"aéǆaé "`', liballoc/../liballoc/tests/str.rs:1534:5
[01:07:20] ---- str::to_uppercase stdout ----
[01:07:20] thread 'str::to_uppercase' panicked at 'assertion failed: `(left == right)`
[01:07:20] thread 'str::to_uppercase' panicked at 'assertion failed: `(left == right)`
[01:07:20]   left: `"AÉǅSSFIἈΙ"`,
[01:07:20]  right: `"AÉǄSSFIἈΙ"`', liballoc/../liballoc/tests/str.rs:1567:5
[01:07:20] 
[01:07:20] failures:
[01:07:20]     str::to_lowercase
[01:07:20]     str::to_uppercase
[01:07:20]     str::to_uppercase
[01:07:20] 
[01:07:20] test result: FAILED. 532 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out
[01:07:20] 
[01:07:20] error: test failed, to rerun pass '--test collectionstests'
[01:07:20] 
[01:07:20] 
[01:07:20] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "alloc" "--" "--quiet"
[01:07:20] 
[01:07:20] 
[01:07:20] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:07:20] Build completed unsuccessfully in 0:25:18
[01:07:20] Build completed unsuccessfully in 0:25:18
[01:07:20] Makefile:58: recipe for target 'check' failed
[01:07:20] make: *** [check] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:314f20b4
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Pazzaz · 2018-07-04T21:53:06Z

Some of my assumptions seem to have been wrong. I really should have run these tests locally first 😓
EDIT: I'll close this for now, may come back to it later.

Handle array manually in str case conversion methods Avoiding the overhead incurred from `String.extend(char.to_lowercase())` showed a notable performance improvement when I benchmarked it. I tested on these strings: ```rust ALL_LOWER: "loremipsumdolorsitametduosensibusmnesarchumabcdefgh" ALL_UPPER: "LOREMIPSUMDOLORSITAMETDUOSENSIBUSMNESARCHUMABCDEFGH" REALISTIC_UPPER: "LOREM IPSUM DOLOR SIT AMET, DUO SENSIBUS MNESARCHUM" SIGMAS: "ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣΣΣ ΣΣΣ ΣΣΣΣ, ΣΣΣ ΣΣΣΣΣΣΣΣ ΣΣΣΣΣΣΣΣΣΣ" WORD_UPPER: "Lorem Ipsum Dolor Sit Amet, Duo Sensibus Mnesarchum" ``` the performance improvements of `to_lowercase()` were ``` running 10 tests test tests::all_lower ... bench: 1,752 ns/iter (+/- 49) test tests::all_lower_new ... bench: 1,266 ns/iter (+/- 15) -28% test tests::all_upper ... bench: 1,832 ns/iter (+/- 39) test tests::all_upper_new ... bench: 1,337 ns/iter (+/- 18) -27% test tests::realistic_upper ... bench: 1,993 ns/iter (+/- 14) test tests::realistic_upper_new ... bench: 1,445 ns/iter (+/- 22) -27% test tests::sigmas ... bench: 1,342 ns/iter (+/- 39) test tests::sigmas_new ... bench: 1,226 ns/iter (+/- 16) -9% test tests::word_upper ... bench: 1,899 ns/iter (+/- 12) test tests::word_upper_new ... bench: 1,381 ns/iter (+/- 26) -27% ``` and of `to_uppercase()` ``` running 10 tests test tests::all_lower ... bench: 1,813 ns/iter (+/- 20) test tests::all_lower_new ... bench: 1,321 ns/iter (+/- 16) -27% test tests::all_upper ... bench: 1,629 ns/iter (+/- 22) test tests::all_upper_new ... bench: 1,241 ns/iter (+/- 9) -24% test tests::realistic_upper ... bench: 1,670 ns/iter (+/- 24) test tests::realistic_upper_new ... bench: 1,241 ns/iter (+/- 17) -26% test tests::sigmas ... bench: 2,053 ns/iter (+/- 20) test tests::sigmas_new ... bench: 1,753 ns/iter (+/- 23) -15% test tests::word_upper ... bench: 1,873 ns/iter (+/- 30) test tests::word_upper_new ... bench: 1,412 ns/iter (+/- 25) -25% ``` I gave up on the more advanced method from rust-lang#52061 as it wasn't always a clear improvement and would help in even less cases if this PR was merged.

Only convert chars that need it in str case conversion methods.

a7c0897

rust-highfive assigned cramertj Jul 4, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 4, 2018

Enable feature slice_index_methods in liballoc

3e49134

Trim trailing whitespace

fde8e59

Pazzaz closed this Jul 5, 2018

Pazzaz mentioned this pull request Jul 6, 2018

Handle array manually in str case conversion methods #52116

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Only convert chars when needed in str case conversion methods. #52061

Only convert chars when needed in str case conversion methods. #52061

Pazzaz commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

Pazzaz commented Jul 4, 2018 •

edited

Loading

Only convert chars when needed in str case conversion methods. #52061

Only convert chars when needed in str case conversion methods. #52061

Conversation

Pazzaz commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

rust-highfive commented Jul 4, 2018

Pazzaz commented Jul 4, 2018 • edited Loading

Pazzaz commented Jul 4, 2018 •

edited

Loading