Commit 253631d

Rollup merge of #82094 - gilescope:to_digit_speedup2, r=m-ou-se
To digit simplification

I found out the other day that the low four bits of every ASCII digit hold the digit's value, as one would hope (e.g. char `2` ends in `0b0010`). Two further bits mark the digit range (`0b0011_0000`). For a true digit, all the higher bits aside from these two are 0 (as ASCII is the lowest part of the Unicode u32 spectrum). So XORing with `0b11_0000` should give us either the number 0-9 or, alternatively, a larger number in the u32 space. Anything that's not 0-9 will be discarded, as it will be at least as large as the radix.

The code seems so fast, though, that there's quite a lot of noise in the benchmarks, so it's not easy to prove conclusively that it's faster as well as fewer instructions.

I was toying with the non-fast path as well, wondering if we could do this, as then we'd only have one return and fewer instructions still:

```
match self {
    'a'..='z' => self as u32 - 'a' as u32 + 10,
    'A'..='Z' => self as u32 - 'A' as u32 + 10,
    _ => { radix = 10; self as u32 ^ ASCII_DIGIT_MASK },
}
```

Here's the [godbolt](https://godbolt.org/z/883c9n).

(H/T to @byteshadow for pointing out xor was what I needed)
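The XOR idea from the message can be tried in isolation. The following is only a sketch for the radix <= 10 case: the function name `xor_to_digit` is hypothetical, and the value of `ASCII_DIGIT_MASK` is assumed to be the `0b11_0000` mentioned above. It is not the code that was merged.

```rust
// Assumed value of the mask named in the commit message: the two bits
// that mark the ASCII digit range '0'..='9' (0x30..=0x39).
const ASCII_DIGIT_MASK: u32 = 0b11_0000;

/// Hypothetical helper, not part of `core`: decimal-or-smaller digit
/// conversion using XOR instead of a range match.
fn xor_to_digit(c: char, radix: u32) -> Option<u32> {
    assert!(radix <= 10, "this sketch only covers radix <= 10");
    // For '0'..='9' the XOR clears the two marker bits and leaves 0..=9.
    // For any other char at least one other bit stays set, so the result
    // is 10 or larger and fails the comparison below.
    let val = (c as u32) ^ ASCII_DIGIT_MASK;
    if val < radix { Some(val) } else { None }
}
```

For example, `xor_to_digit('2', 10)` yields `Some(2)`, while `':'` (the code point right after `'9'`) maps to 10 and is rejected, and `'8'` is rejected for radix 8.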
2 parents ec00784 + d2ba68b commit 253631d

1 file changed (+5 -7 lines changed)

library/core/src/char/methods.rs

+5-7
@@ -1,5 +1,6 @@
 //! impl char {}

+use crate::intrinsics::likely;
 use crate::slice;
 use crate::str::from_utf8_unchecked_mut;
 use crate::unicode::printable::is_printable;
@@ -330,16 +331,13 @@ impl char {
     #[stable(feature = "rust1", since = "1.0.0")]
     #[inline]
     pub fn to_digit(self, radix: u32) -> Option<u32> {
+        assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");
         // the code is split up here to improve execution speed for cases where
         // the `radix` is constant and 10 or smaller
-        let val = if radix <= 10 {
-            match self {
-                '0'..='9' => self as u32 - '0' as u32,
-                _ => return None,
-            }
+        let val = if likely(radix <= 10) {
+            // If not a digit, a number greater than radix will be created.
+            (self as u32).wrapping_sub('0' as u32)
         } else {
-            assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");
-
             match self {
                 '0'..='9' => self as u32 - '0' as u32,
                 'a'..='z' => self as u32 - 'a' as u32 + 10,
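The diff above is cut off before the end of the function, so here is a standalone sketch of how the changed fast path behaves on stable Rust. Assumptions: the free function name `to_digit_sketch` is hypothetical, the `likely` intrinsic is dropped because it is internal to `core`, and the `'A'..='Z'` arm plus the final radix check are filled in from the documented behaviour of `char::to_digit` rather than from the hunk shown.

```rust
/// Hypothetical stand-in for the patched `char::to_digit`, written as a
/// free function so it compiles outside of `core`.
fn to_digit_sketch(c: char, radix: u32) -> Option<u32> {
    assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");
    let val = if radix <= 10 {
        // Fast path from the diff: chars below '0' wrap around to a huge
        // u32, and chars above '9' give 10 or more, so neither passes the
        // `val < radix` check at the end.
        (c as u32).wrapping_sub('0' as u32)
    } else {
        match c {
            '0'..='9' => c as u32 - '0' as u32,
            'a'..='z' => c as u32 - 'a' as u32 + 10,
            // Assumed arm: the diff is truncated before this point.
            'A'..='Z' => c as u32 - 'A' as u32 + 10,
            _ => return None,
        }
    };
    // Assumed tail: a single comparison rejects out-of-range values.
    if val < radix { Some(val) } else { None }
}
```

The point of the design is that the fast path no longer needs a range match or an early return: a non-digit simply produces a value that the closing comparison rejects.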
