author Guillaume Gomez <guillaume1.gomez@gmail.com> 2021-02-17 20:37:55 +0100
committer GitHub <noreply@github.com> 2021-02-17 20:37:55 +0100
commit 253631d73fdb310a437edff1134caee904e28b94
tree 3d409191546f08dc3bc596c7c158c3cb59fe3598
parent ec007845cfe6a3c54aa44468df9ff2be05fe25b8
parent d2ba68b24eba7e763b5e0937ab1ef6dcb5a09ca3
Rollup merge of #82094 - gilescope:to_digit_speedup2, r=m-ou-se
To digit simplification

I found out the other day that all the ASCII digits have their low four bits set to the digit's value, as one would hope (e.g. char `2` ends in `0b0010`). Two further bits mark the digit range (`0b0011_0000`). If the char is a true digit, then every higher bit aside from these two is 0 (as ASCII is the lowest part of the Unicode u32 spectrum). So XORing with `0b11_0000` gives either a number 0-9 or, alternatively, a larger number in the u32 space. Anything that is not 0-9 is then discarded, since it is at least 10 and therefore not below the radix.
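
As a minimal sketch of that reasoning (the helper name `decimal_digit_xor` is hypothetical and not part of this change; `ASCII_DIGIT_MASK` is the constant name used in the snippet further down, with the value described above), the XOR followed by a single range check could look like this for `radix <= 10`:

```rust
/// The two bits that mark the ASCII digit range ('0' is 0b0011_0000).
const ASCII_DIGIT_MASK: u32 = 0b11_0000;

/// Hypothetical helper, only meant for `radix <= 10`.
/// XOR clears the two range bits of a true digit, leaving 0-9.
/// Any other char keeps at least one bit above the low four,
/// or lands in 10-15, so the result is never below 10.
fn decimal_digit_xor(c: char, radix: u32) -> Option<u32> {
    let val = c as u32 ^ ASCII_DIGIT_MASK;
    if val < radix {
        Some(val)
    } else {
        None
    }
}
```

For example, `'2'` is `0b0011_0010`, which XORs to `0b0010`, while `':'` (the char right after `'9'`) XORs to 10 and is rejected for every radix up to 10.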

The code seems so fast, though, that there's quite a lot of noise in the benchmarks, so it's not easy to prove conclusively that it's faster as well as fewer instructions.

I was toying with the non-fast path as well, wondering whether we could do the following, since we would then have only one return and fewer instructions still:
```
            match self {
                'a'..='z' => self as u32 - 'a' as u32 + 10,
                'A'..='Z' => self as u32 - 'A' as u32 + 10,
                _ => { radix = 10; self as u32 ^ ASCII_DIGIT_MASK },
            }
```
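
To make the idea above concrete, here is a standalone sketch of how that single-return variant might fit together. It is not the code that was merged: the free function `to_digit_single_return` is a hypothetical name, and the final `val < radix` check is an assumption about how the result gets filtered.

```rust
/// The two bits that mark the ASCII digit range ('0' is 0b0011_0000).
const ASCII_DIGIT_MASK: u32 = 0b11_0000;

/// Hypothetical free-function version of the single-return idea.
fn to_digit_single_return(c: char, mut radix: u32) -> Option<u32> {
    assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");
    let val = if radix <= 10 {
        // Fast path: a non-digit always yields a value of at least 10.
        c as u32 ^ ASCII_DIGIT_MASK
    } else {
        match c {
            'a'..='z' => c as u32 - 'a' as u32 + 10,
            'A'..='Z' => c as u32 - 'A' as u32 + 10,
            // Not a letter: only 0-9 may pass, so clamp the radix to 10.
            // Without this, ':' (which XORs to 10) would pass for radix 16.
            _ => {
                radix = 10;
                c as u32 ^ ASCII_DIGIT_MASK
            }
        }
    };
    // Assumed single exit point: everything invalid ends up >= radix.
    if val < radix {
        Some(val)
    } else {
        None
    }
}
```

Lowering `radix` to 10 in the fallback arm is what lets one comparison at the end reject both out-of-range letters and non-alphanumeric chars.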

Here's the [godbolt](https://godbolt.org/z/883c9n).

(H/T to `@byteshadow` for pointing out that XOR was what I needed.)
-rw-r--r-- library/core/src/char/methods.rs 12
1 files changed, 5 insertions, 7 deletions
diff --git a/library/core/src/char/methods.rs b/library/core/src/char/methods.rs
index e450240527a..64ae7db0d9b 100644
--- a/library/core/src/char/methods.rs
+++ b/library/core/src/char/methods.rs
@@ -1,5 +1,6 @@
 //! impl char {}
 
+use crate::intrinsics::likely;
 use crate::slice;
 use crate::str::from_utf8_unchecked_mut;
 use crate::unicode::printable::is_printable;
@@ -330,16 +331,13 @@ impl char {
     #[stable(feature = "rust1", since = "1.0.0")]
     #[inline]
     pub fn to_digit(self, radix: u32) -> Option<u32> {
+        assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");
         // the code is split up here to improve execution speed for cases where
         // the `radix` is constant and 10 or smaller
-        let val = if radix <= 10 {
-            match self {
-                '0'..='9' => self as u32 - '0' as u32,
-                _ => return None,
-            }
+        let val = if likely(radix <= 10) {
+            // If not a digit, a number greater than radix will be created.
+            (self as u32).wrapping_sub('0' as u32)
         } else {
-            assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");
-
             match self {
                 '0'..='9' => self as u32 - '0' as u32,
                 'a'..='z' => self as u32 - 'a' as u32 + 10,