author: bors <bors@rust-lang.org> 2024-03-05 10:28:55 +0000
committer: bors <bors@rust-lang.org> 2024-03-05 10:28:55 +0000
commit: bdde2a80aef587cdbb8eb2d6e295d5c1d05830d9
tree: dc466492548dbb9388c10423a0b5c80a8d0119a4
parent: 41d97c8a5dea2731b0e56fe97cd7cb79e21cff79
parent: 8eaaa6e610d92e2b29ef1cf46a796cc27c96428d
download: rust-bdde2a80aef587cdbb8eb2d6e295d5c1d05830d9.tar.gz, rust-bdde2a80aef587cdbb8eb2d6e295d5c1d05830d9.zip
Auto merge of #121138 - Swatinem:grapheme-extend-ascii, r=cuviper
Add ASCII fast-path for `char::is_grapheme_extended`

I discovered that `impl Debug for str` is quite slow because it calls `unicode_data::grapheme_extend::lookup` for each char, and that lookup performs a binary search.

This introduces a fast path for ASCII chars, none of which have this property.

The `lookup` is thus completely gone from profiles.

---

As a follow-up, maybe it’s worth implementing this fast path directly in `unicode_data`, so that it checks the lower bound before falling back to a potentially expensive binary search.
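
A minimal sketch of what such a lower-bound check could look like, for illustration only. The names `RANGES`, `lookup_slow`, and `lookup` are hypothetical, and the table is a tiny hand-written subset; the real generated table in `unicode_data` is far larger and uses a different, more compact encoding.

```rust
use std::cmp::Ordering;

/// Hypothetical, heavily truncated table of inclusive code point ranges,
/// sorted by start. This is NOT the generated `unicode_data` table.
const RANGES: &[(u32, u32)] = &[
    (0x0300, 0x036F),
    (0x0483, 0x0489),
    (0x0591, 0x05BD),
];

/// Binary search over the sorted ranges (the "potentially expensive" part).
fn lookup_slow(c: char) -> bool {
    let cp = c as u32;
    RANGES
        .binary_search_by(|&(lo, hi)| {
            if hi < cp {
                // The whole range lies below the code point.
                Ordering::Less
            } else if lo > cp {
                // The whole range lies above the code point.
                Ordering::Greater
            } else {
                // lo <= cp <= hi: the code point is inside this range.
                Ordering::Equal
            }
        })
        .is_ok()
}

/// Lower-bound fast path: anything below the start of the first range
/// cannot match, so the binary search is skipped entirely.
fn lookup(c: char) -> bool {
    (c as u32) >= RANGES[0].0 && lookup_slow(c)
}
```

With this shape, `lookup('a')` returns `false` after a single comparison, while `lookup('\u{0301}')` still goes through the search and returns `true`. Compared with the hard-coded `'\x7f'` check, deriving the bound from the table would also skip the search for non-ASCII chars that sit below the first range.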
-rw-r--r-- library/core/src/char/methods.rs 2
1 file changed, 1 insertion, 1 deletion
diff --git a/library/core/src/char/methods.rs b/library/core/src/char/methods.rs
index a93b94867ce..65ae4831839 100644
--- a/library/core/src/char/methods.rs
+++ b/library/core/src/char/methods.rs
@@ -927,7 +927,7 @@ impl char {
     #[must_use]
     #[inline]
     pub(crate) fn is_grapheme_extended(self) -> bool {
-        unicode::Grapheme_Extend(self)
+        self > '\x7f' && unicode::Grapheme_Extend(self)
     }
 
     /// Returns `true` if this `char` has one of the general categories for numbers.