path: root/library/core/src/char/methods.rs
Age | Commit message | Author | Lines
2025-09-07 | optimization: Don't include ASCII characters in Unicode tables | Karl Meakin | -1/+37
The ASCII subset of Unicode is fixed and will never change, so we don't need to generate tables for it with every new Unicode version. This saves a few bytes of static data and speeds up `char::is_control` and `char::is_grapheme_extended` on ASCII inputs. Since the table lookup functions exported from the `unicode` module will give nonsensical errors on ASCII input (and in fact will panic in debug mode), I had to add some private wrapper methods to `char` which check for ASCII-ness first.
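The wrapper idea described above can be sketched as follows. This is an illustration only, not the code in `methods.rs`: `non_ascii_table_lookup` and `is_grapheme_extended_sketch` are hypothetical names, and the "table" is a one-range placeholder (the Combining Diacritical Marks block) rather than generated Unicode data.

```rust
/// Hypothetical stand-in for a generated table lookup that only covers
/// non-ASCII code points. The placeholder "table" is the single range
/// U+0300..=U+036F, not the real Grapheme_Extend data.
fn non_ascii_table_lookup(c: char) -> bool {
    // Mirrors the behaviour described in the commit message: calling the
    // lookup with ASCII input is a bug and is caught in debug builds.
    debug_assert!(!c.is_ascii(), "table does not cover ASCII");
    matches!(c, '\u{0300}'..='\u{036F}')
}

/// Private-wrapper pattern: answer ASCII input directly, and only consult
/// the table for non-ASCII input.
fn is_grapheme_extended_sketch(c: char) -> bool {
    // No ASCII character has the Grapheme_Extend property, so the ASCII
    // answer is a constant and the table is never touched.
    !c.is_ascii() && non_ascii_table_lookup(c)
}
```

Because the ASCII check short-circuits, the debug assertion in the lookup can never fire through the wrapper.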
2025-08-30 | Auto merge of #145479 - Kmeakin:km/hardcode-char-is-control, r=joboet | bors | -1/+5
Hard-code `char::is_control`
Split off from https://github.com/rust-lang/rust/pull/145219
According to https://www.unicode.org/policies/stability_policy.html#Property_Value, the set of codepoints in `Cc` will never change. So we can hard-code the patterns to match against instead of using a table.
This doesn't change the generated assembly, since the lookup table is small enough that [LLVM is able to inline the whole search](https://godbolt.org/z/bG8dM37YG). But this does reduce the chance of regressions if LLVM's heuristics change in the future, and means less generated Rust code checked in to `unicode-data.rs`.
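A minimal sketch of what hard-coding the `Cc` category can look like. The function name `is_control_sketch` is hypothetical and the exact patterns in the real patch may differ; the ranges are the C0 controls plus DEL and the C1 controls, which together make up `Cc`.

```rust
/// Hard-coded check for the Unicode general category `Cc`:
/// U+0000..=U+001F (C0 controls) and U+007F..=U+009F (DEL plus C1 controls).
/// Hypothetical sketch, not the code from the pull request.
const fn is_control_sketch(c: char) -> bool {
    matches!(c, '\0'..='\x1f' | '\x7f'..='\u{9f}')
}
```

Since the set is frozen by the Unicode stability policy, a pattern like this never needs regenerating when the Unicode tables are updated.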
2025-08-17 | Optimize `char::encode_utf8` | Karl Meakin | -21/+26
Save a few instructions in `encode_utf8_raw_unchecked` by performing manual CSE.
2025-08-16 | refactor: Hard-code `char::is_control` | Karl Meakin | -1/+5
According to https://www.unicode.org/policies/stability_policy.html#Property_Value, the set of codepoints in `Cc` will never change. So we can hard-code the patterns to match against instead of using a table.
2025-08-07 | Optimize `char::is_alphanumeric` | Karl Meakin | -1/+5
Avoid an unnecessary call to `unicode::Alphabetic` when `self` is an ASCII digit (i.e. `'0'..='9'`).
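The fast path can be sketched like this. `unicode::Alphabetic` is private to `core`, so the sketch assumes the public `char::is_alphabetic` and `char::is_numeric` as stand-ins for the table lookups; `is_alphanumeric_sketch` is a hypothetical name.

```rust
/// ASCII digits are answered by a range check, so neither table-backed
/// lookup runs for them. Everything else falls through to the general path.
fn is_alphanumeric_sketch(c: char) -> bool {
    match c {
        // Fast path: '0'..='9' are numeric, no lookup needed.
        '0'..='9' => true,
        // General path: public stand-ins for the private Unicode tables.
        _ => c.is_alphabetic() || c.is_numeric(),
    }
}
```

The result is unchanged for every input; only the amount of work done for ASCII digits differs.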
2025-06-20 | Add diagnostic items for Clippy | Samuel Tardieu | -0/+1
2025-05-16 | Add assert_unsafe_precondition!()s to as_ascii_unchecked() methods | sam skeoch | -0/+7
2025-05-16 | Add as_ascii_unchecked() methods to char, str, and u8 | sam skeoch | -0/+14
2025-04-10 | Auto merge of #139279 - BoxyUwU:bump-boostrap, r=jieyouxu | bors | -2/+2
Bump bootstrap compiler to new beta
try-job: `*msvc*`
2025-04-09 | replace version placeholder | Boxy | -2/+2
2025-04-09 | Speed up `String::push` and `String::insert` | lincot | -29/+61
Improve performance of `String` methods by avoiding unnecessary memcpy for the character bytes, with added codegen check to ensure compliance.
2025-03-16 | Rollup merge of #138082 - thaliaarchi:slice-cfg-not-test, r=thomcc | 许杰友 Jieyou Xu (Joe) | -1/+1
Remove `#[cfg(not(test))]` gates in `core`
These gates are unnecessary now that unit tests for `core` are in a separate package, `coretests`, instead of in the same files as the source code. They previously prevented the two `core` versions from conflicting with each other.
2025-03-06 | stabilize const_char_classify | Ralf Jung | -2/+2
2025-03-06 | Remove #[cfg(not(test))] gates in core | Thalia Archibald | -1/+1
These gates are unnecessary now that unit tests for `core` are in a separate package, `coretests`, instead of in the same files as the source code. They previously prevented the two `core` versions from conflicting with each other.
2025-02-19 | Rollup merge of #120580 - HTGAzureX1212:HTGAzureX1212/issue-45795, r=m-ou-se | Matthias Krüger | -0/+10
Add `MAX_LEN_UTF8` and `MAX_LEN_UTF16` Constants
This pull request adds the `MAX_LEN_UTF8` and `MAX_LEN_UTF16` constants as per #45795, gated behind the `char_max_len` feature. The constants are currently applied in the `alloc`, `core` and `std` libraries.
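A short sketch of the kind of code these constants are meant for: sizing encode buffers without magic numbers. Because the `char` constants are described above as feature-gated, the sketch defines local stand-ins with the same values (4 bytes for UTF-8, 2 code units for UTF-16); `encoded_lens` is a hypothetical helper.

```rust
/// Local stand-ins for the feature-gated `char::MAX_LEN_UTF8` and
/// `char::MAX_LEN_UTF16`; the values are the maximum encoded lengths
/// of any `char`.
const MAX_LEN_UTF8: usize = 4;
const MAX_LEN_UTF16: usize = 2;

/// Encodes `c` into stack buffers sized by the constants and returns
/// (UTF-8 length in bytes, UTF-16 length in code units).
fn encoded_lens(c: char) -> (usize, usize) {
    let mut utf8 = [0u8; MAX_LEN_UTF8];
    let mut utf16 = [0u16; MAX_LEN_UTF16];
    (
        c.encode_utf8(&mut utf8).len(),
        c.encode_utf16(&mut utf16).len(),
    )
}
```

Buffers of these sizes are large enough for every `char`, including `char::MAX`, so neither encode call can panic for lack of space.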
2025-02-18 | add last std diagnostic items for clippy | cyrgani | -0/+1
2025-02-16 | add MAX_LEN_UTF8 and MAX_LEN_UTF16 constants | HTGAzureX1212 | -0/+10
2025-01-31 | Update encode_utf16 to mention it is native endian | Marijn Schouten | -3/+3
2024-12-31 | char to_digit: avoid unnecessary casts to u64 | Marcondiro | -7/+11
2024-11-27 | update cfgs | Boxy | -5/+0
2024-11-27 | replace placeholder version | Boxy | -4/+4
2024-11-14 | Auto merge of #132709 - programmerjake:optimize-charto_digit, r=joshtriplett | bors | -13/+31
optimize char::to_digit and assert radix is at least 2
approved by t-libs: https://github.com/rust-lang/libs-team/issues/475#issuecomment-2457858458
Let me know if this needs an assembly test or similar.
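To make the radix contract concrete, here is a deliberately plain sketch of a `to_digit`-style function that rejects radices outside `2..=36`. It is not the optimized implementation from the pull request; `to_digit_sketch` and its panic message are illustrative assumptions.

```rust
/// Straightforward digit conversion with the radix precondition made
/// explicit. Hypothetical sketch; the real implementation is optimized
/// differently.
fn to_digit_sketch(c: char, radix: u32) -> Option<u32> {
    // A radix below 2 cannot represent any number, and 36 is the limit
    // for the digits 0-9 plus the letters a-z.
    assert!(
        radix >= 2 && radix <= 36,
        "to_digit_sketch: radix must be in the range 2 to 36 inclusive"
    );
    let digit = match c {
        '0'..='9' => c as u32 - '0' as u32,
        'a'..='z' => c as u32 - 'a' as u32 + 10,
        'A'..='Z' => c as u32 - 'A' as u32 + 10,
        _ => return None,
    };
    // A digit is only valid if it is strictly smaller than the radix.
    if digit < radix { Some(digit) } else { None }
}
```

For radices in the valid range this sketch agrees with `char::to_digit`; a radix of 0 or 1 panics instead of quietly returning `None`.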
2024-11-12 | stabilize const_unicode_case_lookup | Ralf Jung | -4/+2
2024-11-06 | optimize char::to_digit and assert radix is at least 2 | Jacob Lifshay | -13/+31
approved by t-libs: https://github.com/rust-lang/libs-team/issues/475#issuecomment-2457858458
2024-11-06 | Auto merge of #132500 - RalfJung:char-is-whitespace-const, r=jhpratt | bors | -2/+3
make char::is_whitespace unstably const
I am adding this to the existing https://github.com/rust-lang/rust/issues/132241 feature gate, since `is_digit` and `is_whitespace` seem similar enough that one can group them together.
2024-11-05 | Auto merge of #132661 - matthiaskrgr:rollup-npytbl6, r=matthiaskrgr | bors | -1/+1
Rollup of 8 pull requests
Successful merges:
- #132259 (rustc_codegen_llvm: Add a new 'pc' option to branch-protection)
- #132409 (CI: switch 7 linux jobs to free runners)
- #132498 (Suggest fixing typos and let bindings at the same time)
- #132524 (chore(style): sync submodule exclusion list between tidy and rustfmt)
- #132567 (Properly suggest `E::assoc` when we encounter `E::Variant::assoc`)
- #132571 (add const_eval_select macro to reduce redundancy)
- #132637 (Do not filter empty lint passes & re-do CTFE pass)
- #132642 (Add documentation on `ast::Attribute`)
r? `@ghost`
`@rustbot` modify labels: rollup
2024-11-05 | add const_eval_select macro to reduce redundancy | Ralf Jung | -1/+1
also move internal const_panic helpers to a better location
2024-11-04 | Stabilise 'const_char_encode_utf16'; | Gabriel Bjørnager Jensen | -2/+5
2024-11-03 | Auto merge of #132542 - RalfJung:const_panic, r=tgross35 | bors | -24/+19
add const_panic macro to make it easier to fall back to non-formatting panic in const
Suggested by `@tgross35`
r? `@tgross35`
2024-11-03 | add const_panic macro to make it easier to fall back to non-formatting panic in const | Ralf Jung | -24/+19
2024-11-02 | make char::is_whitespace unstably const | Ralf Jung | -2/+3
2024-11-02 | get rid of a whole bunch of unnecessary rustc_const_unstable attributes | Ralf Jung | -1/+0
2024-10-27 | Support `char::is_digit` in const contexts | ultrabear | -1/+2
2024-10-25 | Re-do recursive const stability checks | Ralf Jung | -1/+1
Fundamentally, we have *three* disjoint categories of functions:
1. const-stable functions
2. private/unstable functions that are meant to be callable from const-stable functions
3. functions that can make use of unstable const features
This PR implements the following system:
- `#[rustc_const_stable]` puts functions in the first category. It may only be applied to `#[stable]` functions.
- `#[rustc_const_unstable]` by default puts functions in the third category. The new attribute `#[rustc_const_stable_indirect]` can be added to such a function to move it into the second category.
- `const fn` without a const stability marker are in the second category if they are still unstable. They automatically inherit the feature gate for regular calls, it can now also be used for const-calls.
Also, several holes in recursive const stability checking are being closed. There's still one potential hole that is hard to avoid, which is when MIR building automatically inserts calls to a particular function in stable functions -- which happens in the panic machinery. Those need to *not* be `rustc_const_unstable` (or manually get a `rustc_const_stable_indirect`) to be sure they follow recursive const stability. But that's a fairly rare and special case so IMO it's fine.
The net effect of this is that a `#[unstable]` or unmarked function can be constified simply by marking it as `const fn`, and it will then be const-callable from stable `const fn` and subject to recursive const stability requirements. If it is publicly reachable (which implies it cannot be unmarked), it will be const-unstable under the same feature gate. Only if the function ever becomes `#[stable]` does it need a `#[rustc_const_unstable]` or `#[rustc_const_stable]` marker to decide if this should also imply const-stability.
Adding `#[rustc_const_unstable]` is only needed for (a) functions that need to use unstable const lang features (including intrinsics), or (b) `#[stable]` functions that are not yet intended to be const-stable. Adding `#[rustc_const_stable]` is only needed for functions that are actually meant to be directly callable from stable const code. `#[rustc_const_stable_indirect]` is used to mark intrinsics as const-callable and for `#[rustc_const_unstable]` functions that are actually called from other, exposed-on-stable `const fn`. No other attributes are required.
2024-10-15 | update bootstrap configs | Josh Stone | -4/+0
2024-10-15 | replace placeholder version | Josh Stone | -3/+3
(cherry picked from commit 567fd9610cbfd220844443487059335d7e1ff021)
2024-10-14 | Stabilise 'const_make_ascii' | Gabriel Bjørnager Jensen | -2/+4
2024-10-10 | Stabilise 'const_char_encode_utf8'; | Gabriel Bjørnager Jensen | -2/+5
2024-10-01 | Rollup merge of #130773 - bjoernager:master, r=thomcc | Matthias Krüger | -2/+1
Update Unicode escapes in `/library/core/src/char/methods.rs`
`char::MAX` is inconsistent on how Unicode escapes should be formatted. This PR resolves that.
2024-09-28 | Update Unicode escapes; | Gabriel Bjørnager Jensen | -2/+1
2024-09-25 | Add 'must_use' attribute to 'char::len_utf8' and 'char::len_utf16'; | Gabriel Bjørnager Jensen | -0/+4
2024-09-23 | Rollup merge of #130713 - bjoernager:const-char-make-ascii, r=Noratrieb | Matthias Krüger | -2/+2
Mark `u8::make_ascii_uppercase` and `u8::make_ascii_lowercase` as const.
Relevant tracking issue: #130698
This PR extends #130697 by also marking the `make_ascii_uppercase` and `make_ascii_lowercase` methods in `u8` as const. The `const_char_make_ascii` feature gate is additionally renamed to `const_make_ascii`.
2024-09-23 | Rollup merge of #130659 - bjoernager:const-char-encode-utf16, r=dtolnay | Matthias Krüger | -25/+35
Support `char::encode_utf16` in const scenarios.
Relevant tracking issue: #130660
The method `char::encode_utf16` should be marked "const" to allow compile-time conversions. This PR additionally rewrites the `encode_utf16_raw` function for better readability whilst also reducing the amount of unsafe code.
try-job: x86_64-msvc
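As an illustration of the compile-time conversion this enables, the sketch below encodes a `char` to UTF-16 inside a `const fn`. It assumes a toolchain on which `char::encode_utf16` is const-stable (later entries in this log stabilise `const_char_encode_utf16`); `utf16_units` and `CLEF` are hypothetical names.

```rust
/// Encodes `c` as UTF-16 at compile time. Returns the buffer together with
/// the number of code units actually used (1 or 2).
const fn utf16_units(c: char) -> ([u16; 2], usize) {
    let mut buf = [0u16; 2];
    // The mutable borrow of `buf` ends once the length has been read.
    let len = c.encode_utf16(&mut buf).len();
    (buf, len)
}

/// U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP, so it needs a
/// surrogate pair; the pair is computed by the compiler, not at run time.
const CLEF: ([u16; 2], usize) = utf16_units('𝄞');
```

The same function still works at run time; `const fn` only adds the option of evaluating it in a `const` item.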
2024-09-22 | Mark 'make_ascii_uppercase' and 'make_ascii_lowercase' in 'u8' as const; Rename 'const_char_make_ascii' feature gate to 'const_make_ascii'; | Gabriel Bjørnager Jensen | -2/+2
2024-09-22 | Auto merge of #130697 - bjoernager:const-char-make-ascii, r=dtolnay | bors | -2/+4
Mark `char::make_ascii_uppercase` and `char::make_ascii_lowercase` as const.
Relevant tracking issue: #130698
The `make_ascii_uppercase` and `make_ascii_lowercase` methods in `char` should be marked "const." With the stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), this simply requires adding the `const` specifier to the function signatures.
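A small sketch of what the `const` marking permits: calling the in-place (`&mut self`) method from a `const fn`. It assumes a toolchain on which `const_make_ascii` is stable (see the 2024-10-14 entry); `ascii_upper` and `UPPER_A` are hypothetical names.

```rust
/// Uppercases an ASCII letter in a const context. The method takes
/// `&mut self`, which is why mutable references in const code are needed.
const fn ascii_upper(mut c: char) -> char {
    c.make_ascii_uppercase();
    c
}

/// Evaluated at compile time.
const UPPER_A: char = ascii_upper('a');
```

Non-ASCII characters are left untouched, exactly as with the non-const method.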
2024-09-22 | Mark 'make_ascii_uppercase' and 'make_ascii_lowercase' in 'char' as const; | Gabriel Bjørnager Jensen | -2/+4
2024-09-21 | Mark and implement 'char::encode_utf16' as const; Rewrite 'encode_utf16_raw'; | Gabriel Bjørnager Jensen | -25/+35
2024-09-20 | Address diagnostics regression for 'const_char_encode_utf8'; | Gabriel Bjørnager Jensen | -2/+12
2024-09-18 | Mark and implement 'char::encode_utf8' as const. | Gabriel Bjørnager Jensen | -18/+14
2024-09-12 | Rollup merge of #130101 - RalfJung:const-cleanup, r=fee1-dead | Matthias Krüger | -1/+1
some const cleanup: remove unnecessary attributes, add const-hack indications
I learned that we use `FIXME(const-hack)` on top of the "const-hack" label. That seems much better since it marks the right place in the code and moves around with the code. So I went through the PRs with that label and added appropriate FIXMEs in the code. IMO this means we can then remove the label -- Cc `@rust-lang/wg-const-eval`.
I also noticed some const stability attributes that don't do anything useful, and removed them.
r? `@fee1-dead`