about summary refs log tree commit diff
path: root/library/core/src/str/validations.rs
AgeCommit message (Collapse)AuthorLines
2025-07-05use `is_multiple_of` instead of manual moduloFolkert de Vries-1/+1
2025-03-06library: Use size_of from the prelude instead of importedThalia Archibald-2/+1
Use `std::mem::{size_of, size_of_val, align_of, align_of_val}` from the prelude instead of importing or qualifying them. These functions were added to all preludes in Rust 1.80.
2024-11-05add const_eval_select macro to reduce redundancyRalf Jung-10/+7
also move internal const_panic helpers to a better location
2024-11-03remove const-support for align_offsetRalf Jung-6/+20
Operations like is_aligned would return actively wrong results at compile-time, i.e. calling it on the same pointer at compiletime and runtime could yield different results. That's no good. Instead of having hacks to make align_offset kind-of work in const-eval, just use const_eval_select in the few places where it makes sense, which also ensures those places are all aware they need to make sure the fallback behavior is consistent.
2024-07-29Reformat `use` declarations.Nicholas Nethercote-2/+1
The previous commit updated `rustfmt.toml` appropriately. This commit is the outcome of running `x fmt --all` with the new formatting options.
2024-02-12Fix comment in core/src/str/validations.rsPizzasBear-1/+1
2022-08-21Replace most uses of `pointer::offset` with `add` and `sub`Maybe Waffle-2/+2
2022-04-15Make some `usize`-typed masks definition agnostic to the size of `usize`Eduardo Sánchez Muñoz-2/+1
Some masks where defined as ```rust const NONASCII_MASK: usize = 0x80808080_80808080u64 as usize; ``` where it was assumed that `usize` is never wider than 64, which is currently true. To make those constants valid in a hypothetical 128-bit target, these constants have been redefined in an `usize`-width-agnostic way ```rust const NONASCII_MASK: usize = usize::from_ne_bytes([0x80; size_of::<usize>()]); ``` There are already some cases where Rust anticipates the possibility of supporting 128-bit targets, such as not implementing `From<usize>` for `u64`.
2022-02-07Add {floor,ceil}_char_boundary methods to strltdk-13/+0
2021-11-25Saner formatting for UTF8_CHAR_WIDTH tableDavid Tolnay-16/+17
2021-11-21libcore: assume the input of `next_code_point` and `next_code_point_reverse` ↵Eduardo Sánchez Muñoz-16/+28
is UTF-8-like The functions are now `unsafe` and they use `Option::unwrap_unchecked` instead of `unwrap_or_0` `unwrap_or_0` was added in 42357d772b8a3a1ce4395deeac0a5cf1f66e951d. I guess `unwrap_unchecked` was not available back then. Given this example: ```rust pub fn first_char(s: &str) -> Option<char> { s.chars().next() } ``` Previously, the following assembly was produced: ```asm _ZN7example10first_char17ha056ddea6bafad1cE: .cfi_startproc test rsi, rsi je .LBB0_1 movzx edx, byte ptr [rdi] test dl, dl js .LBB0_3 mov eax, edx ret .LBB0_1: mov eax, 1114112 ret .LBB0_3: lea r8, [rdi + rsi] xor eax, eax mov r9, r8 cmp rsi, 1 je .LBB0_5 movzx eax, byte ptr [rdi + 1] add rdi, 2 and eax, 63 mov r9, rdi .LBB0_5: mov ecx, edx and ecx, 31 cmp dl, -33 jbe .LBB0_6 cmp r9, r8 je .LBB0_9 movzx esi, byte ptr [r9] add r9, 1 and esi, 63 shl eax, 6 or eax, esi cmp dl, -16 jb .LBB0_12 .LBB0_13: cmp r9, r8 je .LBB0_14 movzx edx, byte ptr [r9] and edx, 63 jmp .LBB0_16 .LBB0_6: shl ecx, 6 or eax, ecx ret .LBB0_9: xor esi, esi mov r9, r8 shl eax, 6 or eax, esi cmp dl, -16 jae .LBB0_13 .LBB0_12: shl ecx, 12 or eax, ecx ret .LBB0_14: xor edx, edx .LBB0_16: and ecx, 7 shl ecx, 18 shl eax, 6 or eax, ecx or eax, edx ret ``` After this change, the assembly is reduced to: ```asm _ZN7example10first_char17h4318683472f884ccE: .cfi_startproc test rsi, rsi je .LBB0_1 movzx ecx, byte ptr [rdi] test cl, cl js .LBB0_3 mov eax, ecx ret .LBB0_1: mov eax, 1114112 ret .LBB0_3: mov eax, ecx and eax, 31 movzx esi, byte ptr [rdi + 1] and esi, 63 cmp cl, -33 jbe .LBB0_4 movzx edx, byte ptr [rdi + 2] shl esi, 6 and edx, 63 or edx, esi cmp cl, -16 jb .LBB0_7 movzx ecx, byte ptr [rdi + 3] and eax, 7 shl eax, 18 shl edx, 6 and ecx, 63 or ecx, edx or eax, ecx ret .LBB0_4: shl eax, 6 or eax, esi ret .LBB0_7: shl eax, 12 or eax, edx ret ```
2021-11-18Make slice->str conversion and related functions constMaybe Waffle-9/+10
This commit makes the following functions from `core::str` `const fn`: - `from_utf8[_mut]` (`feature(const_str_from_utf8)`) - `from_utf8_unchecked_mut` (`feature(const_str_from_utf8_unchecked_mut)`) - `Utf8Error::{valid_up_to,error_len}` (`feature(const_str_from_utf8)`)
2021-10-31Rollup merge of #89897 - jkugelman:must-use-core, r=joshtriplettMatthias Krüger-0/+1
Add #[must_use] to remaining core functions I've run out of compelling reasons to group functions together across crates so I'm just going to go module-by-module. This is everything remaining from the `core` crate. Ignored by clippy for reasons unknown: ```rust core::alloc::Layout unsafe fn for_value_raw<T: ?Sized>(t: *const T) -> Self; core::any const fn type_name_of_val<T: ?Sized>(_val: &T) -> &'static str; ``` Ignored by clippy because of `mut`: ```rust str fn split_at_mut(&mut self, mid: usize) -> (&mut str, &mut str); ``` <del> Ignored by clippy presumably because a caller might want `f` called for side effects. That seems like a bad usage of `map` to me. ```rust core::cell::Ref<'b, T> fn map<U: ?Sized, F>(orig: Ref<'b, T>, f: F) -> Ref<'b, T>; core::cell::Ref<'b, T> fn map_split<U: ?Sized, V: ?Sized, F>(orig: Ref<'b, T>, f: F) -> (Ref<'b, U>, Ref<'b, V>); ``` </del> Parent issue: #89692 r? ```@joshtriplett```
2021-10-30Add #[must_use] to remaining core functionsJohn Kugelman-0/+1
2021-10-27replace `|` with `||` in string validationPietro Albini-1/+1
Using short-circuiting operators makes it easier to perform some kinds of source code analysis, like MC/DC code coverage (a requirement in safety-critical environments). The optimized x86_64 assembly is equivalent between the old and new versions. Old assembly of that condition: ``` mov rax, qword ptr [rdi + rdx + 8] or rax, qword ptr [rdi + rdx] test rax, r9 je .LBB0_7 ``` New assembly of that condition: ``` mov rax, qword ptr [rdi + rdx] or rax, qword ptr [rdi + rdx + 8] test rax, r8 je .LBB0_7 ```
2021-09-11manually inline functionThe8472-4/+4
2021-09-11optimization continuation byte validation of strings containing multibyte charsThe8472-6/+4
``` old, -O2, x86-64 test str::str_validate_emoji ... bench: 4,606 ns/iter (+/- 64) new, -O2, x86-64 test str::str_validate_emoji ... bench: 3,837 ns/iter (+/- 60) ```
2021-09-11optimize utf8_is_cont_byte() to speed up str.chars().count()The8472-1/+1
it shows consistent improvements across several x86_64 feature levels ``` old, -O2, x86-64 test str::str_char_count_emoji ... bench: 1,924 ns/iter (+/- 26) test str::str_char_count_lorem ... bench: 879 ns/iter (+/- 12) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) new, -O2, x86-64 test str::str_char_count_emoji ... bench: 1,878 ns/iter (+/- 21) test str::str_char_count_lorem ... bench: 851 ns/iter (+/- 11) test str::str_char_count_lorem_short ... bench: 4 ns/iter (+/- 0) old, -O2, x86-64-v2 test str::str_char_count_emoji ... bench: 1,477 ns/iter (+/- 46) test str::str_char_count_lorem ... bench: 675 ns/iter (+/- 15) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) new, -O2, x86-64-v2 test str::str_char_count_emoji ... bench: 1,323 ns/iter (+/- 39) test str::str_char_count_lorem ... bench: 593 ns/iter (+/- 18) test str::str_char_count_lorem_short ... bench: 4 ns/iter (+/- 0) old, -O2, x86-64-v3 test str::str_char_count_emoji ... bench: 748 ns/iter (+/- 7) test str::str_char_count_lorem ... bench: 348 ns/iter (+/- 2) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) new, -O2, x86-64-v3 test str::str_char_count_emoji ... bench: 650 ns/iter (+/- 4) test str::str_char_count_lorem ... bench: 301 ns/iter (+/- 1) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) ```
2020-11-18Remove semicolon from internal `err` macroAaron Hill-1/+1
This macro is used in expression position (a match arm), and only compiles because of #33953 Regardless of what happens with that issue, this makes the usage of the macro less confusing at the call site.
2020-09-26Move utf-8 validating helpers to new modLzu Tao-0/+275