about summary refs log tree commit diff
path: root/src/libcore/str
AgeCommit message (Collapse)AuthorLines
2019-08-08Use associated_type_bounds where applicable - closes #61738Ilija Tovilo-42/+74
2019-07-29comments from @lzutaoMaximilian Roos-1/+1
2019-07-29impl Debug for CharsMaximilian Roos-1/+11
2019-07-18Fix clippy::len_zero warningsMateusz Mikuła-2/+2
2019-07-17Auto merge of #61339 - jridgewell:pointer-alignment, r=BurntSushibors-6/+3
Optimize pointer alignment in utf8 validation This uses (and reuses) the u8 arrays's inherent block alignment when checking whether the current index is block aligned. I initially thought that this would just move the expensive `align_offset` call out of the while loop and replace it with a subtraction and bitwise AND. But it appears this optimizes much better, too... before: https://rust.godbolt.org/z/WIPvWl after: https://rust.godbolt.org/z/-jBPoW ## Benchmarks https://github.com/jridgewell/faster-from_utf8/tree/pointer-alignment ``` test from_utf8_2_bytes_fast ... bench: 310 ns/iter (+/- 42) = 1290 MB/s test from_utf8_2_bytes_regular ... bench: 309 ns/iter (+/- 24) = 1294 MB/s test from_utf8_3_bytes_fast ... bench: 1,027 ns/iter (+/- 62) = 1168 MB/s test from_utf8_3_bytes_regular ... bench: 1,513 ns/iter (+/- 611) = 793 MB/s test from_utf8_4_bytes_fast ... bench: 1,788 ns/iter (+/- 26) = 1342 MB/s test from_utf8_4_bytes_regular ... bench: 1,907 ns/iter (+/- 181) = 1258 MB/s test from_utf8_all_bytes_fast ... bench: 3,463 ns/iter (+/- 97) = 1155 MB/s test from_utf8_all_bytes_regular ... bench: 4,083 ns/iter (+/- 89) = 979 MB/s test from_utf8_ascii_fast ... bench: 88 ns/iter (+/- 4) = 28988 MB/s test from_utf8_ascii_regular ... bench: 88 ns/iter (+/- 8) = 28988 MB/s test from_utf8_cyr_fast ... bench: 7,707 ns/iter (+/- 531) = 665 MB/s test from_utf8_cyr_regular ... bench: 8,202 ns/iter (+/- 135) = 625 MB/s test from_utf8_enwik8_fast ... bench: 1,135,756 ns/iter (+/- 84,450) = 8804 MB/s test from_utf8_enwik8_regular ... bench: 1,145,468 ns/iter (+/- 79,601) = 8730 MB/s test from_utf8_jawik10_fast ... bench: 12,723,844 ns/iter (+/- 473,247) = 785 MB/s test from_utf8_jawik10_regular ... bench: 13,384,596 ns/iter (+/- 666,997) = 747 MB/s test from_utf8_mixed_fast ... bench: 2,321 ns/iter (+/- 123) = 2081 MB/s test from_utf8_mixed_regular ... bench: 2,702 ns/iter (+/- 408) = 1788 MB/s test from_utf8_mostlyasc_fast ... bench: 249 ns/iter (+/- 10) = 14666 MB/s test from_utf8_mostlyasc_regular ... bench: 276 ns/iter (+/- 5) = 13231 MB/s ```
2019-07-05Do not use pointer alignment on unsupported platformsJustin Ridgewell-1/+1
2019-07-04Rollup merge of #62316 - khuey:efficient_last, r=sfacklerMazdak Farrokhzad-0/+15
When possible without changing semantics, implement Iterator::last in terms of DoubleEndedIterator::next_back for types in liballoc and libcore. Provided that the iterator has finite length and does not trigger user-provided code, this is safe. What follows is a full list of the DoubleEndedIterators in liballoc/libcore and whether this optimization is safe, and if not, why not. src/liballoc/boxed.rs Box: Pass through to avoid defeating optimization of the underlying DoubleIterator implementation. This has no correctness impact. src/liballoc/collections/binary_heap.rs Iter: Pass through to avoid defeating optimizations on slice::Iter IntoIter: Not safe, changes Drop order Drain: Not safe, changes Drop order src/liballoc/collections/btree/map.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Keys: Safe to call next_back, invokes no user defined code. Values: ditto ValuesMut: ditto Range: ditto RangeMut: ditto src/liballoc/collections/btree/set.rs Iter: Safe to call next_back, invokes no user defined code. IntoIter: Not safe, changes Drop order Range: Safe to call next_back, invokes no user defined code. src/liballoc/collections/linked_list.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order src/liballoc/collections/vec_deque.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Drain: ditto src/liballoc/string.rs Drain: Safe because return type is a primitive (char) src/liballoc/vec.rs IntoIter: Not safe, changes Drop order Drain: ditto Splice: ditto src/libcore/ascii.rs EscapeDefault: Safe because return type is a primitive (u8) src/libcore/iter/adapters/chain.rs Chain: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/adapters/flatten.rs FlatMap: Not safe, invokes user defined code (Iterator impl) Flatten: ditto FlattenCompat: ditto src/libcore/iter/adapters/mod.rs Rev: Not safe, invokes user defined code (Iterator impl) Copied: ditto Cloned: Not safe, invokes user defined code (Iterator impl and T::clone) Map: Not safe, invokes user defined code (Iterator impl + closure) Filter: ditto FilterMap: ditto Enumerate: Not safe, invokes user defined code (Iterator impl) Skip: ditto Fuse: ditto Inspect: ditto src/libcore/iter/adapters/zip.rs Zip: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/range.rs ops::Range: Not safe, changes Drop order, but ALREADY HAS SPECIALIZATION ops::RangeInclusive: ditto src/libcore/iter/sources.rs Repeat: Not safe, calling last should iloop. Empty: No point, iterator is at most one item long. Once: ditto OnceWith: ditto src/libcore/option.rs Item: No point, iterator is at most one item long. Iter: ditto IterMut: ditto IntoIter: ditto src/libcore/result.rs Iter: No point, iterator is at most one item long IterMut: ditto IntoIter: ditto src/libcore/slice/mod.rs Split: Not safe, invokes user defined closure SplitMut: ditto RSplit: ditto RSplitMut: ditto Windows: Safe, already has specialization Chunks: ditto ChunksMut: ditto ChunksExact: ditto ChunksExactMut: ditto RChunks: ditto RChunksMut: ditto RChunksExact: ditto RChunksExactMut: ditto src/libcore/str/mod.rs Chars: Safe, already has specialization CharIndices: ditto Bytes: ditto Lines: Safe to call next_back, invokes no user defined code. LinesAny: Deprecated Everything that is generic over P: Pattern: Not safe because Pattern invokes user defined code. SplitWhitespace: Safe to call next_back, invokes no user defined code. SplitAsciiWhitespace: ditto This is attempt 2 of #60130. r? @sfackler
2019-07-03Fixed document bug, those replaced each otherKohei Takahashi-16/+16
Introduced by #58005
2019-07-02When possible without changing semantics, implement Iterator::last in terms ↵Kyle Huey-0/+15
of DoubleEndedIterator::next_back for types in liballoc and libcore. Provided that the iterator has finite length and does not trigger user-provided code, this is safe. What follows is a full list of the DoubleEndedIterators in liballoc/libcore and whether this optimization is safe, and if not, why not. src/liballoc/boxed.rs Box: Pass through to avoid defeating optimization of the underlying DoubleIterator implementation. This has no correctness impact. src/liballoc/collections/binary_heap.rs Iter: Pass through to avoid defeating optimizations on slice::Iter IntoIter: Not safe, changes Drop order Drain: Not safe, changes Drop order src/liballoc/collections/btree/map.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Keys: Safe to call next_back, invokes no user defined code. Values: ditto ValuesMut: ditto Range: ditto RangeMut: ditto src/liballoc/collections/btree/set.rs Iter: Safe to call next_back, invokes no user defined code. IntoIter: Not safe, changes Drop order Range: Safe to call next_back, invokes no user defined code. src/liballoc/collections/linked_list.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order src/liballoc/collections/vec_deque.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Drain: ditto src/liballoc/string.rs Drain: Safe because return type is a primitive (char) src/liballoc/vec.rs IntoIter: Not safe, changes Drop order Drain: ditto Splice: ditto src/libcore/ascii.rs EscapeDefault: Safe because return type is a primitive (u8) src/libcore/iter/adapters/chain.rs Chain: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/adapters/flatten.rs FlatMap: Not safe, invokes user defined code (Iterator impl) Flatten: ditto FlattenCompat: ditto src/libcore/iter/adapters/mod.rs Rev: Not safe, invokes user defined code (Iterator impl) Copied: ditto Cloned: Not safe, invokes user defined code (Iterator impl and T::clone) Map: Not safe, invokes user defined code (Iterator impl + closure) Filter: ditto FilterMap: ditto Enumerate: Not safe, invokes user defined code (Iterator impl) Skip: ditto Fuse: ditto Inspect: ditto src/libcore/iter/adapters/zip.rs Zip: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/range.rs ops::Range: Not safe, changes Drop order, but ALREADY HAS SPECIALIZATION ops::RangeInclusive: ditto src/libcore/iter/sources.rs Repeat: Not safe, calling last should iloop. Empty: No point, iterator is at most one item long. Once: ditto OnceWith: ditto src/libcore/option.rs Item: No point, iterator is at most one item long. Iter: ditto IterMut: ditto IntoIter: ditto src/libcore/result.rs Iter: No point, iterator is at most one item long IterMut: ditto IntoIter: ditto src/libcore/slice/mod.rs Split: Not safe, invokes user defined closure SplitMut: ditto RSplit: ditto RSplitMut: ditto Windows: Safe, already has specialization Chunks: ditto ChunksMut: ditto ChunksExact: ditto ChunksExactMut: ditto RChunks: ditto RChunksMut: ditto RChunksExact: ditto RChunksExactMut: ditto src/libcore/str/mod.rs Chars: Safe, already has specialization CharIndices: ditto Bytes: ditto Lines: Safe to call next_back, invokes no user defined code. LinesAny: Deprecated Everything that is generic over P: Pattern: Not safe because Pattern invokes user defined code. SplitWhitespace: Safe to call next_back, invokes no user defined code. SplitAsciiWhitespace: ditto
2019-06-10make sure to_ascii_lowercase actually leaves upper-case non-ASCII characters ↵Ralf Jung-2/+2
alone
2019-06-08Add examples for make_ascii_{uppercase, lowercase}Napen123-0/+20
2019-05-29Optimize pointer alignment in utf8 validationJustin Ridgewell-6/+3
This uses (and reuses) the u8 arrays's inherent block alignment when checking whether the current index is block aligned. I initially thought that this would just move the expensive `align_offset` call out of the while loop and replace it with a subtraction and bitwise AND. But it appears this optimizes much better, too... before: https://rust.godbolt.org/z/WIPvWl after: https://rust.godbolt.org/z/-jBPoW https://github.com/jridgewell/faster-from_utf8/tree/pointer-alignment ``` test from_utf8_2_bytes_fast ... bench: 310 ns/iter (+/- 42) = 1290 MB/s test from_utf8_2_bytes_regular ... bench: 309 ns/iter (+/- 24) = 1294 MB/s test from_utf8_3_bytes_fast ... bench: 1,027 ns/iter (+/- 62) = 1168 MB/s test from_utf8_3_bytes_regular ... bench: 1,513 ns/iter (+/- 611) = 793 MB/s test from_utf8_4_bytes_fast ... bench: 1,788 ns/iter (+/- 26) = 1342 MB/s test from_utf8_4_bytes_regular ... bench: 1,907 ns/iter (+/- 181) = 1258 MB/s test from_utf8_all_bytes_fast ... bench: 3,463 ns/iter (+/- 97) = 1155 MB/s test from_utf8_all_bytes_regular ... bench: 4,083 ns/iter (+/- 89) = 979 MB/s test from_utf8_ascii_fast ... bench: 88 ns/iter (+/- 4) = 28988 MB/s test from_utf8_ascii_regular ... bench: 88 ns/iter (+/- 8) = 28988 MB/s test from_utf8_cyr_fast ... bench: 7,707 ns/iter (+/- 531) = 665 MB/s test from_utf8_cyr_regular ... bench: 8,202 ns/iter (+/- 135) = 625 MB/s test from_utf8_enwik8_fast ... bench: 1,135,756 ns/iter (+/- 84,450) = 8804 MB/s test from_utf8_enwik8_regular ... bench: 1,145,468 ns/iter (+/- 79,601) = 8730 MB/s test from_utf8_jawik10_fast ... bench: 12,723,844 ns/iter (+/- 473,247) = 785 MB/s test from_utf8_jawik10_regular ... bench: 13,384,596 ns/iter (+/- 666,997) = 747 MB/s test from_utf8_mixed_fast ... bench: 2,321 ns/iter (+/- 123) = 2081 MB/s test from_utf8_mixed_regular ... bench: 2,702 ns/iter (+/- 408) = 1788 MB/s test from_utf8_mostlyasc_fast ... bench: 249 ns/iter (+/- 10) = 14666 MB/s test from_utf8_mostlyasc_regular ... bench: 276 ns/iter (+/- 5) = 13231 MB/s ```
2019-05-22Revert "Add implementations of last in terms of next_back on a bunch of ↵Steven Fackler-20/+0
DoubleEndedIterators." This reverts commit 3e86cf36b5114f201868bf459934fe346a76a2d4.
2019-05-14Rollup merge of #60443 - RalfJung:as_ptr, r=SimonSapinMazdak Farrokhzad-0/+4
as_ptr returns a read-only pointer Add comments to `as_ptr` methods to warn that these are read-only pointers, and writing to them is UB. [It was pointed out](https://internals.rust-lang.org/t/as-ptr-vs-as-mut-ptr/9940) that `CStr` does not even have an `as_mut_ptr`. I originally was going to add one, but there is no method at all that would mutate a `CStr`. Was that a deliberate choice or should I add an `as_mut_ptr` (similar to [what I did for `str`](https://github.com/rust-lang/rust/pull/58200))?
2019-05-14Rollup merge of #60130 - khuey:efficient_last, r=sfacklerMazdak Farrokhzad-0/+20
Add implementations of last in terms of next_back on a bunch of DoubleEndedIterators Provided a `DoubleEndedIterator` has finite length, `Iterator::last` is equivalent to `DoubleEndedIterator::next_back`. But searching forwards through the iterator when it's unnecessary is obviously not good for performance. I ran into this on one of the collection iterators. I tried adding appropriate overloads for a bunch of the iterator adapters like filter, map, etc, but I ran into a lot of type inference failures after doing so. The other interesting case is what to do with `Repeat`. Do we consider it part of the contract that `Iterator::last` will loop forever on it? The docs do say that the iterator will be evaluated until it returns None. This is also relevant for the adapters, it's trivially easy to observe whether a `Map` adapter invoked its closure a zillion times or just once for the last element.
2019-05-02clarify wordingRalf Jung-3/+2
2019-05-01as_ptr returns a read-only pointerRalf Jung-0/+5
2019-04-29Rollup merge of #60356 - JohnTitor:stabilize-as-mut-ptr, r=CentrilMazdak Farrokhzad-1/+1
Stabilize str::as_mut_ptr Closes #58215
2019-04-29Rollup merge of #59946 - mernen:patch-2, r=ehussMazdak Farrokhzad-1/+1
Fix equivalent string in escape_default docs This newline should be escaped.
2019-04-29Stabilize str::as_mut_ptrYuki OKUSHI-1/+1
2019-04-25ignore-tidy-filelength on all files with greater than 3000 linesvarkor-0/+2
2019-04-19Add implementations of last in terms of next_back on a bunch of ↵Kyle Huey-0/+20
DoubleEndedIterators. r?Manishearth
2019-04-19Rollup merge of #60098 - Centril:libcore-deny-more, r=varkorMazdak Farrokhzad-25/+25
libcore: deny `elided_lifetimes_in_paths` r? @varkor
2019-04-19Rollup merge of #60023 - koalatux:nth-back, r=scottmcmMazdak Farrokhzad-0/+5
implement specialized nth_back() for Bytes, Fuse and Enumerate Hi, After my first PR has been successfully merged, here is my second pull request :-) Also this PR contains some specializations for the problem discussed in #54054.
2019-04-19libcore: deny more...Mazdak Farrokhzad-25/+25
2019-04-18libcore => 2018Taiki Endo-20/+19
2019-04-16implement nth_back for BytesAdrian Friedli-0/+5
2019-04-13Fix equivalent string in escape_defaultDaniel Luz-1/+1
2019-04-12Fix paste error in split_ascii_whitespace docs.Scott Olson-1/+1
2019-03-22add suggestions to trim_{left,right} deprecationsAndy Russell-4/+20
2019-03-11Improvements to comments in libstd, libcore, liballoc.Alexander Regueiro-4/+4
2019-02-27Replace `s` with `self` in docs for str methods taking self.Trevor Spiteri-3/+3
2019-02-20Rollup merge of #58553 - scottmcm:more-ihle, r=Centrilkennytm-1/+1
Use more impl header lifetime elision Inspired by seeing explicit lifetimes on these two: - https://doc.rust-lang.org/nightly/std/slice/struct.Iter.html#impl-FusedIterator - https://doc.rust-lang.org/nightly/std/primitive.u32.html#impl-Not And a follow-up to https://github.com/rust-lang/rust/pull/54687, that started using IHLE in libcore. Most of the changes in here fall into two big categories: - Removing lifetimes from common traits that can essentially never user a lifetime from an input (particularly `Drop`, `Debug`, and `Clone`) - Forwarding impls that are only possible because the lifetime doesn't matter (like `impl<R: Read + ?Sized> Read for &mut R`) I omitted things that seemed like they could be more controversial, like the handful of iterators that have a `Item: 'static` despite the iterator having a lifetime or the `PartialEq` implementations [where the flipped one cannot elide the lifetime](https://internals.rust-lang.org/t/impl-type-parameter-aliases/9403/2?u=scottmcm). I also removed two lifetimes that turned out to be completely unused; see https://github.com/rust-lang/rust/issues/41960#issuecomment-464557423
2019-02-17Use more impl header lifetime elisionScott McMurray-1/+1
There are two big categories of changes in here - Removing lifetimes from common traits that can essentially never user a lifetime from an input (particularly `Drop` & `Debug`) - Forwarding impls that are only possible because the lifetime doesn't matter (like `impl<R: Read + ?Sized> Read for &mut R`) I omitted things that seemed like they could be more controversial, like the handful of iterators that have a `Item: 'static` despite the iterator having a lifetime or the `PartialEq` implementations where the flipped one cannot elide the lifetime.
2019-02-13Rollup merge of #58200 - RalfJung:str-as-mut-ptr, r=SimonSapinMazdak Farrokhzad-7/+23
fix str mutating through a ptr derived from &self Found by Miri: In `get_unchecked_mut` (also used by the checked variants internally) uses `str::as_ptr` to create a mutable reference, but `as_ptr` takes `&self`. This means the mutable references we return here got created from a shared reference, which violates the shared-references-are-read-only discipline! For this by using a newly introduced `as_mut_ptr` instead.
2019-02-12Auto merge of #58051 - SimonSapin:str_escape, r=alexcrichtonbors-114/+243
Stabilize str::escape_* methods with new return types… … that implement `Display` and `Iterator<Item=char>`, as proposed in FCP: https://github.com/rust-lang/rust/issues/27791#issuecomment-376864727
2019-02-12Add doctests for str::escape_*Simon Sapin-3/+93
2019-02-12Move str::escape_* to libcoreSimon Sapin-1/+125
2019-02-12Add internal impl_fn_for_zst macro for "named closure types"Simon Sapin-113/+28
2019-02-10libs: doc commentsAlexander Regueiro-9/+9
2019-02-10tests: doc commentsAlexander Regueiro-10/+10
2019-02-10rustc: doc commentsAlexander Regueiro-4/+4
2019-02-07also fix bad use of shared ref in split_at_mutRalf Jung-1/+1
2019-02-06add tracking issueRalf Jung-1/+1
2019-02-06remove now-unneeded raw ptr castsRalf Jung-3/+3
2019-02-05fix str mutating through a ptr derived from &selfRalf Jung-3/+19
2019-02-04update split docsgaryemerson-14/+14
Some confusion about split popped up at https://news.ycombinator.com/item?id=19080931 since the docs sorta sound like `&str`, `char` and closures are the only types that can be patterns. cc @steveklabnik
2019-02-01Stabilize split_ascii_whitespaceSimon Sapin-6/+5
Tracking issue FCP to merge: https://github.com/rust-lang/rust/issues/48656#issuecomment-442372750
2019-01-31Rollup merge of #58005 - vitiral:docs_trim_start_matches, r=ManishearthMazdak Farrokhzad-8/+8
update docs for fix_start/end_matches fixes #57686:
2019-01-31Rollup merge of #57106 - matthiaskrgr:trim_must_use, r=sfacklerMazdak Farrokhzad-0/+12
Mark str::trim.* functions as #[must_use]. The functions return a reference to a new object and do not modify in-place as the following code shows: ```` let s = String::from(" hello "); s.trim(); assert_eq!(s, " hello "); ```` The new reference should be bound to a variable as now indicated by #[must_use].