path: root/src/libcore/str
Age | Commit message | Author | Lines
2017-03-25 | Link ParseBoolError to from_str method of bool | Donnie Bishop | -1/+3
2017-03-22 | Rollup merge of #40722 - stjepang:doc-consistency-fixes, r=steveklabnik | Corey Farwell | -7/+7
Various fixes to wording consistency in the docs A bunch of random fixes, added punctuation, plurals, backticks, and so on... r? @steveklabnik
2017-03-22 | Tracking issue numbers | Simonas Kazlauskas | -10/+10
2017-03-22 | Checked (and unchecked) slicing for strings? | Simonas Kazlauskas | -23/+303
What is this magic‽
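As a rough illustration of what checked slicing looks like from the caller's side, the sketch below uses `str::get`, which exists in current stable Rust. The helper name `checked_prefix` is hypothetical and is not taken from the commit.

```rust
/// Hypothetical helper: returns the first `n` bytes of `s` as a string slice,
/// or `None` if `n` is out of bounds or falls inside a multi-byte character.
pub fn checked_prefix(s: &str, n: usize) -> Option<&str> {
    // `str::get` performs the bounds and char-boundary checks and reports
    // failure through `Option` instead of panicking like `&s[..n]` would.
    s.get(..n)
}
```

With the string `"abcαβγ"`, a prefix length of 3 succeeds, while 4 lands inside `α` and yields `None`.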
2017-03-22 | Various fixes to wording consistency in the docs | Stjepan Glavina | -7/+7
2017-03-21 | str: Make docs consistently punctuated | Sam Whited | -1/+1
2017-03-20 | Auto merge of #40281 - jimmycuadra:try-from-from-str, r=aturon | bors | -2/+5
Rename TryFrom's associated type and implement str::parse using TryFrom. Per discussion on the tracking issue, naming `TryFrom`'s associated type `Error` is generally more consistent with similar traits in the Rust ecosystem, and what people seem to assume it should be called. It also helps disambiguate from `Result::Err`, the most common "Err". See https://github.com/rust-lang/rust/issues/33417#issuecomment-269108968. `TryFrom<&str>` and `FromStr` are equivalent, so have the latter provide the former to ensure that. Using `TryFrom` in the implementation of `str::parse` means types that implement either trait can use it. When we're ready to stabilize `TryFrom`, we should update `FromStr` to suggest implementing `TryFrom<&str>` instead for new code. See https://github.com/rust-lang/rust/issues/33417#issuecomment-277175994 and https://github.com/rust-lang/rust/issues/33417#issuecomment-277253827. Refs #33417.
2017-03-17 | Rollup merge of #40456 - frewsxcv:frewsxcv-docs-function-parens, r=GuillaumeGomez | Corey Farwell | -35/+35
Remove function invocation parens from documentation links. This was never established as a convention we should follow in the 'More API Documentation Conventions' RFC: https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md
2017-03-15 | Rename TryFrom's associated type and implement str::parse using TryFrom. | Jimmy Cuadra | -2/+5
Per discussion on the tracking issue, naming `TryFrom`'s associated type `Error` is generally more consistent with similar traits in the Rust ecosystem, and what people seem to assume it should be called. It also helps disambiguate from `Result::Err`, the most common "Err". See https://github.com/rust-lang/rust/issues/33417#issuecomment-269108968. TryFrom<&str> and FromStr are equivalent, so have the latter provide the former to ensure that. Using TryFrom in the implementation of `str::parse` means types that implement either trait can use it. When we're ready to stabilize `TryFrom`, we should update `FromStr` to suggest implementing `TryFrom<&str>` instead for new code. See https://github.com/rust-lang/rust/issues/33417#issuecomment-277175994 and https://github.com/rust-lang/rust/issues/33417#issuecomment-277253827. Refs #33417.
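To make the naming difference concrete, here is a minimal sketch of one type implementing both traits by hand. `Percent` and `ParsePercentError` are hypothetical names invented for illustration. The sketch assumes current stable Rust, where `str::parse` is bounded by `FromStr`, so it does not rely on the blanket relationship the commit message describes.

```rust
use std::convert::TryFrom;
use std::str::FromStr;

/// Hypothetical example type: a percentage between 0 and 100.
#[derive(Debug, PartialEq)]
pub struct Percent(pub u8);

/// Hypothetical error type for the example.
#[derive(Debug, PartialEq)]
pub struct ParsePercentError;

impl FromStr for Percent {
    // `FromStr` names its associated error type `Err`.
    type Err = ParsePercentError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let digits = s.strip_suffix('%').ok_or(ParsePercentError)?;
        let value: u8 = digits.parse().map_err(|_| ParsePercentError)?;
        if value <= 100 {
            Ok(Percent(value))
        } else {
            Err(ParsePercentError)
        }
    }
}

impl<'a> TryFrom<&'a str> for Percent {
    // `TryFrom` names its associated error type `Error`.
    type Error = ParsePercentError;

    fn try_from(s: &'a str) -> Result<Self, Self::Error> {
        // Delegate to `FromStr` so both conversions always agree.
        s.parse()
    }
}
```

Delegating one implementation to the other keeps the two conversions equivalent, which is the property the commit message is concerned with.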
2017-03-14 | Add tracking issue number for Utf8Error::error_len | Simon Sapin | -1/+1
2017-03-14 | Replace Utf8Error::resume_from with Utf8Error::error_len | Simon Sapin | -10/+12
Their relationship is:
* `resume_from = error_len.map(|l| l + valid_up_to)`
* error_len is always one of None, Some(1), Some(2), or Some(3).
When I started using resume_from I almost always ended up subtracting valid_up_to to obtain error_len. Therefore the latter is what should be provided in the first place.
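A minimal sketch of lossy decoding built on `Utf8Error::valid_up_to` and `Utf8Error::error_len` follows. The function name `decode_lossy` is hypothetical, and the sketch is an illustration of how the two values combine, not the standard library's own implementation.

```rust
use std::str;

/// Hypothetical helper: decodes `input`, replacing each invalid byte
/// sequence with U+FFFD.
pub fn decode_lossy(mut input: &[u8]) -> String {
    let mut out = String::new();
    loop {
        match str::from_utf8(input) {
            Ok(valid) => {
                out.push_str(valid);
                return out;
            }
            Err(err) => {
                // Everything before `valid_up_to()` is known to be valid.
                let (valid, rest) = input.split_at(err.valid_up_to());
                out.push_str(str::from_utf8(valid).expect("prefix was validated"));
                out.push('\u{FFFD}');
                match err.error_len() {
                    // Skip the invalid sequence and continue after it. The
                    // resume position is `valid_up_to + error_len`.
                    Some(len) => input = &rest[len..],
                    // `None` means the input ended in the middle of a character.
                    None => return out,
                }
            }
        }
    }
}
```

For example, the bytes `ab\xFFcd` decode to `ab`, a replacement character, then `cd`.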
2017-03-14 | Add Utf8Error::resume_from, to help incremental and/or lossy decoding. | Simon Sapin | -22/+56
Without this, code outside of the standard library needs to reimplement most of the logic of `from_utf8` to interpret the bytes after `valid_up_to()`.
2017-03-13 | Remove function invocation parens from documentation links. | Corey Farwell | -35/+35
This was never established as a convention we should follow in the 'More API Documentation Conventions' RFC: https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md
2017-03-01 | Only keep one copy of the UTF8_CHAR_WIDTH table. | Simon Sapin | -0/+7
… instead of one of each of libcore and libstd_unicode. Move the `utf8_char_width` function to `core::str` under the `str_internals` unstable feature.
2017-02-16 | Additional docs for Vec, String, and slice trait impls | Matt Brubeck | -0/+14
2017-02-09 | name anonymous fn parameters in libcore traits | Trevor Spiteri | -3/+3
2017-01-12 | Auto merge of #37926 - bluss:from-utf8-small-simplification, r=sfackler | bors | -27/+26
UTF-8 validation: Compute block end upfront Simplify the conditional used for ensuring that the whole word loop is only used if there are at least two whole words left to read. This makes the function slightly smaller and simpler, a 0-5% reduction in runtime for various test cases.
2017-01-03 | Auto merge of #38066 - bluss:string-slice-error, r=sfackler | bors | -4/+22
Use more specific panic message for &str slicing errors
Separate out of bounds errors from character boundary errors, and print more details for character boundary errors. It reports the first error it finds in:
1. begin out of bounds
2. end out of bounds
3. begin <= end violated
4. begin not char boundary
5. end not char boundary.
Example:
&"abcαβγ"[..4]
thread 'str::test_slice_fail_boundary_1' panicked at 'byte index 4 is not a char boundary; it is inside 'α' (bytes 3..5) of `abcαβγ`'
Fixes #38052
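The order of checks listed above can be sketched as a non-panicking function. `SliceError` and `check_slice` are hypothetical names used only for this illustration. The actual change produces panic messages and does not return an error value.

```rust
/// Hypothetical error type mirroring the five cases in the order they are checked.
#[derive(Debug, PartialEq)]
pub enum SliceError {
    BeginOutOfBounds,
    EndOutOfBounds,
    BeginAfterEnd,
    BeginNotCharBoundary,
    EndNotCharBoundary,
}

/// Hypothetical helper: returns `&s[begin..end]`, or the first problem found.
pub fn check_slice(s: &str, begin: usize, end: usize) -> Result<&str, SliceError> {
    if begin > s.len() {
        return Err(SliceError::BeginOutOfBounds);
    }
    if end > s.len() {
        return Err(SliceError::EndOutOfBounds);
    }
    if begin > end {
        return Err(SliceError::BeginAfterEnd);
    }
    if !s.is_char_boundary(begin) {
        return Err(SliceError::BeginNotCharBoundary);
    }
    if !s.is_char_boundary(end) {
        return Err(SliceError::EndNotCharBoundary);
    }
    // All preconditions hold, so this slice cannot panic.
    Ok(&s[begin..end])
}
```

Applied to the example string, `check_slice("abcαβγ", 0, 4)` reports `EndNotCharBoundary`, because byte 4 lies inside `α`.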
2016-12-04 | core: Forward ExactSizeIterator::is_empty for Bytes | Ulrik Sverdrup | -0/+5
2016-11-30 | Use more specific panic message for &str slicing errors | Ulrik Sverdrup | -4/+22
Separate out of bounds errors from character boundary errors, and print more details for character boundary errors. Example: &"abcαβγ"[..4] thread 'str::test_slice_fail_boundary_1' panicked at 'byte index 4 is not a char boundary; it is inside `α` (bytes 3..5) of `abcαβγ`'
2016-11-22 | utf8 validation: Cleanup code by renaming index variable | Ulrik Sverdrup | -15/+15
2016-11-22 | utf8 validation: Cleanup code in the ascii fast path | Ulrik Sverdrup | -6/+4
2016-11-21 | utf8 validation: Compute block end upfront | Ulrik Sverdrup | -15/+16
Simplify the conditional used for ensuring that the whole word loop is only used if there are at least two whole words left to read. This makes the function slightly smaller and simpler, a 0-5% reduction in runtime for various test cases.
2016-11-20 | Auto merge of #37888 - bluss:chars-count, r=alexcrichton | bors | -0/+16
Improve .chars().count()
Use a simpler loop to count the `char` of a string: count the number of non-continuation bytes. Use `count += <conditional>` which the compiler understands well and can apply loop optimizations to.
benchmark descriptions and results for two configurations:
- ascii: ascii text
- cy: cyrillic text
- jp: japanese text
- words ascii: counting each split_whitespace item from the ascii text
- words jp: counting each split_whitespace item from the jp text
```
x86-64 rustc -Copt-level=3

name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter  diff %
count_ascii        1,453 (1755 MB/s)  1,398 (1824 MB/s)  -55           -3.79%
count_cy           5,990 (856 MB/s)   2,545 (2016 MB/s)  -3,445        -57.51%
count_jp           3,075 (1169 MB/s)  1,772 (2029 MB/s)  -1,303        -42.37%
count_words_ascii  4,157 (521 MB/s)   1,797 (1205 MB/s)  -2,360        -56.77%
count_words_jp     3,337 (1071 MB/s)  1,772 (2018 MB/s)  -1,565        -46.90%

x86-64 rustc -Ctarget-feature=+avx -Copt-level=3

name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter  diff %
count_ascii        1,444 (1766 MB/s)  763 (3343 MB/s)    -681          -47.16%
count_cy           5,871 (874 MB/s)   1,527 (3360 MB/s)  -4,344        -73.99%
count_jp           2,874 (1251 MB/s)  1,073 (3351 MB/s)  -1,801        -62.67%
count_words_ascii  4,131 (524 MB/s)   1,871 (1157 MB/s)  -2,260        -54.71%
count_words_jp     3,253 (1099 MB/s)  1,331 (2686 MB/s)  -1,922        -59.08%
```
I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time), but the code in this PR was always winning `count_words_ascii` in particular (counting many small strings); this solution is an improvement without tradeoffs.
2016-11-20 | Optimise CharIndices::last() | Oliver Middleton | -0/+6
The default implementation of last() goes through the entire iterator but that's not needed here.
2016-11-19 | str: Improve .chars().count() | Ulrik Sverdrup | -0/+16
Use a simpler loop to count the `char` of a string: count the number of non-continuation bytes. Use `count += <conditional>` which the compiler understands well and can apply loop optimizations to.
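The counting idea can be sketched as follows. The function name `count_chars` is hypothetical and this is a simplified illustration, not the code from the commit. It relies on the fact that in valid UTF-8 every `char` contributes exactly one byte that is not a continuation byte (a byte of the form `10xxxxxx`).

```rust
/// Hypothetical helper: counts the `char`s in `s` by counting the bytes
/// that are not UTF-8 continuation bytes.
pub fn count_chars(s: &str) -> usize {
    let mut count = 0;
    for &byte in s.as_bytes() {
        // Continuation bytes match the bit pattern 10xxxxxx. The boolean is
        // converted to 0 or 1, giving the branch-free `count += <conditional>`
        // shape described in the commit message.
        count += ((byte & 0xC0) != 0x80) as usize;
    }
    count
}
```

The result should agree with `s.chars().count()` for any valid string.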
2016-11-19 | Optimise Chars::last() | Oliver Middleton | -0/+6
The default implementation of last() goes through the entire iterator but that's not needed here.
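Because `Chars` and `CharIndices` are double-ended iterators, the last element can be reached from the back without walking the whole string. The sketch below shows that idea from user code. The helper names are hypothetical, and whether the commit implements `last()` in exactly this way is an assumption.

```rust
/// Hypothetical helper: the last `char` of `s`, found by decoding from the end.
pub fn last_char(s: &str) -> Option<char> {
    s.chars().next_back()
}

/// Hypothetical helper: the byte offset and value of the last `char` of `s`.
pub fn last_char_index(s: &str) -> Option<(usize, char)> {
    s.char_indices().next_back()
}
```

For `"abcαβγ"`, the last character is `γ`, which starts at byte offset 7.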
2016-09-30 | str: Fix documentation typo | David Henningsson | -1/+1
from_utf8 returns a Result, not an Option. Signed-off-by: David Henningsson <diwic@ubuntu.com>
2016-09-22 | Rollup merge of #36423 - GuillaumeGomez:eq_impl, r=pnkfelix | Jonathan Turner | -1/+1
Add missing Eq implementations Part of #36301.
2016-09-18 | Add missing Eq implementations | Guillaume Gomez | -1/+1
2016-09-11 | Documentation of what does for each type | athulappadan | -0/+1
2016-08-24 | Use `#[prelude_import]` in `libcore`. | Jeffrey Seyfried | -16/+4
2016-08-23 | Rollup merge of #35910 - tbu-:pr_weird_linebreak, r=alexcrichton | Guillaume Gomez | -2/+3
Change a weird line break in `core::str`
2016-08-23 | Change a weird line break in `core::str` | Tobias Bucher | -2/+3
2016-08-18 | Add a FusedIterator trait. | Steven Allen | -1/+24
This trait can be used to avoid the overhead of a fuse wrapper when an iterator is already well-behaved. Conforming to: RFC 1581 Closes: #35602
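A minimal sketch of opting an iterator into `FusedIterator` is shown below. `Countdown` is a hypothetical type invented for this example. It keeps returning `None` once exhausted, which is the behavior the marker trait promises.

```rust
use std::iter::FusedIterator;

/// Hypothetical iterator that yields n, n-1, ..., 1 and then stops for good.
pub struct Countdown(pub u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            // Stays at zero, so every later call also returns `None`.
            None
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}

// Marker impl: declares that `next` keeps returning `None` after the first
// `None`, so a `.fuse()` adapter has no extra bookkeeping to do.
impl FusedIterator for Countdown {}
```

Calling `.fuse()` on such an iterator is still allowed and behaves the same, only without the extra state tracking.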
2016-07-28 | Add documentation example for `str::Chars::as_str`. | Corey Farwell | -0/+13
2016-06-24 | Auto merge of #34425 - tbu-:pr_len_instead_of_size_hint, r=alexcrichton | bors | -9/+8
Use `len` instead of `size_hint` where appropriate This makes it clearer that we're not just looking for a lower bound but rather know that the iterator is an `ExactSizeIterator`.
2016-06-23 | std: Fix up stabilization discrepancies | Alex Crichton | -16/+0
* Remove the deprecated `CharRange` type which was forgotten to be removed awhile back.
* Stabilize the `os::$platform::raw::pthread_t` type which was intended to be stabilized as part of #32804
2016-06-23 | Use `len` instead of `size_hint` where appropriate | Tobias Bucher | -9/+8
This makes it clearer that we're not just looking for a lower bound but rather know that the iterator is an `ExactSizeIterator`.
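The difference between the two calls can be illustrated with string iterators. The helper names below are hypothetical. `Bytes` implements `ExactSizeIterator`, so `len()` is exact, while `Chars` can only offer bounds through `size_hint()`.

```rust
/// Hypothetical helper: exact number of remaining bytes, via `ExactSizeIterator::len`.
pub fn exact_byte_count(s: &str) -> usize {
    s.bytes().len()
}

/// Hypothetical helper: lower and upper bounds on the number of `char`s.
pub fn char_count_bounds(s: &str) -> (usize, Option<usize>) {
    // Only a range is known up front, because each `char` takes 1 to 4 bytes.
    s.chars().size_hint()
}
```

Calling `len()` documents that the count is known exactly, whereas `size_hint()` signals that only an estimate is being used.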
2016-06-01 | Auto merge of #33853 - alexcrichton:remove-deprecated, r=aturon | bors | -129/+1
std: Clean out old unstable + deprecated APIs These should all have been deprecated for at least one cycle, so this commit cleans them all out.
2016-05-30 | std: Clean out old unstable + deprecated APIs | Alex Crichton | -129/+1
These should all have been deprecated for at least one cycle, so this commit cleans them all out.
2016-05-27 | make core::str::next_code_point work on arbitrary iterator | M Farkas-Dyck | -2/+3
2016-04-16 | Auto merge of #32909 - sanxiyn:unused-trait-import-2, r=alexcrichton | bors | -2/+0
Remove unused trait imports
2016-04-12 | Remove unused trait imports | Seo Sanghyeon | -2/+0
2016-04-12 | Auto merge of #32804 - alexcrichton:stabilize-1.9, r=brson | bors | -16/+23
std: Stabilize APIs for the 1.9 release
This commit applies all stabilizations, renamings, and deprecations that the library team has decided on for the upcoming 1.9 release. All tracking issues have gone through a cycle-long "final comment period" and the specific APIs stabilized/deprecated are:
Stable
* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`
Deprecated
* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.
Closes #27719
cc #27751 (deprecating the `Slice` bits)
Closes #27754
Closes #27780
Closes #27809
Closes #27811
Closes #27830
Closes #28050
Closes #29453
Closes #29791
Closes #29935
Closes #30014
Closes #30752
Closes #31262
cc #31398 (still need to deal with `before_exec`)
Closes #31405
Closes #31572
Closes #31755
Closes #31756
2016-04-11 | std: Stabilize APIs for the 1.9 release | Alex Crichton | -16/+23
This commit applies all stabilizations, renamings, and deprecations that the library team has decided on for the upcoming 1.9 release. All tracking issues have gone through a cycle-long "final comment period" and the specific APIs stabilized/deprecated are:
Stable
* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`
Deprecated
* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.
Closes #27719
cc #27751 (deprecating the `Slice` bits)
Closes #27754
Closes #27780
Closes #27809
Closes #27811
Closes #27830
Closes #28050
Closes #29453
Closes #29791
Closes #29935
Closes #30014
Closes #30752
Closes #31262
cc #31398 (still need to deal with `before_exec`)
Closes #31405
Closes #31572
Closes #31755
Closes #31756
2016-04-09 | Bit-magic for faster is_char_boundary | Raph Levien | -1/+2
The asm generated for b < 128 || b >= 192 is not ideal, as it computes both sub-inequalities. This patch replaces it with bit magic. Fixes #32471
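One way to express the same test with a single comparison is to reinterpret the byte as a signed integer. The sketch below is an illustration under that assumption, with hypothetical function names. The commit message does not spell out the exact expression used in the patch.

```rust
/// Straightforward form: a byte starts a character unless it is a
/// continuation byte in the range 128..192.
pub fn is_boundary_byte_naive(b: u8) -> bool {
    b < 128 || b >= 192
}

/// Single-comparison form: viewed as `i8`, the continuation bytes
/// 0x80..=0xBF map to -128..=-65, so every other byte is >= -64.
pub fn is_boundary_byte_bits(b: u8) -> bool {
    (b as i8) >= -0x40
}
```

The two functions agree on all 256 byte values, which can be checked exhaustively.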
2016-04-05 | Specialize equality for [T] and comparison for [u8] | Ulrik Sverdrup | -22/+3
Where T is a type that can be compared for equality bytewise, we can use memcmp. We can also use memcmp for PartialOrd, Ord for [u8] and by extension &str. This is an improvement for example for the comparison [u8] == [u8] that used to emit a loop that compared the slices byte by byte. One worry here could be that this introduces function calls to memcmp in contexts where it should really inline the comparison or even optimize it out, but llvm takes care of recognizing memcmp specifically.
2016-03-26 | Rollup merge of #32456 - bluss:str-zero, r=alexcrichton | Manish Goregaokar | -1/+5
Hardcode accepting 0 as a valid str char boundary If we check explicitly for index == 0, that removes the need to read the byte at index 0, so it avoids a trip to the string's memory, and it optimizes out the slicing index' bounds check whenever it is (a constant) zero.
2016-03-24 | Accept 0 as a valid str char boundary | Ulrik Sverdrup | -1/+4
Index 0 must be a valid char boundary (invariant of str that it contains valid UTF-8 data). If we check explicitly for index == 0, that removes the need to read the byte at index 0, so it avoids a trip to the string's memory, and it optimizes out the slicing index' bounds check whenever it is zero. With this change, the following examples all change from having a read of the byte at 0 and a branch to possibly panicking, to having the bounds checking optimized away.
```rust
pub fn split(s: &str) -> (&str, &str) {
    s.split_at(0)
}

pub fn both(s: &str) -> &str {
    &s[0..s.len()]
}

pub fn first(s: &str) -> &str {
    &s[..0]
}

pub fn last(s: &str) -> &str {
    &s[0..]
}
```