summary refs log tree commit diff
path: root/src/libcore/str
AgeCommit message (Collapse)AuthorLines
2016-01-16std: Stabilize APIs for the 1.7 releaseAlex Crichton-7/+7
This commit stabilizes and deprecates the FCP (final comment period) APIs for the upcoming 1.7 beta release. The specific APIs which changed were: Stabilized * `Path::strip_prefix` (renamed from `relative_from`) * `path::StripPrefixError` (new error type returned from `strip_prefix`) * `Ipv4Addr::is_loopback` * `Ipv4Addr::is_private` * `Ipv4Addr::is_link_local` * `Ipv4Addr::is_multicast` * `Ipv4Addr::is_broadcast` * `Ipv4Addr::is_documentation` * `Ipv6Addr::is_unspecified` * `Ipv6Addr::is_loopback` * `Ipv6Addr::is_unique_local` * `Ipv6Addr::is_multicast` * `Vec::as_slice` * `Vec::as_mut_slice` * `String::as_str` * `String::as_mut_str` * `<[T]>::clone_from_slice` - the `usize` return value is removed * `<[T]>::sort_by_key` * `i32::checked_rem` (and other signed types) * `i32::checked_neg` (and other signed types) * `i32::checked_shl` (and other signed types) * `i32::checked_shr` (and other signed types) * `i32::saturating_mul` (and other signed types) * `i32::overflowing_add` (and other signed types) * `i32::overflowing_sub` (and other signed types) * `i32::overflowing_mul` (and other signed types) * `i32::overflowing_div` (and other signed types) * `i32::overflowing_rem` (and other signed types) * `i32::overflowing_neg` (and other signed types) * `i32::overflowing_shl` (and other signed types) * `i32::overflowing_shr` (and other signed types) * `u32::checked_rem` (and other unsigned types) * `u32::checked_neg` (and other unsigned types) * `u32::checked_shl` (and other unsigned types) * `u32::saturating_mul` (and other unsigned types) * `u32::overflowing_add` (and other unsigned types) * `u32::overflowing_sub` (and other unsigned types) * `u32::overflowing_mul` (and other unsigned types) * `u32::overflowing_div` (and other unsigned types) * `u32::overflowing_rem` (and other unsigned types) * `u32::overflowing_neg` (and other unsigned types) * `u32::overflowing_shl` (and other unsigned types) * `u32::overflowing_shr` (and other unsigned types) * `ffi::IntoStringError` * `CString::into_string` * `CString::into_bytes` * `CString::into_bytes_with_nul` * `From<CString> for Vec<u8>` * `From<CString> for Vec<u8>` * `IntoStringError::into_cstring` * `IntoStringError::utf8_error` * `Error for IntoStringError` Deprecated * `Path::relative_from` - renamed to `strip_prefix` * `Path::prefix` - use `components().next()` instead * `os::unix::fs` constants - moved to the `libc` crate * `fmt::{radix, Radix, RadixFmt}` - not used enough to stabilize * `IntoCow` - conflicts with `Into` and may come back later * `i32::{BITS, BYTES}` (and other integers) - not pulling their weight * `DebugTuple::formatter` - will be removed * `sync::Semaphore` - not used enough and confused with system semaphores Closes #23284 cc #27709 (still lots more methods though) Closes #27712 Closes #27722 Closes #27728 Closes #27735 Closes #27729 Closes #27755 Closes #27782 Closes #27798
2016-01-14UTF-8 validation: Add missing if conditional for short inputUlrik Sverdrup-13/+16
We need to guard that `len` is large enough for the fast skip loop.
2016-01-12Add fast path for ASCII in UTF-8 validationUlrik Sverdrup-27/+57
This speeds up the ascii case (and long stretches of ascii in otherwise mixed UTF-8 data) when checking UTF-8 validity. Benchmark results suggest that on purely ASCII input, we can improve throughput (megabytes verified / second) by a factor of 13 to 14! On xml and mostly english language input (en.wikipedia xml dump), throughput increases by a factor 7. On mostly non-ASCII input, performance increases slightly or is the same. The UTF-8 validation is rewritten to use indexed access; since all access is preceded by a (mandatory for validation) length check, they are statically elided by llvm and this formulation is in fact the best for performance. A previous version had losses due to slice to iterator conversions. A large credit to Björn Steinbrink who improved this patch immensely, writing this second version. Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3. Old code is `regular`, this PR is called `fast`. Datasets: - `ascii` is just ascii (2.5 kB) - `cyr` is cyrillic script with ascii spaces (5 kB) - `dewik10` is 10MB of a de.wikipedia xml dump - `enwik10` is 100MB of an en.wikipedia xml dump - `jawik10` is 10MB of a ja.wikipedia xml dump ``` test from_utf8_ascii_fast ... bench: 140 ns/iter (+/- 4) = 18221 MB/s test from_utf8_ascii_regular ... bench: 1,932 ns/iter (+/- 19) = 1320 MB/s test from_utf8_cyr_fast ... bench: 10,025 ns/iter (+/- 245) = 511 MB/s test from_utf8_cyr_regular ... bench: 12,250 ns/iter (+/- 437) = 418 MB/s test from_utf8_dewik10_fast ... bench: 6,017,909 ns/iter (+/- 105,755) = 1740 MB/s test from_utf8_dewik10_regular ... bench: 11,669,493 ns/iter (+/- 264,045) = 891 MB/s test from_utf8_enwik8_fast ... bench: 14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s test from_utf8_enwik8_regular ... bench: 93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s test from_utf8_jawik10_fast ... bench: 29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s test from_utf8_jawik10_regular ... bench: 29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s ``` Co-authored-by: Björn Steinbrink <bsteinbr@gmail.com>
2015-12-11Add missing links to str docsOri Avtalion-14/+42
2015-12-05std: Stabilize APIs for the 1.6 releaseAlex Crichton-49/+67
This commit is the standard API stabilization commit for the 1.6 release cycle. The list of issues and APIs below have all been through their cycle-long FCP and the libs team decisions are listed below Stabilized APIs * `Read::read_exact` * `ErrorKind::UnexpectedEof` (renamed from `UnexpectedEOF`) * libcore -- this was a bit of a nuanced stabilization, the crate itself is now marked as `#[stable]` and the methods appearing via traits for primitives like `char` and `str` are now also marked as stable. Note that the extension traits themeselves are marked as unstable as they're imported via the prelude. The `try!` macro was also moved from the standard library into libcore to have the same interface. Otherwise the functions all have copied stability from the standard library now. * The `#![no_std]` attribute * `fs::DirBuilder` * `fs::DirBuilder::new` * `fs::DirBuilder::recursive` * `fs::DirBuilder::create` * `os::unix::fs::DirBuilderExt` * `os::unix::fs::DirBuilderExt::mode` * `vec::Drain` * `vec::Vec::drain` * `string::Drain` * `string::String::drain` * `vec_deque::Drain` * `vec_deque::VecDeque::drain` * `collections::hash_map::Drain` * `collections::hash_map::HashMap::drain` * `collections::hash_set::Drain` * `collections::hash_set::HashSet::drain` * `collections::binary_heap::Drain` * `collections::binary_heap::BinaryHeap::drain` * `Vec::extend_from_slice` (renamed from `push_all`) * `Mutex::get_mut` * `Mutex::into_inner` * `RwLock::get_mut` * `RwLock::into_inner` * `Iterator::min_by_key` (renamed from `min_by`) * `Iterator::max_by_key` (renamed from `max_by`) Deprecated APIs * `ErrorKind::UnexpectedEOF` (renamed to `UnexpectedEof`) * `OsString::from_bytes` * `OsStr::to_cstring` * `OsStr::to_bytes` * `fs::walk_dir` and `fs::WalkDir` * `path::Components::peek` * `slice::bytes::MutableByteVector` * `slice::bytes::copy_memory` * `Vec::push_all` (renamed to `extend_from_slice`) * `Duration::span` * `IpAddr` * `SocketAddr::ip` * `Read::tee` * `io::Tee` * `Write::broadcast` * `io::Broadcast` * `Iterator::min_by` (renamed to `min_by_key`) * `Iterator::max_by` (renamed to `max_by_key`) * `net::lookup_addr` New APIs (still unstable) * `<[T]>::sort_by_key` (added to mirror `min_by_key`) Closes #27585 Closes #27704 Closes #27707 Closes #27710 Closes #27711 Closes #27727 Closes #27740 Closes #27744 Closes #27799 Closes #27801 cc #27801 (doesn't close as `Chars` is still unstable) Closes #28968
2015-11-23Auto merge of #29952 - petrochenkov:depr, r=brsonbors-1/+1
Part of https://github.com/rust-lang/rust/issues/29935 The deprecation lint is still called "deprecated", so people can continue using `#[allow(deprecated)]` and similar things.
2015-11-23Mark slice_error_fail as a cold pathManish Goregaokar-0/+1
2015-11-20Rename #[deprecated] to #[rustc_deprecated]Vadim Petrochenkov-1/+1
2015-11-20Add information about str::parse() in FromStr docsDevon Hollowood-0/+7
2015-11-18Add missing annotations and some testsVadim Petrochenkov-0/+3
2015-11-12libcore: deny warnings in doctestsKevin Butler-2/+0
2015-11-06Remove stability annotations from trait impl itemsVadim Petrochenkov-1/+0
Remove `stable` stability annotations from inherent impls
2015-10-25Auto merge of #29254 - alexcrichton:stabilize-1.5, r=brsonbors-5/+2
This commit stabilizes and deprecates library APIs whose FCP has closed in the last cycle, specifically: Stabilized APIs: * `fs::canonicalize` * `Path::{metadata, symlink_metadata, canonicalize, read_link, read_dir, exists, is_file, is_dir}` - all moved to inherent methods from the `PathExt` trait. * `Formatter::fill` * `Formatter::width` * `Formatter::precision` * `Formatter::sign_plus` * `Formatter::sign_minus` * `Formatter::alternate` * `Formatter::sign_aware_zero_pad` * `string::ParseError` * `Utf8Error::valid_up_to` * `Iterator::{cmp, partial_cmp, eq, ne, lt, le, gt, ge}` * `<[T]>::split_{first,last}{,_mut}` * `Condvar::wait_timeout` - note that `wait_timeout_ms` is not yet deprecated but will be once 1.5 is released. * `str::{R,}MatchIndices` * `str::{r,}match_indices` * `char::from_u32_unchecked` * `VecDeque::insert` * `VecDeque::shrink_to_fit` * `VecDeque::as_slices` * `VecDeque::as_mut_slices` * `VecDeque::swap_remove_front` - (renamed from `swap_front_remove`) * `VecDeque::swap_remove_back` - (renamed from `swap_back_remove`) * `Vec::resize` * `str::slice_mut_unchecked` * `FileTypeExt` * `FileTypeExt::{is_block_device, is_char_device, is_fifo, is_socket}` * `BinaryHeap::from` - `from_vec` deprecated in favor of this * `BinaryHeap::into_vec` - plus a `Into` impl * `BinaryHeap::into_sorted_vec` Deprecated APIs * `slice::ref_slice` * `slice::mut_ref_slice` * `iter::{range_inclusive, RangeInclusive}` * `std::dynamic_lib` Closes #27706 Closes #27725 cc #27726 (align not stabilized yet) Closes #27734 Closes #27737 Closes #27742 Closes #27743 Closes #27772 Closes #27774 Closes #27777 Closes #27781 cc #27788 (a few remaining methods though) Closes #27790 Closes #27793 Closes #27796 Closes #27810 cc #28147 (not all parts stabilized)
2015-10-25std: Stabilize library APIs for 1.5Alex Crichton-5/+2
This commit stabilizes and deprecates library APIs whose FCP has closed in the last cycle, specifically: Stabilized APIs: * `fs::canonicalize` * `Path::{metadata, symlink_metadata, canonicalize, read_link, read_dir, exists, is_file, is_dir}` - all moved to inherent methods from the `PathExt` trait. * `Formatter::fill` * `Formatter::width` * `Formatter::precision` * `Formatter::sign_plus` * `Formatter::sign_minus` * `Formatter::alternate` * `Formatter::sign_aware_zero_pad` * `string::ParseError` * `Utf8Error::valid_up_to` * `Iterator::{cmp, partial_cmp, eq, ne, lt, le, gt, ge}` * `<[T]>::split_{first,last}{,_mut}` * `Condvar::wait_timeout` - note that `wait_timeout_ms` is not yet deprecated but will be once 1.5 is released. * `str::{R,}MatchIndices` * `str::{r,}match_indices` * `char::from_u32_unchecked` * `VecDeque::insert` * `VecDeque::shrink_to_fit` * `VecDeque::as_slices` * `VecDeque::as_mut_slices` * `VecDeque::swap_remove_front` - (renamed from `swap_front_remove`) * `VecDeque::swap_remove_back` - (renamed from `swap_back_remove`) * `Vec::resize` * `str::slice_mut_unchecked` * `FileTypeExt` * `FileTypeExt::{is_block_device, is_char_device, is_fifo, is_socket}` * `BinaryHeap::from` - `from_vec` deprecated in favor of this * `BinaryHeap::into_vec` - plus a `Into` impl * `BinaryHeap::into_sorted_vec` Deprecated APIs * `slice::ref_slice` * `slice::mut_ref_slice` * `iter::{range_inclusive, RangeInclusive}` * `std::dynamic_lib` Closes #27706 Closes #27725 cc #27726 (align not stabilized yet) Closes #27734 Closes #27737 Closes #27742 Closes #27743 Closes #27772 Closes #27774 Closes #27777 Closes #27781 cc #27788 (a few remaining methods though) Closes #27790 Closes #27793 Closes #27796 Closes #27810 cc #28147 (not all parts stabilized)
2015-10-23Unsafety -> Safety in doc headingsSteve Klabnik-1/+1
Follow https://doc.rust-lang.org/book/documentation.html#special-sections
2015-10-13Correct spelling in docsAndrew Paseltiner-1/+1
2015-10-10Trivial typo fix: from_utrf8 should be from_utf8Andrew Chin-1/+1
2015-10-08typos: fix a grabbag of typos all over the placeCristi Cobzarenco-1/+1
2015-10-02Improve documentation for the from_utf8 familySteve Klabnik-5/+119
Our docs were very basic for the various versions of from_utf8, so this commit beefs them up. It also improves docs for the &str variant's error, Utf8Error.
2015-09-30Add some docs for FromString::from_strSteve Klabnik-0/+15
@marchelzo pointed out on IRC that this doesn't have docs, so, let's change that.
2015-09-26Auto merge of #28632 - alexcrichton:update-match-indices, r=Kimundibors-5/+9
This commit updates the `MatchIndices` and `RMatchIndices` iterators to follow the same pattern as the `chars` and `char_indices` iterators. The `matches` iterator currently yield `&str` elements, so the `MatchIndices` iterator now yields the index of the match as well as the `&str` that matched (instead of start/end indexes). cc #27743
2015-09-25std: Update MatchIndices to return a subsliceAlex Crichton-5/+9
This commit updates the `MatchIndices` and `RMatchIndices` iterators to follow the same pattern as the `chars` and `char_indices` iterators. The `matches` iterator currently yield `&str` elements, so the `MatchIndices` iterator now yields the index of the match as well as the `&str` that matched (instead of start/end indexes). cc #27743
2015-09-24Utf8Error::valid_up_to: make documented semantics more precise/usefulSimon Sapin-2/+2
2015-09-13Auto merge of #28339 - alexcrichton:stabilize-1.4, r=aturonbors-2/+2
The FCP is coming to a close and 1.4 is coming out soon, so this brings in the libs team decision for all library features this cycle. Stabilized APIs: * `<Box<str>>::into_string` * `Arc::downgrade` * `Arc::get_mut` * `Arc::make_mut` * `Arc::try_unwrap` * `Box::from_raw` * `Box::into_raw` * `CStr::to_str` * `CStr::to_string_lossy` * `CString::from_raw` * `CString::into_raw` * `IntoRawFd::into_raw_fd` * `IntoRawFd` * `IntoRawHandle::into_raw_handle` * `IntoRawHandle` * `IntoRawSocket::into_raw_socket` * `IntoRawSocket` * `Rc::downgrade` * `Rc::get_mut` * `Rc::make_mut` * `Rc::try_unwrap` * `Result::expect` * `String::into_boxed_slice` * `TcpSocket::read_timeout` * `TcpSocket::set_read_timeout` * `TcpSocket::set_write_timeout` * `TcpSocket::write_timeout` * `UdpSocket::read_timeout` * `UdpSocket::set_read_timeout` * `UdpSocket::set_write_timeout` * `UdpSocket::write_timeout` * `Vec::append` * `Vec::split_off` * `VecDeque::append` * `VecDeque::retain` * `VecDeque::split_off` * `rc::Weak::upgrade` * `rc::Weak` * `slice::Iter::as_slice` * `slice::IterMut::into_slice` * `str::CharIndices::as_str` * `str::Chars::as_str` * `str::split_at_mut` * `str::split_at` * `sync::Weak::upgrade` * `sync::Weak` * `thread::park_timeout` * `thread::sleep` Deprecated APIs * `BTreeMap::with_b` * `BTreeSet::with_b` * `Option::as_mut_slice` * `Option::as_slice` * `Result::as_mut_slice` * `Result::as_slice` * `f32::from_str_radix` * `f64::from_str_radix` Closes #27277 Closes #27718 Closes #27736 Closes #27764 Closes #27765 Closes #27766 Closes #27767 Closes #27768 Closes #27769 Closes #27771 Closes #27773 Closes #27775 Closes #27776 Closes #27785 Closes #27792 Closes #27795 Closes #27797
2015-09-11std: Stabilize/deprecate features for 1.4Alex Crichton-2/+2
The FCP is coming to a close and 1.4 is coming out soon, so this brings in the libs team decision for all library features this cycle. Stabilized APIs: * `<Box<str>>::into_string` * `Arc::downgrade` * `Arc::get_mut` * `Arc::make_mut` * `Arc::try_unwrap` * `Box::from_raw` * `Box::into_raw` * `CStr::to_str` * `CStr::to_string_lossy` * `CString::from_raw` * `CString::into_raw` * `IntoRawFd::into_raw_fd` * `IntoRawFd` * `IntoRawHandle::into_raw_handle` * `IntoRawHandle` * `IntoRawSocket::into_raw_socket` * `IntoRawSocket` * `Rc::downgrade` * `Rc::get_mut` * `Rc::make_mut` * `Rc::try_unwrap` * `Result::expect` * `String::into_boxed_slice` * `TcpSocket::read_timeout` * `TcpSocket::set_read_timeout` * `TcpSocket::set_write_timeout` * `TcpSocket::write_timeout` * `UdpSocket::read_timeout` * `UdpSocket::set_read_timeout` * `UdpSocket::set_write_timeout` * `UdpSocket::write_timeout` * `Vec::append` * `Vec::split_off` * `VecDeque::append` * `VecDeque::retain` * `VecDeque::split_off` * `rc::Weak::upgrade` * `rc::Weak` * `slice::Iter::as_slice` * `slice::IterMut::into_slice` * `str::CharIndices::as_str` * `str::Chars::as_str` * `str::split_at_mut` * `str::split_at` * `sync::Weak::upgrade` * `sync::Weak` * `thread::park_timeout` * `thread::sleep` Deprecated APIs * `BTreeMap::with_b` * `BTreeSet::with_b` * `Option::as_mut_slice` * `Option::as_slice` * `Result::as_mut_slice` * `Result::as_slice` * `f32::from_str_radix` * `f64::from_str_radix` Closes #27277 Closes #27718 Closes #27736 Closes #27764 Closes #27765 Closes #27766 Closes #27767 Closes #27768 Closes #27769 Closes #27771 Closes #27773 Closes #27775 Closes #27776 Closes #27785 Closes #27792 Closes #27795 Closes #27797
2015-09-10Optimize string comparison by using memcmpErick Tryzelaar-15/+20
llvm seems to be having some trouble optimizing the iterator-based string comparsion method into some equivalent to memcmp. This explicitly calls out to the memcmp intrinisic in order to allow llvm to generate better code. In some manual benchmarking, this memcmp-based approach is 20 times faster than the iterator approach.
2015-09-08some more clippy-based improvementsAndre Bogus-3/+3
2015-09-04Auto merge of #28119 - nagisa:bytesderef, r=alexcrichtonbors-30/+3
2015-09-03std: Account for CRLF in {str, BufRead}::linesAlex Crichton-4/+10
This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of the `str::lines` and `BufRead::lines` iterators. Both iterators now account for `\r\n` sequences in addition to `\n`, allowing for less surprising behavior across platforms (especially in the `BufRead` case). Splitting *only* on the `\n` character can still be achieved with `split('\n')` in both cases. The `str::lines_any` function is also now deprecated as `str::lines` is a drop-in replacement for it. [rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.md Closes #28032
2015-09-03Elide lifetimes in libcoreManish Goregaokar-11/+11
2015-08-31Change explicit BytesDeref impl into Cloned iteratorSimonas Kazlauskas-30/+3
2015-08-31Auto merge of #28101 - ijks:24214-str-bytes, r=alexcrichtonbors-0/+15
Specifically, `count`, `last`, and `nth` are implemented to use the methods of the underlying slice iterator. Partially closes #24214.
2015-08-30Add overrides to iterator methods for `str::Bytes`Daan Rijks-0/+15
Specifically, `count`, `last`, and `nth` are implemented to use the methods of the underlying slice iterator. Partially closes #24214.
2015-08-28Add .as_str() to str::Chars and str::CharIndices. See #27775.Simon Sapin-0/+24
2015-08-18Auto merge of #27474 - bluss:twoway-reverse, r=brsonbors-64/+193
StrSearcher: Implement the complete reverse case for the two way algorithm Fix quadratic behavior in StrSearcher in reverse search with periodic needles. This commit adds the missing pieces for the "short period" case in reverse search. The short case will show up when the needle is literally periodic, for example "abababab". Two way uses a "critical factorization" of the needle: x = u v. Searching matches v first, if mismatch at character k, skip k forward. Matching u, if mismatch, skip period(x) forward. To avoid O(mn) behavior after mismatch in u, memorize the already matched prefix. The short period case requires that |u| < period(x). For the reverse search we need to compute a different critical factorization x = u' v' where |v'| < period(x), because we are searching for the reversed needle. A short v' also benefits the algorithm in general. The reverse critical factorization is computed quickly by using the same maximal suffix algorithm, but terminating as soon as we have a location with local period equal to period(x). This adds extra fields crit_pos_back and memory_back for the reverse case. The new overhead for TwoWaySearcher::new is low, and additionally I think the "short period" case is uncommon in many applications of string search. The maximal_suffix methods were updated in documentation and the algorithms updated to not use !0 and wrapping add, variable left is now 1 larger, offset 1 smaller. Use periodicity when computing byteset: in the periodic case, just iterate over one period instead of the whole needle. Example before (rfind) after (twoway_rfind) benchmark shows the removal of quadratic behavior. needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100 ``` test periodic::rfind ... bench: 1,926,595 ns/iter (+/- 11,390) = 10 MB/s test periodic::twoway_rfind ... bench: 51,740 ns/iter (+/- 66) = 386 MB/s ```
2015-08-16StrSearcher: Additional comments and small code movesUlrik Sverdrup-19/+27
Break out a separate static method to create the "byteset".
2015-08-15core: Fill out issues for unstable featuresAlex Crichton-6/+11
2015-08-12Fallout in libs -- misc missing bounds uncovered by WF checks.Niko Matsakis-1/+2
2015-08-09Make `str::as_bytes_mut` privateTobias Bucher-6/+0
2015-08-09Replace many uses of `mem::transmute` with more specific functionsTobias Bucher-8/+14
The replacements are functions that usually use a single `mem::transmute` in their body and restrict input and output via more concrete types than `T` and `U`. Worth noting are the `transmute` functions for slices and the `from_utf8*` family for mutable slices. Additionally, `mem::transmute` was often used for casting raw pointers, when you can already cast raw pointers just fine with `as`.
2015-08-07StrSearcher: Improve inner loop in TwoWaySearcher::next, next_backUlrik Sverdrup-10/+22
The innermost loop of TwoWaySearcher checks the boundary of the haystack vs position + needle.len(), and it checks the last byte of the needle against the byteset. If these two steps are combined by using the indexing of the last needle byte's position as bounds check, the algorithm improves its throughput. We improve the innermost loop by reducing the number of instructions used, and elminating the panic case for the checked indexing that was previously used. Selected benchmarks from the external/workspace testsuite. Benchmarks improve across the board. ``` before: test bb_in_aa::twoway_find ... bench: 4,229 ns/iter (+/- 1,305) = 23646 MB/s test bb_in_aa::twoway_rfind ... bench: 3,873 ns/iter (+/- 101) = 25819 MB/s test short_1let_long::twoway_find ... bench: 7,075 ns/iter (+/- 29) = 360 MB/s test short_1let_long::twoway_rfind ... bench: 6,640 ns/iter (+/- 79) = 384 MB/s test short_2let_long::twoway_find ... bench: 3,823 ns/iter (+/- 16) = 667 MB/s test short_2let_long::twoway_rfind ... bench: 3,774 ns/iter (+/- 44) = 675 MB/s test short_3let_long::twoway_find ... bench: 3,582 ns/iter (+/- 47) = 712 MB/s test short_3let_long::twoway_rfind ... bench: 3,616 ns/iter (+/- 34) = 705 MB/s with this commit: test bb_in_aa::twoway_find ... bench: 2,952 ns/iter (+/- 20) = 33875 MB/s test bb_in_aa::twoway_rfind ... bench: 2,939 ns/iter (+/- 99) = 34025 MB/s test short_1let_long::twoway_find ... bench: 4,593 ns/iter (+/- 4) = 555 MB/s test short_1let_long::twoway_rfind ... bench: 4,592 ns/iter (+/- 76) = 555 MB/s test short_2let_long::twoway_find ... bench: 2,804 ns/iter (+/- 3) = 909 MB/s test short_2let_long::twoway_rfind ... bench: 2,807 ns/iter (+/- 40) = 908 MB/s test short_3let_long::twoway_find ... bench: 3,105 ns/iter (+/- 120) = 821 MB/s test short_3let_long::twoway_rfind ... bench: 3,019 ns/iter (+/- 50) = 844 MB/s ``` - `bb_in_aa`: fast skip due to byteset filter loop improves. - 1/2/3let: Searches for 1, 2, or 3 ascii bytes improves.
2015-08-03syntax: Implement #![no_core]Alex Crichton-1/+2
This commit is an implementation of [RFC 1184][rfc] which tweaks the behavior of the `#![no_std]` attribute and adds a new `#![no_core]` attribute. The `#![no_std]` attribute now injects `extern crate core` at the top of the crate as well as the libcore prelude into all modules (in the same manner as the standard library's prelude). The `#![no_core]` attribute disables both std and core injection. [rfc]: https://github.com/rust-lang/rfcs/pull/1184
2015-08-02StrSearcher: Implement the full two way algorithm in reverse for rfindUlrik Sverdrup-49/+158
Fix quadratic behavior in StrSearcher in reverse search with periodic needles. This commit adds the missing pieces for the "short period" case in reverse search. The short case will show up when the needle is literally periodic, for example "abababab". Two way uses a "critical factorization" of the needle: x = u v. Searching matches v first, if mismatch at character k, skip k forward. Matching u, if mismatch, skip period(x) forward. To avoid O(mn) behavior after mismatch in u, memorize the already matched prefix. The short period case requires that |u| < period(x). For the reverse search we need to compute a different critical factorization x = u' v' where |v'| < period(x), because we are searching for the reversed needle. A short v' also benefits the algorithm in general. The reverse critical factorization is computed quickly by using the same maximal suffix algorithm, but terminating as soon as we have a location with local period equal to period(x). This adds extra fields crit_pos_back and memory_back for the reverse case. The new overhead for TwoWaySearcher::new is low, and additionally I think the "short period" case is uncommon in many applications of string search. The maximal_suffix methods were updated in documentation and the algorithms updated to not use !0 and wrapping add, variable left is now 1 larger, offset 1 smaller. Use periodicity when computing byteset: in the periodic case, just iterate over one period instead of the whole needle. Example before (rfind) after (twoway_rfind) benchmark shows the removal of quadratic behavior. needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100 ``` test periodic::rfind ... bench: 1,926,595 ns/iter (+/- 11,390) = 10 MB/s test periodic::twoway_rfind ... bench: 51,740 ns/iter (+/- 66) = 386 MB/s ```
2015-07-29std: Remove the curious inner moduleAlex Crichton-1/+1
This isn't actually necessary any more with the advent of `$crate` and changes in the compiler to expand macros to `::core::$foo` in the context of a `#![no_std]` crate. The libcore inner module was also trimmed down a bit to the bare bones.
2015-07-21Inline eq_slice_() into eq_slice()Björn Steinbrink-12/+3
eq_slice_() used to be a common implementation for two function that both called it, but of those only eq_slice() is left, so we can as well directly inline the code.
2015-07-21Auto merge of #27168 - brson:stdprim, r=steveklabnikbors-1/+0
This makes the primitive descriptions on the front page read properly as descriptions of types and not of the associated modules. Having the primitive and module docs derived from the same source causes problems, primarily that they can't contain hyperlinks cross-referencing each other. This crates dedicated private modules in `std` to document the primitive types, then for all primitives that have a corresponding module, puts hyperlinks in moth the primitive docs and the module docs cross-linking each other. This should help clear up confusion when readers find themselves on the wrong page. This also removes all the duplicate `#[doc(primitive)]` tags in various places (especially core), so the core docs will no longer attempt to document the primitives for now. Seems like an acceptable tradeoff to get some cleanup for std.
2015-07-20std: Create separate docs for the primitivesBrian Anderson-1/+0
Having the primitive and module docs derived from the same source causes problems, primarily that they can't contain hyperlinks cross-referencing each other. This crates dedicated private modules in `std` to document the primitive types, then for all primitives that have a corresponding module, puts hyperlinks in moth the primitive docs and the module docs cross-linking each other. This should help clear up confusion when readers find themselves on the wrong page.
2015-07-18Fix doc comment parsing in macros.Lee Jeffery-10/+10
2015-07-13Add str::split_at_mutSimon Sapin-0/+15
2015-07-13Implement IndexMut for String and str.Simon Sapin-0/+64
... matching the existing Index impls. There is no reason not to if String implement DerefMut. The code removed in `src/librustc/middle/effect.rs` was added in #9750 to prevent things like `s[0] = 0x80` where `s: String`, but I belive became unnecessary when the Index(Mut) traits were introduced.