about summary refs log tree commit diff
path: root/src/libcore/str.rs
AgeCommit message (Collapse)AuthorLines
2014-10-07Rename slicing methodsNick Cameron-0/+23
2014-10-07Use slice syntax instead of slice_to, etc.Nick Cameron-8/+8
2014-10-02Revert "Use slice syntax instead of slice_to, etc."Aaron Turon-8/+8
This reverts commit 40b9f5ded50ac4ce8c9323921ec556ad611af6b7.
2014-10-02Revert "Review and rebasing changes"Aaron Turon-61/+23
This reverts commit 6e0611a48707a1f5d90aee32a02b2b15957ef25b.
2014-10-02Review and rebasing changesNick Cameron-23/+61
2014-10-02Use slice syntax instead of slice_to, etc.Nick Cameron-8/+8
2014-09-30librustc: Forbid `..` in range patterns.Patrick Walton-7/+7
This breaks code that looks like: match foo { 1..3 => { ... } } Instead, write: match foo { 1...3 => { ... } } Closes #17295. [breaking-change]
2014-09-26Implement Slice for String and strSteven Fackler-0/+23
Closes #17502
2014-09-17rollup merge of #16936 : nham/two_way_makeoverAlex Crichton-10/+90
2014-09-12Document why `core::str::Searcher::new` doesn't overflowTobias Bucher-0/+3
2014-09-03Fix spelling errors and capitalization.Joseph Crail-4/+4
2014-09-02Add many comments to TwoWaySearcher.nham-10/+87
2014-09-02core: Make TwoWaySearcher reset its prefix memory when shifting by bytesetnham-0/+3
Closes #16878.
2014-08-30Unify non-snake-case lints and non-uppercase statics lintsP1start-27/+27
This unifies the `non_snake_case_functions` and `uppercase_variables` lints into one lint, `non_snake_case`. It also now checks for non-snake-case modules. This also extends the non-camel-case types lint to check type parameters, and merges the `non_uppercase_pattern_statics` lint into the `non_uppercase_statics` lint. Because the `uppercase_variables` lint is now part of the `non_snake_case` lint, all non-snake-case variables that start with lowercase characters (such as `fooBar`) will now trigger the `non_snake_case` lint. New code should be updated to use the new `non_snake_case` lint instead of the previous `non_snake_case_functions` and `uppercase_variables` lints. All use of the `non_uppercase_pattern_statics` should be replaced with the `non_uppercase_statics` lint. Any code that previously contained non-snake-case module or variable names should be updated to use snake case names or disable the `non_snake_case` lint. Any code with non-camel-case type parameters should be changed to use camel case or disable the `non_camel_case_types` lint. [breaking-change]
2014-08-26Use temp vars for implicit coercion to ^[T]Nick Cameron-7/+12
2014-08-24doc: fix some typos in the GuideTshepang Lekhonkhobe-2/+2
2014-08-24auto merge of #16698 : bluss/rust/slice-bloat, r=huonwbors-11/+31
These are somewhat stop-gap solutions to address #16625 core: Separate failure formatting in str methods slice, slice_to, slice_from Use a separate inline-never function to format failure message for str::slice() errors. Using strcat's idea, this makes sure no formatting code from failure is inlined when str::slice() is inlined. The number of `unreachable` being inlined when usingi `.slice()` drops from 5 to just 1. The testcase: ``` #![crate_type = "lib"] pub fn slice(x: &str, a: uint, b: uint) -> &str { x.slice(a, b) } ``` shrinks from 16.9 kB to 3.3 kB llvm IR, and the number of `unreachable` drops from 5 to 1.
2014-08-23auto merge of #16612 : nham/rust/twoway_searcher_fix, r=alexcrichtonbors-1/+13
There is a check in TwoWaySearcher::new to determine whether the needle is periodic. This is needed because during searching when a match fails, we cannot advance the position by the entire length of the needle when it is periodic, but can only advance by the length of the period. The reason "bananas".contains("nana") (and similar searches) were returning false was because the periodicity check was wrong. Closes #16589 Also, thanks to @Gankro, who came up with many buggy examples.
2014-08-23core: Separate failure formatting in str methods slice, slice_to, slice_fromroot-11/+31
Use a separate inline-never function to format failure message for str::slice() errors. Using strcat's idea, this makes sure no formatting code from failure is inlined when str::slice() is inlined. The number of `unreachable` being inlined when usingi `.slice()` drops from 5 to just 1.
2014-08-22Improve TwoWaySearcher comments.nham-3/+10
2014-08-20Fix TwoWaySearcher to work when used with periodic needles.nham-1/+6
There is a check in TwoWaySearcher::new to determine whether the needle is periodic. This is needed because during searching when a match fails, we cannot advance the position by the entire length of the needle when it is periodic, but can only advance by the length of the period. The reason "bananas".contains("nana") (and similar searches) were returning false was because the periodicity check was wrong. Closes #16589
2014-08-19Add examples for some StrSlice methods.nham-0/+46
2014-08-18Fix underflow bug in core::str::Searcher::new for haystacks of length < 20nham-1/+1
2014-08-16Optimized IR generation for UTF-8 and UTF-16 encodingMarvin Löbel-2/+2
- Both can now be inlined and constant folded away - Both can no longer cause failure - Both now return an `Option` instead Removed debug `assert!()`s over the valid ranges of a `char` - It affected optimizations due to unwinding - Char handling is now sound enought that they became uneccessary
2014-08-13core: Change the argument order on splitn and rsplitn for strs.Brian Anderson-12/+12
This makes it consistent with the same functions for slices, and allows the search closure to be specified last. [breaking-change]
2014-08-13std: Rename various slice traits for consistencyBrian Anderson-2/+2
ImmutableVector -> ImmutableSlice ImmutableEqVector -> ImmutableEqSlice ImmutableOrdVector -> ImmutableOrdSlice MutableVector -> MutableSlice MutableVectorAllocating -> MutableSliceAllocating MutableCloneableVector -> MutableCloneableSlice MutableOrdVector -> MutableOrdSlice These are all in the prelude so most code will not break. [breaking-change]
2014-08-08Register new snapshot 12e0f72Niko Matsakis-3/+0
2014-08-06Use byte literals in libcorenham-1/+1
2014-07-24librustc: Stop desugaring `for` expressions and translate them directly.Patrick Walton-2/+5
This makes edge cases in which the `Iterator` trait was not in scope and/or `Option` or its variants were not in scope work properly. This breaks code that looks like: struct MyStruct { ... } impl MyStruct { fn next(&mut self) -> Option<int> { ... } } for x in MyStruct { ... } { ... } Change ad-hoc `next` methods like the above to implementations of the `Iterator` trait. For example: impl Iterator<int> for MyStruct { fn next(&mut self) -> Option<int> { ... } } Closes #15392. [breaking-change]
2014-07-22auto merge of #15867 : cmr/rust/rewrite-lexer4, r=alexcrichtonbors-0/+2
2014-07-21ignore-lexer-test to broken files and remove some tray hyphensCorey Richardson-0/+2
I blame @ChrisMorgan for the hyphens.
2014-07-19Simplify str CharOffsets iteratorroot-7/+7
Only one uint is needed to keep track of the offset from the original full string.
2014-07-19Write multibyte case for str Chars iterator in-lineroot-59/+48
Thanks to comments from @alexcrichton, write the next/next_back function bodies without nested functions in a more top-to-bottom flow style. Also improve comment style and motivate the unsafe blocks with comments.
2014-07-19Clarify str Chars iterator implementationroot-13/+16
Thanks to comments from @huonw, clarify decoding details and use statics for important constants for UTF-8 decoding. Convert some magic numbers scattered in the same file to use the statics too.
2014-07-17core::str: Implement Chars iterator using slice::Itemsroot-44/+114
Re-use the vector iterator to implement the chars iterator. The iterator uses our guarantee that the string contains valid UTF-8, but its only unsafe code is transmuting the decoded u32 into char.
2014-07-09auto merge of #15283 : kwantam/rust/master, r=alexcrichtonbors-82/+1
Add libunicode; move unicode functions from core - created new crate, libunicode, below libstd - split `Char` trait into `Char` (libcore) and `UnicodeChar` (libunicode) - Unicode-aware functions now live in libunicode - `is_alphabetic`, `is_XID_start`, `is_XID_continue`, `is_lowercase`, `is_uppercase`, `is_whitespace`, `is_alphanumeric`, `is_control`, `is_digit`, `to_uppercase`, `to_lowercase` - added `width` method in UnicodeChar trait - determines printed width of character in columns, or None if it is a non-NULL control character - takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise) - split `StrSlice` into `StrSlice` (libcore) and `UnicodeStrSlice` (libunicode) - functionality formerly in `StrSlice` that relied upon Unicode functionality from `Char` is now in `UnicodeStrSlice` - `words`, `is_whitespace`, `is_alphanumeric`, `trim`, `trim_left`, `trim_right` - also moved `Words` type alias into libunicode because `words` method is in `UnicodeStrSlice` - unified Unicode tables from libcollections, libcore, and libregex into libunicode - updated `unicode.py` in `src/etc` to generate aforementioned tables - generated new tables based on latest Unicode data - added `UnicodeChar` and `UnicodeStrSlice` traits to prelude - libunicode is now the collection point for the `std::char` module, combining the libunicode functionality with the `Char` functionality from libcore - thus, moved doc comment for `char` from `core::char` to `unicode::char` - libcollections remains the collection point for `std::str` The Unicode-aware functions that previously lived in the `Char` and `StrSlice` traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and `use` the `UnicodeChar` and/or `UnicodeStrSlice` traits: extern crate unicode; use unicode::UnicodeChar; use unicode::UnicodeStrSlice; use unicode::Words; // if you want to use the words() method NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude. closes #15224 [breaking-change]
2014-07-09str: use more helpful assertion failure messagesCorey Richardson-2/+5
2014-07-07Add libunicode; move unicode functions from corekwantam-82/+1
- created new crate, libunicode, below libstd - split Char trait into Char (libcore) and UnicodeChar (libunicode) - Unicode-aware functions now live in libunicode - is_alphabetic, is_XID_start, is_XID_continue, is_lowercase, is_uppercase, is_whitespace, is_alphanumeric, is_control, is_digit, to_uppercase, to_lowercase - added width method in UnicodeChar trait - determines printed width of character in columns, or None if it is a non-NULL control character - takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise) - split StrSlice into StrSlice (libcore) and UnicodeStrSlice (libunicode) - functionality formerly in StrSlice that relied upon Unicode functionality from Char is now in UnicodeStrSlice - words, is_whitespace, is_alphanumeric, trim, trim_left, trim_right - also moved Words type alias into libunicode because words method is in UnicodeStrSlice - unified Unicode tables from libcollections, libcore, and libregex into libunicode - updated unicode.py in src/etc to generate aforementioned tables - generated new tables based on latest Unicode data - added UnicodeChar and UnicodeStrSlice traits to prelude - libunicode is now the collection point for the std::char module, combining the libunicode functionality with the Char functionality from libcore - thus, moved doc comment for char from core::char to unicode::char - libcollections remains the collection point for std::str The Unicode-aware functions that previously lived in the Char and StrSlice traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and use the UnicodeChar and/or UnicodeStrSlice traits: extern crate unicode; use unicode::UnicodeChar; use unicode::UnicodeStrSlice; use unicode::Words; // if you want to use the words() method NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude. closes #15224 [breaking-change]
2014-07-01rustc: Remove `&str` indexing from the language.Brian Anderson-14/+16
Being able to index into the bytes of a string encourages poor UTF-8 hygiene. To get a view of `&[u8]` from either a `String` or `&str` slice, use the `as_bytes()` method. Closes #12710. [breaking-change]
2014-06-30auto merge of #14613 : schmee/rust/utf16-iterator, r=huonwbors-1/+45
Closes #14358. ~~The tests are not yet moved to `utf16_iter`, so this probably won't compile. I'm submitting this PR anyway so it can be reviewed and since it was mentioned in #14611.~~ EDIT: Tests now use `utf16_iter`. This deprecates `.to_utf16`. `x.to_utf16()` should be replaced by either `x.utf16_iter().collect::<Vec<u16>>()` (the type annotation may be optional), or just `x.utf16_iter()` directly, if it can be used in an iterator context. [breaking-change] cc @huonw
2014-06-30Add `utf16_units`John Schmidt-1/+45
This deprecates `.to_utf16`. `x.to_utf16()` should be replaced by either `x.utf16_units().collect::<Vec<u16>>()` (the type annotation may be optional), or just `x.utf16_units()` directly, if it can be used in an iterator context. Closes #14358 [breaking-change]
2014-06-29Implement RFC#28: Add PartialOrd::partial_cmpSteven Fackler-2/+4
I ended up altering the semantics of Json's PartialOrd implementation. It used to be the case that Null < Null, but I can't think of any reason for an ordering other than the default one so I just switched it over to using the derived implementation. This also fixes broken `PartialOrd` implementations for `Vec` and `TreeMap`. RFC: 0028-partial-cmp
2014-06-29Extract tests from libcore to a separate crateSteven Fackler-12/+0
Libcore's test infrastructure is complicated by the fact that many lang items are defined in the crate. The current approach (realcore/realstd imports) is hacky and hard to work with (tests inside of core::cmp haven't been run for months!). Moving tests to a separate crate does mean that they can only test the public API of libcore, but I don't feel that that is too much of an issue. The only tests that I had to get rid of were some checking the various numeric formatters, but those are also exercised through normal format! calls in other tests.
2014-06-28Rename all raw pointers as necessaryAlex Crichton-7/+7
2014-06-24librustc: Remove the fallback to `int` from typechecking.Niko Matsakis-2/+5
This breaks a fair amount of code. The typical patterns are: * `for _ in range(0, 10)`: change to `for _ in range(0u, 10)`; * `println!("{}", 3)`: change to `println!("{}", 3i)`; * `[1, 2, 3].len()`: change to `[1i, 2, 3].len()`. RFC #30. Closes #6023. [breaking-change]
2014-06-17Add a b"xx" byte string literal of type &'static [u8].Simon Sapin-0/+4
2014-06-08core: Rename `container` mod to `collections`. Closes #12543Brian Anderson-4/+4
Also renames the `Container` trait to `Collection`. [breaking-change]
2014-06-08Fix spelling errors in comments.Joseph Crail-2/+2
2014-06-06Rename Iterator::len to countAaron Turon-2/+1
This commit carries out the request from issue #14678: > The method `Iterator::len()` is surprising, as all the other uses of > `len()` do not consume the value. `len()` would make more sense to be > called `count()`, but that would collide with the current > `Iterator::count(|T| -> bool) -> unit` method. That method, however, is > a bit redundant, and can be easily replaced with > `iter.filter(|x| x < 5).count()`. > After this change, we could then define the `len()` method > on `iter::ExactSize`. Closes #14678. [breaking-change]
2014-06-02Document failure cases for `char_at` and friends.Aaron Turon-2/+17