path: root/src/libcollections/string.rs
Age | Commit message | Author | Lines
2014-10-07Rename slicing methodsNick Cameron-0/+23
2014-10-07Use slice syntax instead of slice_to, etc.Nick Cameron-3/+3
2014-10-05String::truncate doc: also fails if not a char boundarySimon Sapin-4/+5
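The char-boundary condition mentioned in this commit title can be sketched in present-day Rust, where the failure is a panic. `truncate_at_boundary` is a hypothetical helper, not part of the standard library; it only uses `String::is_char_boundary` and `String::truncate`.

```rust
/// Hypothetical helper (not in std): shortens `s` to at most `max_bytes`
/// bytes, backing up to the nearest char boundary so that
/// `String::truncate` is never called in the middle of a code point.
fn truncate_at_boundary(s: &mut String, max_bytes: usize) {
    if max_bytes >= s.len() {
        return; // nothing to cut
    }
    let mut cut = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(cut) {
        cut -= 1;
    }
    s.truncate(cut);
}
```

Calling `s.truncate(2)` directly on `"héllo"` would panic, because byte 2 falls inside the two-byte `é`; the helper backs up to byte 1 instead.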
2014-10-03Fix preallocation amount in String::from_utf16Simon Sapin-1/+1
`v.len()` counts code units, not UTF-16 bytes. The lower bound is one UTF-8 byte per code unit, not per two code units.
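The lower bound described here can be illustrated with a small sketch in present-day Rust. `from_utf16_prealloc` is a hypothetical stand-in, not the actual body of `String::from_utf16`; it assumes only `std::char::decode_utf16` and `String::with_capacity`.

```rust
use std::char::decode_utf16;

/// Hypothetical sketch: decode UTF-16 into a String, preallocating one
/// UTF-8 byte per input code unit. That is a lower bound on the output
/// size: a lone code unit becomes 1 to 3 bytes, a surrogate pair
/// (2 code units) becomes 4 bytes.
fn from_utf16_prealloc(v: &[u16]) -> Option<String> {
    let mut s = String::with_capacity(v.len());
    for decoded in decode_utf16(v.iter().copied()) {
        // An unpaired surrogate yields Err; give up with None.
        s.push(decoded.ok()?);
    }
    Some(s)
}
```

For the input `"a€𝄞"` there are 4 code units and 8 UTF-8 bytes, so the preallocated capacity is never larger than the final length.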
2014-10-02Revert "Use slice syntax instead of slice_to, etc."Aaron Turon-3/+3
This reverts commit 40b9f5ded50ac4ce8c9323921ec556ad611af6b7.
2014-10-02Revert "Put slicing syntax behind a feature gate."Aaron Turon-2/+2
This reverts commit 95cfc35607ccf5f02f02de56a35a9ef50fa23a82.
2014-10-02Revert "Review and rebasing changes"Aaron Turon-29/+0
This reverts commit 6e0611a48707a1f5d90aee32a02b2b15957ef25b.
2014-10-02Review and rebasing changesNick Cameron-0/+29
2014-10-02Put slicing syntax behind a feature gate.Nick Cameron-2/+2
[breaking-change] If you are using slicing syntax you will need to add #![feature(slicing_syntax)] to your crate.
2014-10-02Use slice syntax instead of slice_to, etc.Nick Cameron-3/+3
2014-09-30librustc: Forbid `..` in range patterns.Patrick Walton-7/+7
This breaks code that looks like:

    match foo {
        1..3 => { ... }
    }

Instead, write:

    match foo {
        1...3 => { ... }
    }

Closes #17295.

[breaking-change]
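For readers on a current toolchain: inclusive range patterns are now spelled `..=`, and the `...` form introduced by this commit was later deprecated. A minimal sketch, with `classify` as a hypothetical function name:

```rust
/// Hypothetical example of an inclusive range pattern in present-day Rust.
/// `1..=3` matches 1, 2 and 3, as `1...3` did after this commit.
fn classify(n: u32) -> &'static str {
    match n {
        1..=3 => "small",
        _ => "other",
    }
}
```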
2014-09-29rollup merge of #17585 : sfackler/string-sliceAlex Crichton-0/+32
2014-09-26Implement Slice for String and strSteven Fackler-0/+32
Closes #17502
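As a rough illustration of slicing on `String` and `str`, here is a sketch in present-day Rust (the 2014 syntax and trait names differed). `split_at_byte` is a hypothetical helper; the ranges are byte offsets and must fall on char boundaries.

```rust
/// Hypothetical helper: split a string slice at a byte offset using
/// range-slicing syntax. `&s[..mid]` plays the role of the old
/// `slice_to`, and `&s[mid..]` the role of `slice_from`.
/// Panics if `mid` is not on a char boundary.
fn split_at_byte(s: &str, mid: usize) -> (&str, &str) {
    (&s[..mid], &s[mid..])
}
```

The same syntax works on an owned `String`, for example `&owned[1..3]`.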
2014-09-27Correct stability marker in string.rsSqueaky-1/+1
2014-09-23Deal with the fallout of string stabilizationAlex Crichton-25/+34
2014-09-22collections: Deprecate shift_char for insert/removeAlex Crichton-15/+87
This commit deprecates the String::shift_char() function in favor of the addition of an insert()/remove() pair of functions. This aligns the API with Vec in that characters can be inserted at arbitrary positions. Additionally, there is no `_char` suffix due to the rationale laid out in the previous commit. Both functions are introduced as unstable because it is uncertain whether their failure semantics, while in line with slices/vectors, should remain the same.
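The replacement of `shift_char` by `remove` can be sketched in present-day Rust as follows. The free function `shift_char` below is a hypothetical emulation, not the deprecated method itself; `String::insert` and `String::remove` take byte indices and panic if the index is not on a char boundary.

```rust
/// Hypothetical emulation of the deprecated `shift_char`: remove and
/// return the first char, or `None` if the string is empty.
fn shift_char(s: &mut String) -> Option<char> {
    if s.is_empty() {
        None
    } else {
        Some(s.remove(0))
    }
}
```

The counterpart `s.insert(0, 'x')` puts a char back at the front, and any other char boundary works as the index.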
2014-09-22collections: Stabilize StringAlex Crichton-9/+77
# Rationale

When dealing with strings, many functions deal with either a `char` (unicode codepoint) or a byte (utf-8 encoding related). There is often an inconsistent way in which methods are referred to as to whether they contain "byte", "char", or nothing in their name. There are also issues open to rename *all* methods to reflect that they operate on utf8 encodings or bytes (e.g. utf8_len() or byte_len()).

The current state of String seems to largely be what is desired, so this PR proposes the following rationale for methods dealing with bytes or characters:

> When constructing a string, the input encoding *must* be mentioned (e.g.
> from_utf8). This makes it clear what exactly the input type is expected to be
> in terms of encoding.
>
> When a method operates on anything related to an *index* within the string
> such as length, capacity, position, etc, the method *implicitly* operates on
> bytes. It is an understood fact that String is a utf-8 encoded string, and
> burdening all methods with "bytes" would be redundant.
>
> When a method operates on the *contents* of a string, such as push() or pop(),
> then "char" is the default type. A String can loosely be thought of as being a
> collection of unicode codepoints, but not all collection-related operations
> make sense because some can be woefully inefficient.

# Method stabilization

The following methods have been marked #[stable]

* The String type itself
* String::new
* String::with_capacity
* String::from_utf16_lossy
* String::into_bytes
* String::as_bytes
* String::len
* String::clear
* String::as_slice

The following methods have been marked #[unstable]

* String::from_utf8 - The error type in the returned `Result` may change to provide a nicer message when it's `unwrap()`'d
* String::from_utf8_lossy - The returned `MaybeOwned` type still needs stabilization
* String::from_utf16 - The return type may change to become a `Result` which includes more contextual information like where the error occurred.
* String::from_chars - This is equivalent to iter().collect(), but currently not as ergonomic.
* String::from_char - This method is the equivalent of Vec::from_elem, and has been marked #[unstable] because it can be seen as a duplicate of iterator-based functionality as well as possibly being renamed.
* String::push_str - This *can* be emulated with .extend(foo.chars()), but is less efficient because of decoding/encoding. Due to the desire to minimize API surface this may be able to be removed in the future for something possibly generic with no loss in performance.
* String::grow - This is a duplicate of iterator-based functionality, which may become more ergonomic in the future.
* String::capacity - This function was just added.
* String::push - This function was just added.
* String::pop - This function was just added.
* String::truncate - The failure conventions around String methods and byte indices aren't totally clear at this time, so the failure semantics and return value of this method are subject to change.
* String::as_mut_vec - the naming of this method may change.
* string::raw::* - these functions are all waiting on [an RFC][2]

[2]: https://github.com/rust-lang/rfcs/pull/240

The following method has been marked #[experimental]

* String::from_str - This function only exists as it's more efficient than to_string(), but having a less ergonomic function for performance reasons isn't the greatest reason to keep it around. Like Vec::push_all, this has been marked experimental for now.

The following methods have been #[deprecated]

* String::append - This method has been deprecated to remain consistent with the deprecation of Vec::append. While convenient, it is one of the only functional-style apis on String, and requires more thought as to whether it belongs as a first-class method or not (and how it relates to other collections).
* String::from_byte - This is fairly rare functionality and can be emulated with str::from_utf8 plus an assert plus a call to to_string(). Additionally, String::from_char could possibly be used.
* String::byte_capacity - Renamed to String::capacity due to the rationale above.
* String::push_char - Renamed to String::push due to the rationale above.
* String::pop_char - Renamed to String::pop due to the rationale above.
* String::push_bytes - There are a number of `unsafe` functions on the `String` type which allow bypassing utf-8 checks. These have all been deprecated in favor of calling `.as_mut_vec()` and then operating directly on the vector returned. These methods were deprecated because naming them with relation to other methods was difficult to rationalize and it's arguably more composable to call .as_mut_vec().
* String::as_mut_bytes - See push_bytes
* String::push_byte - See push_bytes
* String::pop_byte - See push_bytes
* String::shift_byte - See push_bytes

# Reservation methods

This commit does not yet touch the methods for reserving bytes. The methods on Vec have also not yet been modified. These methods are discussed in the upcoming [Collections reform RFC][1]

[1]: https://github.com/aturon/rfcs/blob/collections-conventions/active/0000-collections-conventions.md#implicit-growth
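The three naming rules in the rationale (constructors name the encoding, index-like methods count bytes, content methods work on `char`) can be seen side by side in a short sketch. It is written against present-day Rust, and `naming_convention_demo` is a hypothetical function name.

```rust
/// Hypothetical demo of the naming rationale:
/// returns (length in bytes, number of chars, popped char).
fn naming_convention_demo() -> (usize, usize, Option<char>) {
    // Constructor: the input encoding is part of the name.
    let mut s = String::from_utf8(vec![0x68, 0x69]).expect("valid UTF-8"); // "hi"
    // Content method: `push` takes a char, with no `_char` suffix.
    s.push('é');
    // Index-related method: `len` implicitly counts bytes ('é' is 2 bytes).
    let byte_len = s.len();
    let char_count = s.chars().count();
    // Content method: `pop` returns a char.
    let last = s.pop();
    (byte_len, char_count, last)
}
```

Here `len()` reports 4 while the string holds 3 chars, which is the byte-versus-char distinction the rationale asks readers to keep in mind.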
2014-09-21Fix fallout from Vec stabilizationAlex Crichton-3/+2
2014-09-19Add enum variants to the type namespaceNick Cameron-3/+3
Change to resolve and update compiler and libs for uses. [breaking-change] Enum variants are now in both the value and type namespaces. This means that if you have a variant with the same name as a type in scope in a module, you will get a name clash and thus an error. The solution is to either rename the type or the variant.
2014-08-26Rebasing changesNick Cameron-2/+4
2014-08-19A few minor documentation fixesP1start-17/+17
2014-08-18libsyntax: Remove the `use foo = bar` syntax from the language in favorPatrick Walton-2/+2
of `use bar as foo`. Change all uses of `use foo = bar` to `use bar as foo`. Implements RFC #47. Closes #16461. [breaking-change]
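A minimal sketch of the `use bar as foo` form in present-day Rust. The alias `Map` and the function `count_words` are hypothetical names chosen for illustration.

```rust
// Rename on import: the old spelling would have been
// `use Map = std::collections::HashMap;`.
use std::collections::HashMap as Map;

/// Hypothetical example: count whitespace-separated words using the alias.
fn count_words(text: &str) -> Map<&str, usize> {
    let mut counts = Map::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}
```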
2014-08-17auto merge of #16498 : Kimundi/rust/inline-utf-encoding, r=alexcrichtonbors-1/+1
The first commit improves code generation through a few changes:

- The `#[inline]` attributes allow llvm to constant fold the encoding step away in certain situations. For example, code like this changes from a call to `encode_utf8` in an inner loop to the pushing of a byte constant:

```rust
let mut s = String::new();
for _ in range(0u, 21) {
    s.push_char('a');
}
```

- Both methods changed their semantics from causing run time failure if the target buffer is not large enough to returning `None` instead. This makes llvm no longer emit code for causing failure for these methods.
- A few debug `assert!()` calls got removed because they affected code generation due to unwinding, and were basically unnecessary with today's sound handling of `char` as a Unicode scalar value.

~~The second commit is optional. It changes the methods from regular indexing with the `dst[i]` syntax to unsafe indexing with `dst.unsafe_mut_ref(i)`. This does not change code generation directly - in both cases llvm is smart enough to see that there can never be an out-of-bounds access. But it makes it emit a `nounwind` attribute for the function. However, I'm not sure whether that is a real improvement, so if there is any objection to this I'll remove the commit.~~

This changes how the methods behave on a too small buffer, so this is a [breaking-change]
2014-08-16librustc: Forbid external crates, imports, and/or items from beingPatrick Walton-2/+1
declared with the same name in the same scope.

This breaks several common patterns. First are unused imports:

    use foo::bar;
    use baz::bar;

Change this code to the following:

    use baz::bar;

Second, this patch breaks globs that import names that are shadowed by subsequent imports. For example:

    use foo::*; // including `bar`
    use baz::bar;

Change this code to remove the glob:

    use foo::{boo, quux};
    use baz::bar;

Or qualify all uses of `bar`:

    use foo::{boo, quux};
    use baz;

    ... baz::bar ...

Finally, this patch breaks code that, at top level, explicitly imports `std` and doesn't disable the prelude.

    extern crate std;

Because the prelude imports `std` implicitly, there is no need to explicitly import it; just remove such directives.

The old behavior can be opted into via the `import_shadowing` feature gate. Use of this feature gate is discouraged.

This implements RFC #116.

Closes #16464.

[breaking-change]
2014-08-16Optimized IR generation for UTF-8 and UTF-16 encodingMarvin Löbel-1/+1
- Both can now be inlined and constant folded away
- Both can no longer cause failure
- Both now return an `Option` instead

Removed debug `assert!()`s over the valid ranges of a `char`

- It affected optimizations due to unwinding
- Char handling is now sound enough that they became unnecessary
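The "return an `Option` instead of failing" behaviour described here can be approximated in present-day Rust, where `char::encode_utf8` has a different signature and panics on a too-small buffer. `try_encode_utf8` is a hypothetical wrapper that restores the `Option`-returning shape; it is not the 2014 method.

```rust
/// Hypothetical wrapper: encode `c` as UTF-8 into `dst`, returning the
/// number of bytes written, or `None` if `dst` is too small. The length
/// check up front means the panicking path of `encode_utf8` is never hit.
fn try_encode_utf8(c: char, dst: &mut [u8]) -> Option<usize> {
    if dst.len() < c.len_utf8() {
        return None;
    }
    Some(c.encode_utf8(dst).len())
}
```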
2014-08-13core: Rename ImmutableSlice::unsafe_ref to unsafe_getBrian Anderson-1/+1
Deprecate the previous.
2014-08-13std: Rename slice::Vector to SliceBrian Anderson-4/+7
This required some contortions because importing both raw::Slice and slice::Slice makes rustc crash. Since `Slice` is in the prelude, this renaming is unlikely to cause breakage. [breaking-change]
2014-08-12Deprecation fallout in libcollectionsAaron Turon-1/+1
2014-08-06Use byte literals in libcollections testsnham-1/+1
2014-08-01Fix misspelled comments.Joseph Crail-1/+1
2014-07-29Fix a whitespace typoErick Tryzelaar-1/+1
2014-07-28doc: Method examples for StringJonas Hietala-25/+261
Reword comments on unsafe methods regarding UTF-8.
2014-07-24Add `string::raw::from_buf`Adolfo Ochagavía-12/+36
2014-07-24Deprecated `String::from_raw_parts`Adolfo Ochagavía-8/+19
Replaced by `string::raw::from_parts` [breaking-change]
2014-07-24Deprecated `str::raw::from_buf_len`Adolfo Ochagavía-0/+25
Replaced by `string::raw::from_buf_len` [breaking-change]
2014-07-24Deprecated `str::raw::from_utf8_owned`Adolfo Ochagavía-0/+13
Replaced by `string::raw::from_utf8` [breaking-change]
2014-07-23Just land alreadyBrian Anderson-1/+1
2014-07-23collections: Move push/pop to MutableSeqBrian Anderson-1/+1
Implement for Vec, DList, RingBuf. Add MutableSeq to the prelude. Since the collections traits are in the prelude most consumers of these methods will continue to work without change. [breaking-change]
2014-07-22auto merge of #15867 : cmr/rust/rewrite-lexer4, r=alexcrichtonbors-0/+2
2014-07-21ignore-lexer-test to broken files and remove some stray hyphensCorey Richardson-0/+2
I blame @ChrisMorgan for the hyphens.
2014-07-21fix string in from_utf8_lossy_100_multibyte benchmarkTed Horst-2/+1
2014-07-15Fix errorsAdolfo Ochagavía-22/+27
2014-07-15Deprecate `str::from_utf8_lossy`Adolfo Ochagavía-34/+231
Use `String::from_utf8_lossy` instead [breaking-change]
2014-07-15Deprecate `str::from_utf16_lossy`Adolfo Ochagavía-0/+107
Use `String::from_utf16_lossy` instead. [breaking-change]
2014-07-15Deprecate `str::from_utf16`Adolfo Ochagavía-0/+26
Use `String::from_utf16` instead [breaking-change]
2014-07-15Deprecate str::from_byteAdolfo Ochagavía-1/+18
Replaced by `String::from_byte` [breaking-change]
2014-07-15Deprecate `str::from_chars`Adolfo Ochagavía-0/+14
Use `String::from_chars` instead [breaking-change]
2014-07-15Deprecate `str::from_utf8_owned`Adolfo Ochagavía-0/+21
Use `String::from_utf8` instead [breaking-change]
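The constructors that this series of commits moved from `str::` to `String::` can be exercised together in a short sketch. It targets present-day Rust, where `String::from_chars` and `String::from_byte` no longer exist, so collecting an iterator of chars stands in for `from_chars`. `constructors` is a hypothetical function name.

```rust
/// Hypothetical demo: build strings through the `String::from_*`
/// constructors and through `collect()`.
fn constructors() -> Vec<String> {
    vec![
        // Strict UTF-8: returns Err on invalid bytes.
        String::from_utf8(vec![b'o', b'k']).expect("valid UTF-8"),
        // Lossy UTF-8: invalid bytes become U+FFFD.
        String::from_utf8_lossy(&[b'o', 0xFF, b'k']).into_owned(),
        // Strict UTF-16: returns Err on unpaired surrogates.
        String::from_utf16(&[0x6F, 0x6B]).expect("valid UTF-16"),
        // Lossy UTF-16: unpaired surrogates become U+FFFD.
        String::from_utf16_lossy(&[0x6F, 0xD800, 0x6B]),
        // Stand-in for the old `from_chars`.
        ['o', 'k'].iter().collect(),
    ]
}
```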
2014-07-08std: Rename the `ToStr` trait to `ToString`, and `to_str` to `to_string`.Richo Healey-1/+7
[breaking-change]
2014-07-06Optimize String::push_byte()Simon Sapin-1/+1
```
test new_push_byte ... bench: 6985 ns/iter (+/- 487) = 17 MB/s
test old_push_byte ... bench: 19335 ns/iter (+/- 1368) = 6 MB/s
```

```rust
extern crate test;
use test::Bencher;

static TEXT: &'static str = "\
    Unicode est un standard informatique qui permet des échanges \
    de textes dans différentes langues, à un niveau mondial.";

#[bench]
fn old_push_byte(bencher: &mut Bencher) {
    bencher.bytes = TEXT.len() as u64;
    bencher.iter(|| {
        let mut new = String::new();
        for b in TEXT.bytes() {
            unsafe { new.as_mut_vec().push_all([b]) }
        }
    })
}

#[bench]
fn new_push_byte(bencher: &mut Bencher) {
    bencher.bytes = TEXT.len() as u64;
    bencher.iter(|| {
        let mut new = String::new();
        for b in TEXT.bytes() {
            unsafe { new.as_mut_vec().push(b) }
        }
    })
}
```