path: root/src/libstd/str.rs
Age | Commit message | Author | Lines
2013-08-29std::str: Fix bug in .slice_chars()blake2-ppc-0/+4
`s.slice_chars(a, b)` did not allow the case where `a == s.len()`. This is a bug I introduced last time I touched the method. Add a test for this case.
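As an illustration of the boundary case, here is a minimal sketch in present-day Rust, not the 2013 implementation. It reads the condition as `a` being equal to the number of chars in the string. The function name mirrors the old method, but the `Option` return type and the body are assumptions made for this example.

```rust
/// Return the substring covering chars `begin..end` (char indices, not
/// byte indices). Returns None when the range is reversed or out of bounds.
fn slice_chars(s: &str, begin: usize, end: usize) -> Option<&str> {
    if begin > end {
        return None;
    }
    // Byte offset of every char start, followed by `s.len()` as the
    // one-past-the-end position. Including that final offset is what
    // lets `begin` be equal to the number of chars in `s`.
    let mut offsets = s
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(s.len()));
    let start = offsets.nth(begin)?;
    let stop = if end == begin {
        start
    } else {
        // `offsets` now starts at char index `begin + 1`.
        offsets.nth(end - begin - 1)?
    };
    Some(&s[start..stop])
}
```

With this sketch, `slice_chars("héllo", 5, 5)` yields an empty slice at the end of the string instead of failing.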
2013-08-29std::str: Use CharIterator in NormalizationIteratorblake2-ppc-16/+17
Just to simplify and not have the iteration logic repeated in multiple places.
2013-08-29std::str: Remove functions count_chars, count_bytesblake2-ppc-40/+0
These are very easy to replace with methods on string slices, basically `.char_len()` and `.len()`.

These are the replacement implementations I did to clean these functions up, but seeing this I propose removal:

    /// ...
    pub fn count_chars(s: &str, begin: uint, end: uint) -> uint {
        // .slice() checks the char boundaries
        s.slice(begin, end).char_len()
    }

    /// Counts the number of bytes taken by the first `n` chars in `s`
    /// starting from byte index `begin`.
    ///
    /// Fails if there are less than `n` chars past `begin`
    pub fn count_bytes<'b>(s: &'b str, begin: uint, n: uint) -> uint {
        s.slice_from(begin).slice_chars(0, n).len()
    }
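For readers on current Rust, the same two helpers can be approximated with slice indexing and `char_indices`. This is a sketch under assumptions: the names are kept from the commit message, but `count_bytes` returns `None` here where the 2013 version would fail.

```rust
/// Number of chars in the byte range `begin..end` of `s`.
/// Panics if either index is not on a char boundary (slicing checks this).
fn count_chars(s: &str, begin: usize, end: usize) -> usize {
    s[begin..end].chars().count()
}

/// Number of bytes taken by the first `n` chars of `s`, starting at byte
/// index `begin`. Returns None if fewer than `n` chars follow `begin`.
fn count_bytes(s: &str, begin: usize, n: usize) -> Option<usize> {
    let tail = &s[begin..];
    if n == 0 {
        return Some(0);
    }
    // The end of the n-th char is its start offset plus its UTF-8 length.
    tail.char_indices()
        .nth(n - 1)
        .map(|(i, c)| i + c.len_utf8())
}
```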
2013-08-29Remove the iter module.Jason Fager-1/+1
Moves the Times trait to num while the question of whether it should exist at all gets hashed out as a completely separate question.
2013-08-27Remove offset_inbounds for an unsafe offset functionAlex Crichton-1/+1
2013-08-26Add a Default trait.Corey Richardson-15/+12
2013-08-26auto merge of #8737 : blake2-ppc/rust/std-str-rsplit, r=huonwbors-73/+305
Make CharSplitIterator double-ended which is simple given that the operation is symmetric, once the split-N feature is factored out into its own adaptor.

`.rsplitn_iter()` allows splitting `N` times from the back of a string, so it is a completely new feature. With the double-ended impl, `.split_iter()`, `.line_iter()`, `.word_iter()` all allow picking off elements from either end.

`split_options_iter` is removed with the factoring of the split- and split-N- iterators, instead there is `split_terminator_iter`.

---

Add benchmarks using `#[bench]` and tune CharSplitIterator a bit after Huon Wilson's suggestions

Benchmarks 1-5 do the same split using different implementations of `CharEq`, all splitting an ascii string on ascii space. Benchmarks 6-7 split a unicode string on an ascii char.

Before this PR

    test str::bench::split_iter_ascii ... bench: 166 ns/iter (+/- 2)
    test str::bench::split_iter_closure ... bench: 113 ns/iter (+/- 1)
    test str::bench::split_iter_extern_fn ... bench: 286 ns/iter (+/- 7)
    test str::bench::split_iter_not_ascii ... bench: 114 ns/iter (+/- 4)
    test str::bench::split_iter_slice ... bench: 220 ns/iter (+/- 12)
    test str::bench::split_iter_unicode_ascii ... bench: 217 ns/iter (+/- 3)
    test str::bench::split_iter_unicode_not_ascii ... bench: 248 ns/iter (+/- 3)

PR, first commit

    test str::bench::split_iter_ascii ... bench: 331 ns/iter (+/- 9)
    test str::bench::split_iter_closure ... bench: 114 ns/iter (+/- 2)
    test str::bench::split_iter_extern_fn ... bench: 314 ns/iter (+/- 6)
    test str::bench::split_iter_not_ascii ... bench: 132 ns/iter (+/- 1)
    test str::bench::split_iter_slice ... bench: 157 ns/iter (+/- 3)
    test str::bench::split_iter_unicode_ascii ... bench: 502 ns/iter (+/- 64)
    test str::bench::split_iter_unicode_not_ascii ... bench: 250 ns/iter (+/- 3)

PR, final version

    test str::bench::split_iter_ascii ... bench: 106 ns/iter (+/- 4)
    test str::bench::split_iter_closure ... bench: 107 ns/iter (+/- 1)
    test str::bench::split_iter_extern_fn ... bench: 267 ns/iter (+/- 6)
    test str::bench::split_iter_not_ascii ... bench: 108 ns/iter (+/- 1)
    test str::bench::split_iter_slice ... bench: 170 ns/iter (+/- 8)
    test str::bench::split_iter_unicode_ascii ... bench: 128 ns/iter (+/- 5)
    test str::bench::split_iter_unicode_not_ascii ... bench: 252 ns/iter (+/- 3)

---

There are several ways to deal with `CharEq::only_ascii`. It is a performance optimization, so with that in mind, we allow passing bogus char (outside ascii) as long as they don't match. We use a byte value check to make sure we don't split on these (would split substrings in the middle of encoded char). (A more principled way would be to only pass the ascii codepoints to the CharEq when it indicates only_ascii, but that undoes some of the performance optimization.)
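The behaviours described above survive in current Rust under shorter names. The sketch below uses today's `split`, `rsplitn`, `split_terminator` and `lines` as stand-ins for the 2013 `*_iter` methods. The helper function names are hypothetical, and the modern `rsplitn(n, ..)` returns at most `n` substrings, which may not match the 2013 counting convention.

```rust
/// Pick one field off each end of a split: (first, last).
fn first_and_last(s: &str, sep: char) -> (Option<&str>, Option<&str>) {
    let mut fields = s.split(sep);
    let first = fields.next();
    // `next_back` is available because the split iterator is double-ended.
    let last = fields.next_back();
    (first, last)
}

/// Split starting from the back, yielding at most `n` substrings.
fn split_from_back(s: &str, sep: char, n: usize) -> Vec<&str> {
    s.rsplitn(n, sep).collect()
}

/// Treat `sep` as a terminator: a trailing separator yields no empty field.
fn split_terminated(s: &str, sep: char) -> Vec<&str> {
    s.split_terminator(sep).collect()
}

/// Lines in reverse order, relying on the double-ended `lines` iterator.
fn lines_reversed(s: &str) -> Vec<&str> {
    s.lines().rev().collect()
}
```

Keeping the N-limited split as a separate adaptor is what makes the plain split symmetric: a limit counted from the front has no natural meaning when elements are also taken from the back.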
2013-08-26std::str: Tune CharSplitIterator after benchmarksblake2-ppc-55/+44
Implement Huon Wilson's suggestions (since the benchmarks agree!). Use `self.sep.matches(byte as char) && byte < 128u8` to match in the only_ascii case so that mistaken matches outside the ascii range can't create invalid substrings. Put the conditional on only_ascii outside the loop.
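The guard can be sketched in current Rust as follows. This is an illustration of the idea only, not the CharSplitIterator code: the function name `split_ascii` and the closure-based predicate are assumptions standing in for `CharEq`.

```rust
/// Split `s` on every ASCII byte accepted by `is_sep`, scanning bytes
/// rather than decoding chars.
fn split_ascii<F: Fn(char) -> bool>(s: &str, is_sep: F) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut start = 0;
    for (i, &byte) in s.as_bytes().iter().enumerate() {
        // The `byte < 128` guard rejects lead and continuation bytes of
        // multi-byte chars, so a sloppy predicate cannot cut inside one.
        if byte < 128 && is_sep(byte as char) {
            parts.push(&s[start..i]);
            start = i + 1;
        }
    }
    parts.push(&s[start..]);
    parts
}
```

For example, a predicate that also accepts U+00C3 would match the first byte of the UTF-8 encoding of `é` when that byte is cast to `char`. The guard skips it, so `split_ascii("é x", |c| c == ' ' || c == '\u{c3}')` still splits only on the space.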
2013-08-26std::str: bench tests for .split_iter()blake2-ppc-0/+83
2013-08-25Add _opt variants to str byte-conversion functionsKevin Ballard-9/+108
Add _opt variants to from_bytes, from_bytes_owned, and from_bytes_slice. These variants return an Option instead of raising a condition/failing.
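In current Rust the same non-failing style is available through `std::str::from_utf8` and `String::from_utf8`, which return a `Result`. The wrappers below are a sketch; their names echo the 2013 functions but are hypothetical here.

```rust
/// Borrowing conversion: None if `bytes` is not valid UTF-8.
fn from_bytes_slice_opt(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Owning conversion: takes the vector by value and returns None on
/// invalid UTF-8 instead of panicking.
fn from_bytes_owned_opt(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}
```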
2013-08-25std::str: Double-ended CharSplitIteratorblake2-ppc-68/+228
Add new methods `.rsplit_iter()` and `.rsplitn_iter()` for &str. Separate out CharSplitIterator and CharSplitNIterator, CharSplitIterator (`split_iter` and `rsplit_iter`) is made double-ended while `splitn_iter` and `rsplitn_iter` (limited to N splits) are not, since these don't have the same symmetry. With CharSplitIterator being double ended, derived iterators like `line_iter` and `word_iter` are too.
2013-08-24Add OwnedStr::into_bytesSteven Fackler-1/+16
My primary use case here is sending strings across the wire where the intermediate storage is a byte array. The new method ends up avoiding a copy.
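The modern counterpart is `String::into_bytes`. The small check below is a sketch (the helper name is hypothetical) that compares buffer addresses to show the conversion hands over the existing allocation rather than copying it.

```rust
/// Convert an owned string into its bytes and report whether the
/// resulting vector still points at the string's original buffer.
fn into_bytes_keeps_buffer(s: String) -> (Vec<u8>, bool) {
    let before = s.as_ptr();
    let bytes = s.into_bytes();
    let same = before == bytes.as_ptr();
    (bytes, same)
}
```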
2013-08-23Add new function str.truncate()Kevin Ballard-0/+42
2013-08-22Enabled unit tests in std and extra.Vadim Chugunov-4/+0
2013-08-21auto merge of #8590 : blake2-ppc/rust/std-str, r=alexcrichtonbors-131/+254
Implement CharIterator as a separate struct, so that it can be .clone()'d. Fix `.char_range_at_reverse` so that it performs better, closer to the forwards version. This makes the reverse iterators and users like `.rfind()` perform better.

Before

    test str::bench::char_iterator ... bench: 146 ns/iter (+/- 0)
    test str::bench::char_iterator_ascii ... bench: 397 ns/iter (+/- 49)
    test str::bench::char_iterator_rev ... bench: 576 ns/iter (+/- 8)
    test str::bench::char_offset_iterator ... bench: 128 ns/iter (+/- 2)
    test str::bench::char_offset_iterator_rev ... bench: 425 ns/iter (+/- 59)

After

    test str::bench::char_iterator ... bench: 130 ns/iter (+/- 1)
    test str::bench::char_iterator_ascii ... bench: 307 ns/iter (+/- 5)
    test str::bench::char_iterator_rev ... bench: 185 ns/iter (+/- 8)
    test str::bench::char_offset_iterator ... bench: 131 ns/iter (+/- 13)
    test str::bench::char_offset_iterator_rev ... bench: 183 ns/iter (+/- 2)

To be able to use a string slice to represent the CharIterator, a function `slice_unchecked` is added, that does the same as `slice_bytes` but without any boundary checks.

It would be possible to implement CharIterator with pointer arithmetic to make it *much more efficient*, but since vec iterator is still improving, it's too early to attempt to re-implement it in other places. Hopefully CharIterator can be implemented on top of vec iterator without any unsafe code later.

Additional changes fix the documentation about null termination.
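To show why a cloneable char iterator is useful, here is a sketch using today's `std::str::Chars` and `char_indices` in place of the 2013 CharIterator and CharOffsetIterator. Both helper names are hypothetical.

```rust
/// Return the next two chars without consuming anything from `it`,
/// by advancing a clone instead.
fn peek_two(it: &std::str::Chars<'_>) -> (Option<char>, Option<char>) {
    let mut ahead = it.clone();
    (ahead.next(), ahead.next())
}

/// Byte index of the last occurrence of `needle`, scanning from the back.
fn rfind_char(s: &str, needle: char) -> Option<usize> {
    s.char_indices()
        .rev()
        .find(|&(_, c)| c == needle)
        .map(|(i, _)| i)
}
```

Cloning is cheap because the iterator is just a view into the string slice. The reverse scan is the kind of caller that benefits from a faster backwards decode.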
2013-08-22std::str: Add test for CharIterator .clone()blake2-ppc-0/+8
2013-08-21Add support for performing NFD and NFKD on stringsFlorian Zeitz-0/+143
2013-08-19std::str: Use iterators instead of while loops for CharSplitIteratorblake2-ppc-33/+45
Embed an iterator in the CharSplitIterator struct, and combine that with the former bool `only_ascii`; so use an enum instead.
2013-08-19Add externfn macro and correctly label fixed_stack_segmentsNiko Matsakis-0/+1
2013-08-19std::str: Improve comments for CharIteratorblake2-ppc-1/+9
2013-08-19std::str: Use CharOffsetIterator in slice_charsblake2-ppc-13/+14
2013-08-19std::str: Only check char boundary for end index in .slice_to()blake2-ppc-1/+2
2013-08-19std::str: Correct docstrings for lack of null terminator in ~str and &strblake2-ppc-24/+13
2013-08-19std::str: Use CharOffsetIterator in .find() and .rfind()blake2-ppc-6/+3
2013-08-19std::str: Implement CharIterator separatelyblake2-ppc-35/+68
Let CharIterator be a separate type from CharOffsetIterator (so that CharIterator can be cloned, for example). Implement CharOffsetIterator by using the same technique as the method subslice_offset.
2013-08-19std::str: Add str::raw::slice_uncheckedblake2-ppc-4/+13
Add a function like raw::slice_bytes, but it doesn't check slice boundaries. For iterator use where we always know the begin, end indices are in range.
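The closest current equivalent is probably `str::get_unchecked`, used here as a stand-in for `raw::slice_unchecked`. The sketch below (with a hypothetical helper name) shows the iterator-style situation the message describes, where the index is known to be valid by construction.

```rust
/// Remainder of `s` after its first char, or None for an empty string.
fn skip_first_char(s: &str) -> Option<&str> {
    let first = s.chars().next()?;
    let begin = first.len_utf8();
    // SAFETY: `begin` is the encoded length of the first char of `s`, so
    // it is at most `s.len()` and lies on a char boundary.
    Some(unsafe { s.get_unchecked(begin..) })
}
```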
2013-08-19std::str: Special case char_range_at_reverse so it is fasterblake2-ppc-13/+21
Implement char_range_at_reverse similarly to char_range_at, instead of re-using that method.
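One way to decode backwards directly is to step over UTF-8 continuation bytes until the lead byte is found. The sketch below illustrates that idea in current Rust. It is an assumption about the general technique, not the code from this commit, and `char_before` is a hypothetical name.

```rust
/// Decode the char that ends just before byte index `end`, returning the
/// char and the byte index where it starts. `end` must be a char boundary
/// greater than zero.
fn char_before(s: &str, end: usize) -> (char, usize) {
    let bytes = s.as_bytes();
    let mut start = end - 1;
    // Continuation bytes have the bit pattern 10xxxxxx. Step back over
    // them until the lead byte of the char is reached.
    while start > 0 && (bytes[start] & 0b1100_0000) == 0b1000_0000 {
        start -= 1;
    }
    let ch = s[start..end].chars().next().expect("range holds one char");
    (ch, start)
}
```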
2013-08-19std::str: Small fix for sliceblake2-ppc-2/+3
2013-08-19std::str: Bench test for char iteratorsblake2-ppc-0/+56
2013-08-18auto merge of #8555 : chris-morgan/rust/time-clone, r=huonwbors-1/+15
I need `Clone` for `Tm` for my latest work on [rust-http](https://github.com/chris-morgan/rust-http) (static typing for headers, and headers like `Date` are a time), so here it is. @huonw recommended deriving DeepClone while I was at it. I also had to implement `DeepClone` for `~str` to get a derived implementation of `DeepClone` for `Tm`; I did `@str` while I was at it, for consistency.
2013-08-16Implement DeepClone for str types.Chris Morgan-1/+15
2013-08-16doc: correct spelling in documentation.Huon Wilson-2/+2
2013-08-15auto merge of #8490 : huonw/rust/fromiterator-extendable, r=catamorphismbors-4/+4
If they are on the trait then it is extremely annoying to use them as generic parameters to a function, e.g. with the iterator param on the trait itself, if one was to pass an Extendable<int> to a function that filled it either from a Range or a Map<VecIterator>, one needs to write something like:

    fn foo<E: Extendable<int, Range<int>> +
              Extendable<int, Map<&'self int, int, VecIterator<int>>>
          (e: &mut E, ...) { ... }

since using a generic, i.e. `foo<E: Extendable<int, I>, I: Iterator<int>>` means that `foo` takes 2 type parameters, and the caller has to specify them (which doesn't work anyway, as they'll mismatch with the iterators used in `foo` itself).

This patch changes it to:

    fn foo<E: Extendable<int>>(e: &mut E, ...) { ... }
2013-08-15std: Move the iterator param on FromIterator and Extendable to the method.Huon Wilson-4/+4
If they are on the trait then it is extremely annoying to use them as generic parameters to a function, e.g. with the iterator param on the trait itself, if one was to pass an Extendable<int> to a function that filled it either from a Range or a Map<VecIterator>, one needs to write something like:

    fn foo<E: Extendable<int, Range<int>> +
              Extendable<int, Map<&'self int, int, VecIterator<int>>>
          (e: &mut E, ...) { ... }

since using a generic, i.e. `foo<E: Extendable<int, I>, I: Iterator<int>>` means that `foo` takes 2 type parameters, and the caller has to specify them (which doesn't work anyway, as they'll mismatch with the iterators used in `foo` itself).

This patch changes it to:

    fn foo<E: Extendable<int>>(e: &mut E, ...) { ... }
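The effect of moving the iterator parameter onto the method can be reproduced in current Rust. The trait below is a hypothetical stand-in for the 2013 `Extendable` (today's standard library uses `Extend`), with the method named `extend_from` only for this sketch.

```rust
/// Hypothetical stand-in for the old `Extendable` trait: the iterator
/// type is a parameter of the method, not of the trait.
trait Extendable<A> {
    fn extend_from<I: Iterator<Item = A>>(&mut self, iter: I);
}

impl<A> Extendable<A> for Vec<A> {
    fn extend_from<I: Iterator<Item = A>>(&mut self, iter: I) {
        for item in iter {
            self.push(item);
        }
    }
}

/// One bound on `E` suffices even though two different iterator types
/// are passed in.
fn fill<E: Extendable<i32>>(e: &mut E) {
    e.extend_from(0..3); // a Range<i32>
    e.extend_from([10, 20].iter().map(|x| x + 1)); // a Map over a slice iterator
}
```

The caller of `fill` names only the container type. The concrete iterator types stay an internal detail of `fill`.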
2013-08-14Methodyfied the string ascii extionsion functionsMarvin Löbel-1/+1
Added into_owned() method for vectors
Added DoubleEnded Iterator impl to Option
Renamed nil.rs to unit.rs
2013-08-13auto merge of #8446 : alexcrichton/rust/ifmt++, r=graydonbors-41/+12
This includes a number of improvements to `ifmt!`

* Implements formatting arguments -- `{:0.5x}` works now
* Formatting now works on all integer widths, not just `int` and `uint`
* Added a large doc block to `std::fmt` which should help explain what `ifmt!` is all about
* Added floating point formatters, although they have the same pitfalls from before (they're just proof-of-concept now)

Closed a couple of issues along the way, yay!

Once this gets into a snapshot, I'll start looking into removing all of `fmt`
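For a feel of what formatting arguments look like, here is a sketch using today's `format!` macro as the closest current equivalent of `ifmt!`. The exact 2013 specifier syntax may have differed, so the specifiers below are modern ones and the function name is hypothetical.

```rust
/// A few formatting arguments applied to integers of different widths,
/// a float, and a string.
fn formatted() -> Vec<String> {
    vec![
        format!("{:05x}", 255u8),          // zero-padded hex, width 5
        format!("{:x}", 0xdead_beef_u64),  // any integer width works
        format!("{:.3}", 3.14159_f64),     // float with 3 decimals
        format!("{:>6}", "str"),           // right-aligned string
    ]
}
```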
2013-08-12Forbid pub/priv where it has no effectAlex Crichton-6/+6
Closes #5495
2013-08-12Implement formatting arguments for strings and integersAlex Crichton-41/+12
Closes #1651
2013-08-12fix build with the new snapshot compilerDaniel Micay-426/+2
2013-08-11move `strdup_uniq` lang item to std::strDaniel Micay-0/+8
2013-08-11str: optimize `with_capacity`Daniel Micay-3/+21
before: test bench_with_capacity ... bench: 104 ns/iter (+/- 4)
after: test bench_with_capacity ... bench: 56 ns/iter (+/- 1)
2013-08-10std: fix the non-stage0 str::raw::slice_bytes which broke in a mergeErick Tryzelaar-1/+1
2013-08-10std: Transform.find_ -> .findErick Tryzelaar-2/+2
2013-08-10std: Iterator.len_ -> .lenErick Tryzelaar-1/+1
2013-08-10std: Rename Iterator.transform -> .mapErick Tryzelaar-8/+8
cc #5898
2013-08-10std: merge Iterator and IteratorUtilErick Tryzelaar-2/+1
2013-08-10std: merge iterator::DoubleEndedIterator and DoubleEndedIteratorUtilErick Tryzelaar-1/+1
2013-08-09Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nullsErick Tryzelaar-18/+0
2013-08-09Remove redundant Ord method impls.OGINO Masanori-18/+0
Basically, generic containers should not use the default methods since the element type may not guarantee a total order. str could use them since u8's Ord guarantees total order. Floating point numbers are also broken with the default methods because of NaN. Thanks to @thestinger.

Timespec also guarantees total order AIUI. I'm unsure whether extra::semver::Identifier does, so I left it alone. Proof needed.

Signed-off-by: OGINO Masanori <masanori.ogino@gmail.com>
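The NaN problem can be demonstrated in current Rust. This sketch assumes the default methods in question derived one comparison from another, for example `a >= b` as `!(a < b)`. That derivation is an assumption for illustration, and both function names are hypothetical.

```rust
/// True when `a` and `b` are ordered by none of `<`, `==`, `>`.
fn incomparable(a: f64, b: f64) -> bool {
    a.partial_cmp(&b).is_none()
}

/// Deriving `a >= b` as `!(a < b)` is only valid under a total order.
fn ge_via_not_lt(a: f64, b: f64) -> bool {
    !(a < b)
}
```

With NaN on either side, `ge_via_not_lt` answers true while the real `>=` answers false, which is why a type without a total order needs its own implementations of each comparison.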
2013-08-08Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nullsErick Tryzelaar-11/+11