| Age | Commit message | Author | Lines |
|
|
|
Iterators! Use them (in `is_utf16`), create them (in `utf16_items`).
Handle errors gracefully: add `from_utf16_lossy`, and make `from_utf16` return `Option<~str>` instead of failing.
Add a pile of tests.
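The non-failing decode described above can be sketched with today's standard library. This is an illustration only: the commit's API returned `Option<~str>`, while the modern `String::from_utf16` returns a `Result`, and the helper name `decode_strict` is hypothetical.

```rust
/// Hypothetical helper: strict UTF-16 decoding that reports invalid input
/// through the return value instead of panicking.
fn decode_strict(units: &[u16]) -> Option<String> {
    // Modern std returns `Result<String, FromUtf16Error>`; `.ok()` maps it
    // onto the `Option` shape the commit message describes.
    String::from_utf16(units).ok()
}

/// Hypothetical helper used only to build test input:
/// the UTF-16 code units of a string.
fn to_utf16(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}
```

A caller can then match on `None` for input such as a lone surrogate rather than guarding against a panic.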
|
|
Many of the functions interacting with Windows APIs allocate a vector of
0's and do not retrieve a length directly from the API call, and so need
to be sure to remove the unmodified junk at the end of the vector.
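A minimal sketch of that cleanup step, assuming the buffer holds UTF-16 code units and that the first 0 marks the end of the data the API wrote. The function name `trim_at_first_nul` is hypothetical and is not taken from the commit.

```rust
/// Hypothetical helper: returns the prefix of `buf` before the first 0 unit.
/// If no 0 is present, the whole buffer is returned unchanged.
fn trim_at_first_nul(buf: &[u16]) -> &[u16] {
    match buf.iter().position(|&unit| unit == 0) {
        Some(end) => &buf[..end],
        None => buf,
    }
}
```

The trimmed slice can then be passed to a UTF-16 decoder without the zeros left over from the initial allocation.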
|
|
This makes it very slightly faster, especially when the string is valid
UTF-8, and completely removes the use of `unsafe` from the first half.
Before:
from_utf8_lossy_100_ascii ... bench: 151 ns/iter (+/- 17)
from_utf8_lossy_100_invalid ... bench: 447 ns/iter (+/- 33)
from_utf8_lossy_100_multibyte ... bench: 135 ns/iter (+/- 4)
from_utf8_lossy_invalid ... bench: 124 ns/iter (+/- 10)
After:
from_utf8_lossy_100_ascii ... bench: 119 ns/iter (+/- 8)
from_utf8_lossy_100_invalid ... bench: 454 ns/iter (+/- 16)
from_utf8_lossy_100_multibyte ... bench: 116 ns/iter (+/- 9)
from_utf8_lossy_invalid ... bench: 119 ns/iter (+/- 9)
|
|
This uses a vector iterator to avoid the necessity for unsafe indexing,
and makes this function slightly faster. Unfortunately #11751 means that
the iterator comes with repeated `null` checks which means the
pure-ASCII case still has room for significant improvement (and the
other cases too, but it's most significant for just ASCII).
Before:
is_utf8_100_ascii ... bench: 143 ns/iter (+/- 6)
is_utf8_100_multibyte ... bench: 134 ns/iter (+/- 4)
After:
is_utf8_100_ascii ... bench: 123 ns/iter (+/- 4)
is_utf8_100_multibyte ... bench: 115 ns/iter (+/- 5)
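To illustrate the iterator-based approach, here is a sketch of a UTF-8 validity check in current Rust that walks the bytes with an iterator and never indexes the slice. It is not the code from the commit; the byte ranges follow the well-formed byte sequence table in the Unicode Standard.

```rust
/// Sketch: returns true if `bytes` is well-formed UTF-8.
/// Uses only an iterator, so no indexing (checked or unchecked) is needed.
fn is_utf8(bytes: &[u8]) -> bool {
    let mut it = bytes.iter().copied();
    while let Some(first) = it.next() {
        // For each lead byte: the allowed range of the second byte and the
        // number of further plain continuation bytes (0x80..=0xBF).
        let (lo, hi, extra): (u8, u8, usize) = match first {
            0x00..=0x7F => continue, // ASCII, nothing more to read
            0xC2..=0xDF => (0x80, 0xBF, 0),
            0xE0 => (0xA0, 0xBF, 1), // rejects overlong 3-byte forms
            0xE1..=0xEC | 0xEE..=0xEF => (0x80, 0xBF, 1),
            0xED => (0x80, 0x9F, 1), // rejects surrogates
            0xF0 => (0x90, 0xBF, 2), // rejects overlong 4-byte forms
            0xF1..=0xF3 => (0x80, 0xBF, 2),
            0xF4 => (0x80, 0x8F, 2), // rejects values above U+10FFFF
            _ => return false,       // 0x80..=0xC1 and 0xF5..=0xFF
        };
        match it.next() {
            Some(b) if (lo..=hi).contains(&b) => {}
            _ => return false,
        }
        for _ in 0..extra {
            match it.next() {
                Some(0x80..=0xBF) => {}
                _ => return false,
            }
        }
    }
    true
}
```

In practice `std::str::from_utf8(bytes).is_ok()` answers the same question; the sketch only shows the shape of an index-free loop.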
|
|
The rest of the codebase is moving toward avoiding `fail!` so we do it
here too!
|
|
Fixes #12318.
|
|
This replaces the iterator with one that handles lone surrogates
gracefully and uses that to implement `from_utf16_lossy` which replaces
invalid `u16`s with U+FFFD.
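The same idea can be sketched with the modern standard library, where `std::char::decode_utf16` yields an error for each lone surrogate. This is an illustration of the technique rather than the iterator added in the commit, and the function name `utf16_lossy` is hypothetical.

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// Hypothetical helper: decodes UTF-16, substituting U+FFFD for every
/// lone surrogate instead of failing.
fn utf16_lossy(units: &[u16]) -> String {
    decode_utf16(units.iter().copied())
        // Each item is Ok(char) or Err(lone surrogate).
        .map(|item| item.unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}
```

`String::from_utf16_lossy` in current Rust provides the same behavior directly.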
|
|
Fixes #12316.
|
|
Most of the tests are randomly generated with Python 3 and rely on its
UTF-16be encoder/decoder being correct.
|
|
These are adequately covered by the Tuple2 trait.
|
|
|
|
|
|
mut_offset)
|
|
|
|
|
|
Declare a `type SendStr = MaybeOwned<'static>` to ease readability of
types that needed the old SendStr behavior.
Implement all the traits for MaybeOwned that SendStr used to implement.
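For readers on current Rust, the alias can be approximated as follows. This is an assumption-laden sketch: `MaybeOwned` no longer exists in the standard library, so `Cow<'static, str>` stands in for it here, and `task_name` is a hypothetical function added only to show the alias in use.

```rust
use std::borrow::Cow;

// Assumption: `Cow<'static, str>` plays the role of `MaybeOwned<'static>`.
type SendStr = Cow<'static, str>;

/// Hypothetical example: a name that is either a static literal
/// (no allocation) or an owned, formatted string.
fn task_name(id: Option<u32>) -> SendStr {
    match id {
        None => Cow::Borrowed("main"),
        Some(n) => Cow::Owned(format!("worker-{}", n)),
    }
}
```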
|
|
MaybeOwned allows from_utf8_lossy to avoid allocation if there are no
invalid bytes in the input.
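The allocation-avoiding behavior is still observable in current Rust, where `String::from_utf8_lossy` returns a `Cow<str>` (the modern stand-in for `MaybeOwned`, which is an assumption of this sketch). The helper name `lossy_allocated` is hypothetical.

```rust
use std::borrow::Cow;

/// Hypothetical helper: reports whether lossy decoding had to build a new
/// string (`Cow::Owned`) or could borrow the input as-is (`Cow::Borrowed`).
fn lossy_allocated(bytes: &[u8]) -> bool {
    matches!(String::from_utf8_lossy(bytes), Cow::Owned(_))
}
```

Valid input comes back borrowed, so only input containing invalid bytes pays for a new string.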
|
|
from_utf8_lossy() takes a byte vector and produces a ~str, converting
any invalid UTF-8 sequence into the U+FFFD REPLACEMENT CHARACTER.
The replacement follows the guidelines in §5.22 Best Practice for U+FFFD
Substitution from the Unicode Standard (Version 6.2)[1], which also
matches the WHATWG rules for utf-8 decoding[2].
[1]: http://www.unicode.org/versions/Unicode6.2.0/ch05.pdf
[2]: http://encoding.spec.whatwg.org/#utf-8
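A small sketch of what the substitution looks like in practice, using the `String::from_utf8_lossy` of current Rust rather than the 2014 function. The helper name `replacement_count` is hypothetical. A truncated multi-byte sequence becomes a single U+FFFD, while each stray invalid byte gets its own.

```rust
/// Hypothetical helper: counts how many U+FFFD characters lossy decoding
/// produces for `bytes`. Note it also counts any U+FFFD already present
/// in valid input.
fn replacement_count(bytes: &[u8]) -> usize {
    String::from_utf8_lossy(bytes)
        .chars()
        .filter(|&c| c == '\u{FFFD}')
        .count()
}
```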
|
|
Changes in std::{str,vec,hashmap} and extra::{priority_queue,ringbuf}.
Fixes #11949
|
|
Changes in std::{str,vec,hashmap} and extra::{priority_queue,ringbuf}.
Fixes #11949
|
|
|
|
|
|
|
|
|
|
|
|
Also rename `next_power_of_two_opt` to `checked_next_power_of_two`.
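The renamed method still exists on the unsigned integer types in current Rust. A brief sketch of its overflow-aware behavior follows; the wrapper name `capacity_for` is hypothetical.

```rust
/// Hypothetical wrapper: smallest power of two >= `len`,
/// or `None` if that value does not fit in a `usize`.
fn capacity_for(len: usize) -> Option<usize> {
    len.checked_next_power_of_two()
}
```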
|
|
|
|
|
|
These are either returned from public functions, and really should
appear in the documentation, but don't since they're private, or are
implementation details that are currently public.
|
|
Consensus leaned in favour of using rev instead of flip.
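The `rev` name is the one that survives in current Rust. A minimal usage sketch, with the function name `reversed` being hypothetical:

```rust
/// Hypothetical example: collects the elements of `v` back to front
/// using the `rev` adaptor on a double-ended iterator.
fn reversed(v: &[i32]) -> Vec<i32> {
    v.iter().rev().copied().collect()
}
```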
|
|
Renamed the invert() function in iter.rs to flip().
Also renamed the Invert<T> type to Flip<T>.
Some related code comments changed. Documentation that I could find has
been updated, and all the instances I could locate where the
function/type were called have been updated as well.
|
|
|
|
from_utf8_owned() behavior
|
|
behavior
|
|
|
|
|
|
|
|
Rename existing iterators to get rid of the Iterator suffix and to
give them names that better describe the things being iterated over.
|
|
The `print!` and `println!` macros are now the preferred method of printing, and so there is no reason to export the `stdio` functions in the prelude. The functions have also been replaced by their macro counterparts in the tutorial and other documentation so that newcomers don't get confused about what they should be using.
|
|
|
|
Fallout from the previous commits
|
|
Updates as mentioned in #11135
|
|
I could not run the tests because of an unrelated build issue, sorry about that.
|
|
Conflicts:
src/librustc/middle/lint.rs
|
|
|
|
|
|
|
|
This commit makes the short titles of modules provided by libstd uniform,
in order to make their roles more explicit when glancing at the index.
Signed-off-by: Luca Bruno <lucab@debian.org>
|
|
|