about summary refs log tree commit diff
path: root/src/libcoretest/char.rs
AgeCommit message (Collapse)AuthorLines
2017-04-03Move libXtest into libX/testsStjepan Glavina-404/+0
This change moves: 1. `libcoretest` into `libcore/tests` 2. `libcollectionstest` into `libcollections/tests` This is a follow-up to #39561.
2017-01-11Implement Display for char Escape*, To*case.Clar Charr-125/+99
2016-11-18Fix `fmt::Debug` for strings, e.g. for Chinese charactersTobias Bucher-0/+2
The problem occured due to lines like ``` 3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;; 4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;; ``` in `UnicodeData.txt`, which the script previously interpreted as two characters, although it represents the whole range. Fixes #34318.
2016-09-28[breaking-change] std: change `encode_utf{8,16}()` to take a buffer and ↵tormol-9/+12
return a slice They panic if the buffer is too small.
2016-08-29Implement TryFrom<u32> for charSimon Sapin-0/+10
For symmetry with From<char> for u32.
2016-08-29Implement From<char> for u32, and From<u8> for charSimon Sapin-0/+8
These fit with other From implementations between integer types. This helps the coding style of avoiding the 'as' operator that sometimes silently truncates, and signals that these specific conversions are lossless and infaillible.
2016-08-23Yield Err in char::decode_utf8 per Unicode, like String::from_utf8_lossySimon Sapin-1/+20
2016-08-23Use a macro in test_decode_utf8 to preserve line numbers in panic messages.Simon Sapin-24/+26
2016-07-28Rename `char::escape` to `char::escape_debug` and add tracking issueTobias Bucher-2/+2
2016-07-26Restore `char::escape_default` and add `char::escape` insteadTobias Bucher-2/+45
2016-07-23Escape fewer Unicode codepoints in `Debug` impl of `str`Tobias Bucher-3/+13
Use the same procedure as Python to determine whether a character is printable, described in [PEP 3138]. In particular, this means that the following character classes are escaped: - Cc (Other, Control) - Cf (Other, Format) - Cs (Other, Surrogate), even though they can't appear in Rust strings - Co (Other, Private Use) - Cn (Other, Not Assigned) - Zl (Separator, Line) - Zp (Separator, Paragraph) - Zs (Separator, Space), except for the ASCII space `' '` (`0x20`) This allows for user-friendly inspection of strings that are not English (e.g. compare `"\u{e9}\u{e8}\u{ea}"` to `"éèê"`). Fixes #34318. [PEP 3138]: https://www.python.org/dev/peps/pep-3138/
2016-07-13add core::char::DecodeUtf8M Farkas-Dyck-0/+29
2016-05-26Extend the test for `EscapeUnicode`Andrea Canciani-0/+6
to also check that it is legitimately an `ExactSizeIterator`.
2016-05-04Add test for `EscapeUnicode` specializationsAndrea Canciani-0/+33
2016-04-11std: Stabilize APIs for the 1.9 releaseAlex Crichton-1/+6
This commit applies all stabilizations, renamings, and deprecations that the library team has decided on for the upcoming 1.9 release. All tracking issues have gone through a cycle-long "final comment period" and the specific APIs stabilized/deprecated are: Stable * `std::panic` * `std::panic::catch_unwind` (renamed from `recover`) * `std::panic::resume_unwind` (renamed from `propagate`) * `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`) * `std::panic::UnwindSafe` (renamed from `RecoverSafe`) * `str::is_char_boundary` * `<*const T>::as_ref` * `<*mut T>::as_ref` * `<*mut T>::as_mut` * `AsciiExt::make_ascii_uppercase` * `AsciiExt::make_ascii_lowercase` * `char::decode_utf16` * `char::DecodeUtf16` * `char::DecodeUtf16Error` * `char::DecodeUtf16Error::unpaired_surrogate` * `BTreeSet::take` * `BTreeSet::replace` * `BTreeSet::get` * `HashSet::take` * `HashSet::replace` * `HashSet::get` * `OsString::with_capacity` * `OsString::clear` * `OsString::capacity` * `OsString::reserve` * `OsString::reserve_exact` * `OsStr::is_empty` * `OsStr::len` * `std::os::unix::thread` * `RawPthread` * `JoinHandleExt` * `JoinHandleExt::as_pthread_t` * `JoinHandleExt::into_pthread_t` * `HashSet::hasher` * `HashMap::hasher` * `CommandExt::exec` * `File::try_clone` * `SocketAddr::set_ip` * `SocketAddr::set_port` * `SocketAddrV4::set_ip` * `SocketAddrV4::set_port` * `SocketAddrV6::set_ip` * `SocketAddrV6::set_port` * `SocketAddrV6::set_flowinfo` * `SocketAddrV6::set_scope_id` * `<[T]>::copy_from_slice` * `ptr::read_volatile` * `ptr::write_volatile` * The `#[deprecated]` attribute * `OpenOptions::create_new` Deprecated * `std::raw::Slice` - use raw parts of `slice` module instead * `std::raw::Repr` - use raw parts of `slice` module instead * `str::char_range_at` - use slicing plus `chars()` plus `len_utf8` * `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8` * `str::char_at` - use slicing plus `chars()` * `str::char_at_reverse` - use slicing plus `chars().rev()` * `str::slice_shift_char` - use `chars()` plus `Chars::as_str` * `CommandExt::session_leader` - use `before_exec` instead. Closes #27719 cc #27751 (deprecating the `Slice` bits) Closes #27754 Closes #27780 Closes #27809 Closes #27811 Closes #27830 Closes #28050 Closes #29453 Closes #29791 Closes #29935 Closes #30014 Closes #30752 Closes #31262 cc #31398 (still need to deal with `before_exec`) Closes #31405 Closes #31572 Closes #31755 Closes #31756
2016-03-22std: Change `encode_utf{8,16}` to return iteratorsAlex Crichton-6/+8
Currently these have non-traditional APIs which take a buffer and report how much was filled in, but they're not necessarily ergonomic to use. Returning an iterator which *also* exposes an underlying slice shouldn't result in any performance loss as it's just a lazy version of the same implementation, and it's also much more ergonomic! cc #27784
2016-01-16Make style more uniform, add tests for specialization of .last(), move tests ↵Ticki-0/+40
to libcoretest Remove unused import Fold nth() method into the match expr
2015-08-27Auto merge of #27808 - SimonSapin:utf16decoder, r=alexcrichtonbors-0/+9
* Rename `Utf16Items` to `Utf16Decoder`. "Items" is meaningless. * Generalize it to any `u16` iterator, not just `[u16].iter()` * Make it yield `Result` instead of a custom `Utf16Item` enum that was isomorphic to `Result`. This enable using the `FromIterator for Result` impl. * Replace `Utf16Item::to_char_lossy` with a `Utf16Decoder::lossy` iterator adaptor. This is a [breaking change], but only for users of the unstable `rustc_unicode` crate. I’d like this functionality to be stabilized and re-exported in `std` eventually, as the "low-level equivalent" of `String::from_utf16` and `String::from_utf16_lossy` like #27784 is the low-level equivalent of #27714. CC @aturon, @alexcrichton
2015-08-23Refactor low-level UTF-16 decoding.Simon Sapin-0/+9
* Rename `utf16_items` to `decode_utf16`. "Items" is meaningless. * Move it to `rustc_unicode::char`, exposed in `std::char`. * Generalize it to any `u16` iterable, not just `&[u16]`. * Make it yield `Result` instead of a custom `Utf16Item` enum that was isomorphic to `Result`. This enable using the `FromIterator for Result` impl. * Add a `REPLACEMENT_CHARACTER` constant. * Document how `result.unwrap_or(REPLACEMENT_CHARACTER)` replaces `Utf16Item::to_char_lossy`.
2015-08-20Add a test for char::to_lowercase mapping to more than one `char`.Simon Sapin-21/+17
I was wrong about Unicode not having such language-independent mapping.
2015-08-12Remove all unstable deprecated functionalityAlex Crichton-28/+0
This commit removes all unstable and deprecated functions in the standard library. A release was recently cut (1.3) which makes this a good time for some spring cleaning of the deprecated functions.
2015-06-24Remove char::to_titlecase. Fix #26555Simon Sapin-23/+0
I added it because it was easy (same a `char::to_lowercase`, just a different table), but it doesn’t make sense to have this in std but not str::to_titlecase, which would require https://github.com/unicode-rs/unicode-segmentation At some point in the future this feature will be available (both on char and str) in a crates.io crate.
2015-06-09Move collectionstest::char into coretest::charSimon Sapin-0/+28
2015-06-09Fix coretest::char::test_to_uppercase for complex mappingSimon Sapin-18/+17
2015-04-21Model lexer: Fix remaining issuesPiotr Czarnecki-2/+0
2015-04-16deprecate Unicode functions that will be moved to crates.iokwantam-0/+1
This patch 1. renames libunicode to librustc_unicode, 2. deprecates several pieces of libunicode (see below), and 3. removes references to deprecated functions from librustc_driver and libsyntax. This may change pretty-printed output from these modules in cases involving wide or combining characters used in filenames, identifiers, etc. The following functions are marked deprecated: 1. char.width() and str.width(): --> use unicode-width crate 2. str.graphemes() and str.grapheme_indices(): --> use unicode-segmentation crate 3. str.nfd_chars(), str.nfkd_chars(), str.nfc_chars(), str.nfkc_chars(), char.compose(), char.decompose_canonical(), char.decompose_compatible(), char.canonical_combining_class(): --> use unicode-normalization crate
2015-03-10std: Stabilize more of the `char` moduleAlex Crichton-25/+37
This commit performs another pass over the `std::char` module for stabilization. Some minor cleanup is performed such as migrating documentation from libcore to libunicode (where the `std`-facing trait resides) as well as a slight reorganiation in libunicode itself. Otherwise, the stability modifications made are: * `char::from_digit` is now stable * `CharExt::is_digit` is now stable * `CharExt::to_digit` is now stable * `CharExt::to_{lower,upper}case` are now stable after being modified to return an iterator over characters. While the implementation today has not changed this should allow us to implement the full set of case conversions in unicode where some characters can map to multiple when doing an upper or lower case mapping. * `StrExt::to_{lower,upper}case` was added as unstable for a convenience of not having to worry about characters expanding to more characters when you just want the whole string to get into upper or lower case. This is a breaking change due to the change in the signatures of the `CharExt::to_{upper,lower}case` methods. Code can be updated to use functions like `flat_map` or `collect` to handle the difference. [breaking-change]
2015-03-05Remove integer suffixes where the types in compiled code are identical.Eduard Burtescu-2/+2
2015-02-10Deprecating i/u suffixes in libcoretestAlfie John-12/+12
2015-02-05cleanup: replace `as[_mut]_slice()` calls with deref coercionsJorge Aparicio-2/+2
2015-01-12cleanup: `&foo[0..a]` -> `&foo[..a]`Jorge Aparicio-2/+2
2015-01-07use slicing sugarJorge Aparicio-2/+2
2015-01-07Replace full slice notation with index callsNick Cameron-2/+2
2015-01-03Remove deprecated functionalityAlex Crichton-33/+29
This removes a large array of deprecated functionality, regardless of how recently it was deprecated. The purpose of this commit is to clean out the standard libraries and compiler for the upcoming alpha release. Some notable compiler changes were to enable warnings for all now-deprecated command line arguments (previously the deprecated versions were silently accepted) as well as removing deriving(Zero) entirely (the trait was removed). The distribution no longer contains the libtime or libregex_macros crates. Both of these have been deprecated for some time and are available externally.
2015-01-02Fallout - change array syntax to use `;`Nick Cameron-2/+2
2014-12-16std: Change escape_unicode to use new escapesAlex Crichton-22/+19
This changes the `escape_unicode` method on a `char` to use the new style of unicode escapes in the language. Closes #19811 Closes #19879
2014-12-06libcoretest: remove unnecessary `as_slice()` callsJorge Aparicio-19/+19
2014-11-21core: Add Char::len_utf16Brian Anderson-0/+8
Missing method to pair with len_utf8.
2014-11-21unicode: Rename UnicodeChar::is_digit to is_numericBrian Anderson-6/+6
'Numeric' is the proper name of the unicode character class, and this frees up the word 'digit' for ascii use in libcore. Since I'm going to rename `Char::is_digit_radix` to `is_digit`, I am not leaving a deprecated method in place, because that would just cause name clashes, as both `Char` and `UnicodeChar` are in the prelude. [breaking-change]
2014-11-17Fix fallout from coercion removalNick Cameron-8/+8
2014-11-04libsyntax: Forbid escapes in the inclusive range `\x80`-`\xff` inPatrick Walton-4/+4
Unicode characters and strings. Use `\u0080`-`\u00ff` instead. ASCII/byte literals are unaffected. This PR introduces a new function, `escape_default`, into the ASCII module. This was necessary for the pretty printer to continue to function. RFC #326. Closes #18062. [breaking-change]
2014-10-13Clean up rustc warnings.NODA, Kai-2/+2
compiletest: compact "linux" "macos" etc.as "unix". liballoc: remove a superfluous "use". libcollections: remove invocations of deprecated methods in favor of their suggested replacements and use "_" for a loop counter. libcoretest: remove invocations of deprecated methods; also add "allow(deprecated)" for testing a deprecated method itself. libglob: use "cfg_attr". libgraphviz: add a test for one of data constructors. libgreen: remove a superfluous "use". libnum: "allow(type_overflow)" for type cast into u8 in a test code. librustc: names of static variables should be in upper case. libserialize: v[i] instead of get(). libstd/ascii: to_lowercase() instead of to_lower(). libstd/bitflags: modify AnotherSetOfFlags to use i8 as its backend. It will serve better for testing various aspects of bitflags!. libstd/collections: "allow(deprecated)" for testing a deprecated method itself. libstd/io: remove invocations of deprecated methods and superfluous "use". Also add #[test] where it was missing. libstd/num: introduce a helper function to effectively remove invocations of a deprecated method. libstd/path and rand: remove invocations of deprecated methods and superfluous "use". libstd/task and libsync/comm: "allow(deprecated)" for testing a deprecated method itself. libsync/deque: remove superfluous "unsafe". libsync/mutex and once: names of static variables should be in upper case. libterm: introduce a helper function to effectively remove invocations of a deprecated method. We still see a few warnings about using obsoleted native::task::spawn() in the test modules for libsync. I'm not sure how I should replace them with std::task::TaksBuilder and native::task::NativeTaskBuilder (dependency to libstd?) Signed-off-by: NODA, Kai <nodakai@gmail.com>
2014-10-07Use slice syntax instead of slice_to, etc.Nick Cameron-2/+2
2014-10-02Revert "Use slice syntax instead of slice_to, etc."Aaron Turon-2/+2
This reverts commit 40b9f5ded50ac4ce8c9323921ec556ad611af6b7.
2014-10-02Use slice syntax instead of slice_to, etc.Nick Cameron-2/+2
2014-08-16Optimized IR generation for UTF-8 and UTF-16 encodingMarvin Löbel-2/+2
- Both can now be inlined and constant folded away - Both can no longer cause failure - Both now return an `Option` instead Removed debug `assert!()`s over the valid ranges of a `char` - It affected optimizations due to unwinding - Char handling is now sound enought that they became uneccessary
2014-07-21ignore-lexer-test to broken files and remove some tray hyphensCorey Richardson-0/+2
I blame @ChrisMorgan for the hyphens.
2014-07-09auto merge of #15283 : kwantam/rust/master, r=alexcrichtonbors-0/+27
Add libunicode; move unicode functions from core - created new crate, libunicode, below libstd - split `Char` trait into `Char` (libcore) and `UnicodeChar` (libunicode) - Unicode-aware functions now live in libunicode - `is_alphabetic`, `is_XID_start`, `is_XID_continue`, `is_lowercase`, `is_uppercase`, `is_whitespace`, `is_alphanumeric`, `is_control`, `is_digit`, `to_uppercase`, `to_lowercase` - added `width` method in UnicodeChar trait - determines printed width of character in columns, or None if it is a non-NULL control character - takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise) - split `StrSlice` into `StrSlice` (libcore) and `UnicodeStrSlice` (libunicode) - functionality formerly in `StrSlice` that relied upon Unicode functionality from `Char` is now in `UnicodeStrSlice` - `words`, `is_whitespace`, `is_alphanumeric`, `trim`, `trim_left`, `trim_right` - also moved `Words` type alias into libunicode because `words` method is in `UnicodeStrSlice` - unified Unicode tables from libcollections, libcore, and libregex into libunicode - updated `unicode.py` in `src/etc` to generate aforementioned tables - generated new tables based on latest Unicode data - added `UnicodeChar` and `UnicodeStrSlice` traits to prelude - libunicode is now the collection point for the `std::char` module, combining the libunicode functionality with the `Char` functionality from libcore - thus, moved doc comment for `char` from `core::char` to `unicode::char` - libcollections remains the collection point for `std::str` The Unicode-aware functions that previously lived in the `Char` and `StrSlice` traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and `use` the `UnicodeChar` and/or `UnicodeStrSlice` traits: extern crate unicode; use unicode::UnicodeChar; use unicode::UnicodeStrSlice; use unicode::Words; // if you want to use the words() method NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude. closes #15224 [breaking-change]
2014-07-08std: Rename the `ToStr` trait to `ToString`, and `to_str` to `to_string`.Richo Healey-6/+0
[breaking-change]
2014-07-07Add libunicode; move unicode functions from corekwantam-0/+27
- created new crate, libunicode, below libstd - split Char trait into Char (libcore) and UnicodeChar (libunicode) - Unicode-aware functions now live in libunicode - is_alphabetic, is_XID_start, is_XID_continue, is_lowercase, is_uppercase, is_whitespace, is_alphanumeric, is_control, is_digit, to_uppercase, to_lowercase - added width method in UnicodeChar trait - determines printed width of character in columns, or None if it is a non-NULL control character - takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise) - split StrSlice into StrSlice (libcore) and UnicodeStrSlice (libunicode) - functionality formerly in StrSlice that relied upon Unicode functionality from Char is now in UnicodeStrSlice - words, is_whitespace, is_alphanumeric, trim, trim_left, trim_right - also moved Words type alias into libunicode because words method is in UnicodeStrSlice - unified Unicode tables from libcollections, libcore, and libregex into libunicode - updated unicode.py in src/etc to generate aforementioned tables - generated new tables based on latest Unicode data - added UnicodeChar and UnicodeStrSlice traits to prelude - libunicode is now the collection point for the std::char module, combining the libunicode functionality with the Char functionality from libcore - thus, moved doc comment for char from core::char to unicode::char - libcollections remains the collection point for std::str The Unicode-aware functions that previously lived in the Char and StrSlice traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and use the UnicodeChar and/or UnicodeStrSlice traits: extern crate unicode; use unicode::UnicodeChar; use unicode::UnicodeStrSlice; use unicode::Words; // if you want to use the words() method NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude. closes #15224 [breaking-change]