path: root/src/libstd_unicode
Age  Commit message  Author  Lines
2017-09-18  [libstd_unicode] Expose UnicodeVersion type  Behnam Esfahbod  -1/+1
In <https://github.com/rust-lang/rust/pull/42998>, we added an uninstantiable type for the internal `UNICODE_VERSION` value, `UnicodeVersion`, but it was not made public to the outside of the crate, resulting in the value becoming less useful. Here we make the type accessible from the outside. Also add a run-pass test to make sure the type and value can be accessed as intended.
2017-09-03  impl Debug for SplitWhitespace.  Clar Charr  -1/+1
2017-08-25  *: remove crate_{name,type} attributes  Tamir Duberstein  -2/+0
Fixes #41701.
2017-08-23  Auto merge of #43830 - alexcrichton:path-display-regression, r=aturon  bors  -3/+16
std: Respect formatting flags for str-like OsStr
Historically many `Display` and `Debug` implementations for `OsStr`-like abstractions have gone through `String::from_utf8_lossy`, but this was updated in #42613 to use an internal `Utf8Lossy` abstraction instead. This had the unfortunate side effect of causing a regression (#43765) in code which relied on these `fmt` trait implementations respecting the various formatting flags specified.
This commit opportunistically adds back interpretation of formatting trait flags in the "common case" where the `OsStr`-like "thing" is all valid UTF-8 and can delegate to the formatting implementation for `str`. This doesn't entirely solve the regression, as non-UTF-8 paths will still format differently than they did before (in that they will not respect formatting flags), but this should solve the regression for all "real world" use cases of paths and such. The door's also still open for handling these flags in the future!
Closes #43765
2017-08-15  use field init shorthand EVERYWHERE  Zack M. Davis  -1/+1
Like #43008 (f668999), but _much more aggressive_.
2017-08-13  std: Respect formatting flags for str-like OsStr  Alex Crichton  -3/+16
Historically many `Display` and `Debug` implementations for `OsStr`-like abstractions have gone through `String::from_utf8_lossy`, but this was updated in #42613 to use an internal `Utf8Lossy` abstraction instead. This had the unfortunate side effect of causing a regression (#43765) in code which relied on these `fmt` trait implementations respecting the various formatting flags specified.
This commit opportunistically adds back interpretation of formatting trait flags in the "common case" where the `OsStr`-like "thing" is all valid UTF-8 and can delegate to the formatting implementation for `str`. This doesn't entirely solve the regression, as non-UTF-8 paths will still format differently than they did before (in that they will not respect formatting flags), but this should solve the regression for all "real world" use cases of paths and such. The door's also still open for handling these flags in the future!
Closes #43765
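The delegation idea described in this commit can be sketched roughly as follows. `LossyBytes` is a hypothetical stand-in invented for illustration; it is not the `Utf8Lossy` type from the commit, and the real implementation may differ in detail.

```rust
use std::fmt;

/// Hypothetical stand-in for an `OsStr`-like value: raw bytes that are
/// usually, but not always, valid UTF-8.
struct LossyBytes<'a>(&'a [u8]);

impl<'a> fmt::Display for LossyBytes<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match std::str::from_utf8(self.0) {
            // Common case: delegate to `str`, whose `Display` impl honors
            // width, fill, alignment and precision.
            Ok(s) => fmt::Display::fmt(s, f),
            // Fallback: write a lossy rendering directly; flags are ignored.
            Err(_) => f.write_str(&String::from_utf8_lossy(self.0)),
        }
    }
}
```

With this sketch, `format!("{:>6}", LossyBytes(b"abc"))` is padded like a `str`, while input containing invalid bytes is written unpadded with U+FFFD replacement characters.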
2017-08-08  Auto merge of #42998 - behnam:uni-ver-type, r=sfackler  bors  -6/+52
[libstd_unicode] Change UNICODE_VERSION to use u32
Looks like there's no strong reason to keep these values at `u64`. With the current plans for the Unicode Standard, `u8` should be enough for the next 200 years. To stay on the safe side, I'm using `u16` here. I don't see a reason to go with anything machine-dependent/more-efficient.
2017-07-25  std: Stabilize `char_escape_debug`  Alex Crichton  -5/+1
Stabilizes:
* `<char>::escape_debug`
* `std::char::EscapeDebug`
Closes #35068
2017-07-21  [libstd_unicode] Create UnicodeVersion type  Behnam Esfahbod  -6/+52
Create named struct `UnicodeVersion` to use instead of a tuple type for the `UNICODE_VERSION` value. This allows users to access the fields with meaningful field names: `major`, `minor`, and `micro`. Per request, an empty private field is added to the struct, so it can be extended in the future without API breakage.
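A minimal sketch of the shape this message describes, assuming `u32` fields as in the neighboring commit. The module name `version`, the derives, the `_priv` field name, and the 10.0.0 value are illustrative assumptions, not the actual std_unicode source.

```rust
pub mod version {
    /// Hypothetical re-creation of the struct described in the commit message.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct UnicodeVersion {
        pub major: u32,
        pub minor: u32,
        pub micro: u32,
        // Empty private field: code outside this module cannot build the
        // struct with a literal, so fields can be added later without
        // breaking downstream users.
        _priv: (),
    }

    /// Illustrative value only.
    pub const UNICODE_VERSION: UnicodeVersion = UnicodeVersion {
        major: 10,
        minor: 0,
        micro: 0,
        _priv: (),
    };
}
```

Callers read `version::UNICODE_VERSION.major` and friends by name instead of indexing a tuple; outside the module, a struct literal does not compile because `_priv` is private.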
2017-07-21  [libstd_unicode] Change UNICODE_VERSION to use u32  Behnam Esfahbod  -2/+2
Use `u32` for version components, as `u64` is overkill, and `u32` is the default type for integers and the default type used for regular internal numbers. There's no expectation for Unicode Versions to even reach one thousand in the next hundred years. This is different from *package versions*, which may become something auto-generated and exceed the human-friendly range of integer values.
2017-07-10  Correct some stability attributes  Oliver Middleton  -1/+1
These show up in rustdoc so need to be correct.
2017-06-30  [libstd_unicode] Upgrade to Unicode 10.0.0  Behnam Esfahbod  -155/+168
2017-06-20  Rollup merge of #42271 - tinaun:charfromstr, r=alexcrichton  Corey Farwell  -0/+2
add `FromStr` Impl for `char`
fixes #24939.
is it possible to use pub(restricted) instead of using a stability attribute for the internal error representation? is it needed at all?
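As a small illustration of what the `FromStr` impl enables, `str::parse::<char>` succeeds only when the string holds exactly one `char`. The helper name `parse_single_char` is made up for this sketch.

```rust
/// Returns the `char` if `s` contains exactly one, otherwise `None`.
/// (Hypothetical helper; it only wraps the standard `FromStr` impl for `char`.)
fn parse_single_char(s: &str) -> Option<char> {
    s.parse::<char>().ok()
}
```

Empty strings and strings with more than one `char` produce an error from `parse`, which this helper maps to `None`.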
2017-06-20  added `FromStr` Impl for `char`  tinaun  -0/+2
2017-06-16  Rollup merge of #42705 - est31:master, r=alexcrichton  Corey Farwell  -4/+4
Introduce tidy lint to check for inconsistent tracking issues
This PR
* Refactors the collect_lib_features function to work in a non-checking mode (no bad pointer needed, and list of lang features).
* Introduces checking whether unstable/stable tags for a given feature have inconsistent tracking issues, as in, multiple tracking issues per feature.
* Fixes such inconsistencies throughout the codebase.
2017-06-16  Introduce tidy lint to check for inconsistent tracking issues  est31  -4/+4
This commit
* Refactors the collect_lib_features function to work in a non-checking mode (no bad pointer needed, and list of lang features).
* Introduces checking whether unstable/stable tags for a given feature have inconsistent tracking issues.
* Fixes such inconsistencies throughout the codebase.
2017-06-15  Utf8Lossy type with chunks iterator and impl Display and Debug  Stepan Koltsov  -0/+311
2017-06-13  Merge crate `collections` into `alloc`  Murarth  -2/+2
2017-05-10  Auto merge of #41659 - bluss:clone-split-whitespace, r=aturon  bors  -12/+44
impl Clone for .split_whitespace()
Use custom closure structs for the predicates so that the iterator's clone can simply be derived. This should also reduce virtual call overhead by not using function pointers.
Fixes #41655
2017-05-04  Move unicode Python script into libstd_unicode crate.  Corey Farwell  -1/+592
The only place this Python script is used is inside the libstd_unicode crate, so let's move it there.
2017-04-30  std_unicode: Use #[inline] on the split_whitespace predicates  Ulrik Sverdrup  -0/+4
2017-04-30  std_unicode: impl Clone for .split_whitespace()  Ulrik Sverdrup  -12/+40
Use custom closure structs for the predicates so that the iterator's clone can simply be derived. This should also reduce virtual call overhead by not using function pointers.
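The technique can be sketched on stable Rust with a small predicate trait standing in for the closure traits. `Predicate`, `IsNotEmpty`, `FilterWith`, and `split_ws` are hypothetical names for this illustration; the actual commit implements the closure traits on its own structs inside the standard library instead.

```rust
/// Hypothetical predicate trait, used here because implementing the
/// `Fn*` traits by hand is not possible on stable Rust.
trait Predicate<T> {
    fn test(&self, item: &T) -> bool;
}

/// A nameable, zero-sized predicate: being an ordinary struct, it can derive `Clone`.
#[derive(Clone)]
struct IsNotEmpty;

impl<'a> Predicate<&'a str> for IsNotEmpty {
    fn test(&self, item: &&'a str) -> bool {
        !item.is_empty()
    }
}

/// Filtering adapter whose `Clone` is derived, since both fields are `Clone`.
#[derive(Clone)]
struct FilterWith<I, P> {
    iter: I,
    pred: P,
}

impl<I, P> Iterator for FilterWith<I, P>
where
    I: Iterator,
    P: Predicate<I::Item>,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        while let Some(item) = self.iter.next() {
            if self.pred.test(&item) {
                return Some(item);
            }
        }
        None
    }
}

/// Splits on whitespace and drops the empty pieces; the result is cloneable.
fn split_ws<'a>(s: &'a str) -> impl Iterator<Item = &'a str> + Clone {
    FilterWith {
        iter: s.split(char::is_whitespace),
        pred: IsNotEmpty,
    }
}
```

Because the predicate is a concrete type rather than a function pointer, calls to it are statically dispatched and can be inlined.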
2017-03-30  Remove parentheses in method references  Donnie Bishop  -2/+2
2017-03-30  Revert SplitWhitespace's description  Donnie Bishop  -1/+2
Original headline of SplitWhitespace's description is more descriptive as to what it contains and iterates over.
2017-03-30  Modify SplitWhitespace's description  Donnie Bishop  -2/+7
2017-03-25  char::to_uppercase doc typo: use the 'an' article.  Colin Wallace  -1/+1
2017-03-25  char::to_uppercase doc typo: s/lowercase/uppercase/  Colin Wallace  -1/+1
2017-03-17  Rollup merge of #40499 - ericfindlay:master, r=steveklabnik  Corey Farwell  -2/+4
Corrected very minor documentation detail about Unicode and Japanese
Japanese half-width and full-width romaji characters do have upper and lowercase according to Unicode (but other Japanese characters do not). For example, `assert_eq!('\u{FF21}'.to_lowercase().collect::<String>(), "\u{FF41}");`
r? @steveklabnik
2017-03-15  Amended minor documentation detail about Unicode cases.  Eric Findlay  -2/+4
2017-03-13  Remove function invocation parens from documentation links.  Corey Farwell  -7/+7
This was never established as a convention we should follow in the 'More API Documentation Conventions' RFC: https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md
2017-03-14  Corrected very minor documentation detail about Unicode and Japanese  Eric Findlay  -2/+2
2017-03-02  Remove std_unicode::str::is_utf16  Simon Sapin  -23/+0
It was only accessible through the `#[unstable]` crate std_unicode. It has never been used in the compiler or standard library since 47e7a05a28c9662159af2d2e0f2b7efc13fa09cb added it in 2012 “for OS API interop”. It can be replaced with a one-liner:
```rust
fn is_utf16(slice: &[u16]) -> bool {
    // Valid UTF-16 means every decoded unit is `Ok` (no unpaired surrogates).
    std::char::decode_utf16(slice.iter().cloned()).all(|r| r.is_ok())
}
```
2017-03-01  Only keep one copy of the UTF8_CHAR_WIDTH table.  Simon Sapin  -27/+1
… instead of one in each of libcore and libstd_unicode. Move the `utf8_char_width` function to `core::str` under the `str_internals` unstable feature.
2017-01-29  Fix a few impl stability attributes  Oliver Middleton  -3/+2
The versions show up in rustdoc.
2017-01-11  Implement Display for char Escape*, To*case.  Clar Charr  -49/+150
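With `Display` implemented on these iterator types, they can be turned into a `String` via `to_string` or used directly in `format!`. The helper names `shout` and `escaped` below are made up for this sketch.

```rust
/// Uppercases one `char`; the result may be more than one `char` long.
fn shout(c: char) -> String {
    // `ToUppercase` implements `Display`, so no `collect` is needed.
    c.to_uppercase().to_string()
}

/// Renders a `char` using its default escape sequence.
fn escaped(c: char) -> String {
    // `EscapeDefault` implements `Display` as well.
    c.escape_default().to_string()
}
```

For example, `shout('ß')` yields `"SS"`, and `escaped('\n')` yields a backslash followed by `n`.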
2017-01-08  Auto merge of #38679 - alexcrichton:always-deny-warnings, r=nrc  bors  -1/+1
Remove not(stage0) from deny(warnings)
Historically this was done to accommodate bugs in lints, but there hasn't been a bug in a lint since this feature was added which the warnings affected. Let's completely purge warnings from all our stages by denying warnings in all stages. This will also assist in tracking down `stage0` code to be removed whenever we're updating the bootstrap compiler.
2017-01-03  Reduce the size of static data in std_unicode::tables.  Simon Sapin  -208/+45
`BoolTrie` works well for sets of code points spread out through most of Unicode’s range, but it uses a lot of space for sets with few, mostly low, code points. This switches a few of its instances to a similar but simpler trie data structure.
## Before
`size_of::<BoolTrie>()` is 1552, which is added to `t.r3.len() * 8 + t.r5.len() + t.r6.len() * 8`:
* `Cc_table`: 1632
* `White_Space_table`: 1656
* `Pattern_White_Space_table`: 1640
* Total: 4928 bytes
## After
`size_of::<SmallBoolTrie>()` is 32, which is added to `t.r1.len() + t.r2.len() * 8`:
* `Cc_table`: 51
* `White_Space_table`: 273
* `Pattern_White_Space_table`: 193
* Total: 517 bytes
## Difference
Every Rust program with `std` statically linked should be about 4 KB smaller.
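A rough sketch of a two-level layout consistent with the `t.r1.len() + t.r2.len() * 8` size formula: one byte per 64-code-point chunk, pointing into a list of 64-bit leaves. The field meanings, the `lookup` logic, and the `ASCII_WS` table are assumptions made for illustration, not the generated tables from the commit.

```rust
/// Hypothetical small trie: two slices, matching the 32-byte struct size
/// quoted in the commit message on a 64-bit target.
struct SmallBoolTrie {
    /// First level: for each chunk of 64 code points, the index of its leaf.
    r1: &'static [u8],
    /// Leaves: one bit per code point within a chunk.
    r2: &'static [u64],
}

impl SmallBoolTrie {
    fn lookup(&self, c: char) -> bool {
        let c = c as u32;
        match self.r1.get((c >> 6) as usize) {
            // Select the leaf, then test the bit for this code point.
            Some(&leaf) => ((self.r2[leaf as usize] >> (c & 63)) & 1) != 0,
            // Code points beyond the first-level table are not in the set.
            None => false,
        }
    }
}

/// Illustrative table: ASCII whitespace only (U+0009..=U+000D and U+0020).
static ASCII_WS: SmallBoolTrie = SmallBoolTrie {
    r1: &[0],
    r2: &[(0b11111 << 9) | (1 << 32)],
};
```

Sets confined to low code points need only a short `r1`, which is where the space saving over a trie sized for the whole Unicode range comes from; the illustrative table above carries just 1 + 8 bytes of data.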
2016-12-29  Remove not(stage0) from deny(warnings)  Alex Crichton  -1/+1
Historically this was done to accommodate bugs in lints, but there hasn't been a bug in a lint since this feature was added which the warnings affected. Let's completely purge warnings from all our stages by denying warnings in all stages. This will also assist in tracking down `stage0` code to be removed whenever we're updating the bootstrap compiler.
2016-12-16  Address fallout  Aaron Turon  -3/+0
2016-12-15  Stabilize std::char::{encode_utf8, encode_utf16}  Aaron Turon  -10/+2
2016-11-30  Rename 'librustc_unicode' crate to 'libstd_unicode'.  Corey Farwell  -0/+3842
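For reference, a short sketch of the two stabilized methods, which encode a `char` into a caller-provided buffer. The helper names `utf8_len` and `utf16_units` are invented for this example.

```rust
/// Number of bytes `c` occupies in UTF-8.
fn utf8_len(c: char) -> usize {
    let mut buf = [0u8; 4]; // 4 bytes is enough for any `char`
    c.encode_utf8(&mut buf).len()
}

/// The UTF-16 code units for `c` (one unit, or a surrogate pair).
fn utf16_units(c: char) -> Vec<u16> {
    let mut buf = [0u16; 2]; // 2 units is enough for any `char`
    c.encode_utf16(&mut buf).to_vec()
}
```

Both methods return a slice borrowed from the buffer, so no allocation is needed unless the caller copies the result, as `utf16_units` does here.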
Fixes #26554.