path: root/src/etc/unicode.py
Age | Commit message | Author | Lines
2014-08-30 | Unify non-snake-case lints and non-uppercase statics lints | P1start | -1/+1
This unifies the `non_snake_case_functions` and `uppercase_variables` lints into one lint, `non_snake_case`. It also now checks for non-snake-case modules. This also extends the non-camel-case types lint to check type parameters, and merges the `non_uppercase_pattern_statics` lint into the `non_uppercase_statics` lint.

Because the `uppercase_variables` lint is now part of the `non_snake_case` lint, all non-snake-case variables that start with lowercase characters (such as `fooBar`) will now trigger the `non_snake_case` lint.

New code should be updated to use the new `non_snake_case` lint instead of the previous `non_snake_case_functions` and `uppercase_variables` lints. All uses of the `non_uppercase_pattern_statics` lint should be replaced with the `non_uppercase_statics` lint. Any code that previously contained non-snake-case module or variable names should be updated to use snake case names or disable the `non_snake_case` lint. Any code with non-camel-case type parameters should be changed to use camel case or disable the `non_camel_case_types` lint.

[breaking-change]
2014-08-13 | core: Add binary_search and binary_search_elem methods to slices. | Brian Anderson | -21/+25
These are like the existing bsearch methods, but if the search fails, they return the next insertion point. The new `binary_search` returns a `BinarySearchResult` that is either `Found` or `NotFound`. For convenience, the `found` and `not_found` methods convert to `Option`, à la `Result`. Deprecate bsearch and bsearch_elem.
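The behaviour described above can be sketched in Python, the language of `unicode.py` itself. This is only an illustration of the idea, not the Rust implementation: `SearchResult`, its fields, and this `binary_search_elem` function are hypothetical names chosen for the example, and the sketch assumes a sorted input.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical stand-in for a Found / NotFound result."""
    matched: bool
    position: int  # index of the match, or the insertion point on failure

    def found(self) -> Optional[int]:
        # Index of the match, or None if the search failed.
        return self.position if self.matched else None

    def not_found(self) -> Optional[int]:
        # Insertion point if the search failed, or None if it succeeded.
        return None if self.matched else self.position


def binary_search_elem(items: Sequence[int], target: int) -> SearchResult:
    """Search a sorted sequence; on failure report where target would go."""
    i = bisect_left(items, target)
    matched = i < len(items) and items[i] == target
    return SearchResult(matched, i)
```

A failed search still carries useful information: inserting the target at the reported position keeps the sequence sorted.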
2014-07-28 | collections, unicode: Add support for NFC and NFKC | Florian Zeitz | -2/+33
2014-07-14 | add Graphemes iterator; tidy unicode exports | kwantam | -5/+124
- Graphemes and GraphemeIndices structs implement iterators over grapheme clusters analogous to the Chars and CharOffsets for chars in a string. Iterator and DoubleEndedIterator are available for both.
- tidied up the exports for libunicode. crate root exports are now moved into more appropriate module locations:
  - UnicodeStrSlice, Words, Graphemes, GraphemeIndices are in str module
  - UnicodeChar exported from char instead of crate root
  - canonical_combining_class is exported from str rather than crate root

Since libunicode's exports have changed, programs that previously relied on the old export locations will need to change their `use` statements to reflect the new ones. See above for more information on where the new exports live.

closes #7043

[breaking-change]
2014-07-07 | Add libunicode; move unicode functions from core | kwantam | -285/+351
- created new crate, libunicode, below libstd
- split Char trait into Char (libcore) and UnicodeChar (libunicode)
  - Unicode-aware functions now live in libunicode
    - is_alphabetic, is_XID_start, is_XID_continue, is_lowercase, is_uppercase, is_whitespace, is_alphanumeric, is_control, is_digit, to_uppercase, to_lowercase
  - added width method in UnicodeChar trait
    - determines printed width of character in columns, or None if it is a non-NULL control character
    - takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise)
- split StrSlice into StrSlice (libcore) and UnicodeStrSlice (libunicode)
  - functionality formerly in StrSlice that relied upon Unicode functionality from Char is now in UnicodeStrSlice
    - words, is_whitespace, is_alphanumeric, trim, trim_left, trim_right
  - also moved Words type alias into libunicode because words method is in UnicodeStrSlice
- unified Unicode tables from libcollections, libcore, and libregex into libunicode
- updated unicode.py in src/etc to generate aforementioned tables
- generated new tables based on latest Unicode data
- added UnicodeChar and UnicodeStrSlice traits to prelude
- libunicode is now the collection point for the std::char module, combining the libunicode functionality with the Char functionality from libcore
  - thus, moved doc comment for char from core::char to unicode::char
- libcollections remains the collection point for std::str

The Unicode-aware functions that previously lived in the Char and StrSlice traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and use the UnicodeChar and/or UnicodeStrSlice traits:

    extern crate unicode;
    use unicode::UnicodeChar;
    use unicode::UnicodeStrSlice;
    use unicode::Words; // if you want to use the words() method

NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude.

closes #15224

[breaking-change]
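The width rule described in this commit message can be approximated in Python using the standard `unicodedata` module. This is a rough sketch under stated assumptions, not the libunicode implementation: `char_width` and `is_cjk` are hypothetical names, the zero width for NUL is an assumption (the message only says NUL is not treated like other control characters), and zero-width combining marks are deliberately ignored.

```python
import unicodedata
from typing import Optional


def char_width(ch: str, is_cjk: bool) -> Optional[int]:
    """Approximate printed width of one character, in terminal columns."""
    if ch == "\x00":
        # Assumption: NUL occupies no columns rather than yielding None.
        return 0
    if unicodedata.category(ch) == "Cc":
        # Non-NULL control characters have no defined width.
        return None
    east_asian = unicodedata.east_asian_width(ch)
    if east_asian in ("W", "F"):
        # Wide and Fullwidth characters always take two columns.
        return 2
    if east_asian == "A":
        # Ambiguous widths: double-wide in CJK contexts, single-wide otherwise.
        return 2 if is_cjk else 1
    return 1
```

Passing the context as an explicit boolean mirrors the design described above: the same character can legitimately have two widths, so the caller has to say which convention applies.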
2014-05-13 | std: Rename str::Normalizations to str::Decompositions | Florian Zeitz | -6/+6
The Normalizations iterator has been renamed to Decompositions. It does not currently include all forms of Unicode normalization, but only encompasses decompositions. If implemented, recomposition would likely be a separate iterator which works on the result of this one. [breaking-change]
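The distinction between decomposition and recomposition can be illustrated with Python's standard `unicodedata.normalize`. This is an analogy only: the helper names `decompose` and `recompose` are hypothetical, and the Rust iterators discussed above are not shown.

```python
import unicodedata


def decompose(text: str, compat: bool = False) -> str:
    """Canonical (NFD) or compatibility (NFKD) decomposition."""
    return unicodedata.normalize("NFKD" if compat else "NFD", text)


def recompose(decomposed: str) -> str:
    """Canonical composition (NFC), applied here to already-decomposed text."""
    return unicodedata.normalize("NFC", decomposed)


# U+00E9 (e with acute) decomposes to "e" followed by U+0301 (combining acute).
parts = decompose("\u00e9")
# Recomposition is a second pass over the decomposed form.
whole = recompose(parts)
```

Running recomposition as a separate step over decomposed input matches the layering suggested in the commit message, where a recomposing iterator would consume the output of the decomposing one.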
2014-05-13 | core: Move Hangul decomposition into unicode.rs | Florian Zeitz | -19/+58
2014-05-13 | std, core: Generate unicode.rs using unicode.py | Florian Zeitz | -55/+76
2014-04-14 | Use new attribute syntax in python files in src/etc too (#13478) | Manish Goregaokar | -2/+2
2014-03-20 | rename std::vec -> std::slice | Daniel Micay | -3/+3
Closes #12702
2014-03-13 | Remove code duplication | Piotr Zolnierek | -27/+19
Remove whitespace. Update documentation for to_uppercase, to_lowercase.
2014-03-13 | Implement lower, upper case conversion for char | Piotr Zolnierek | -29/+74
2014-03-13 | std::unicode: remove unused category tables | Piotr Zolnierek | -1/+4
2014-02-05 | etc: add missing license boilerplates | Adrien Tétar | -1/+10
2013-11-27 | Fix handling of upper/lowercase, and whitespace | Florian Zeitz | -10/+12
2013-11-27 | Update unicode.py to reflect language changes | Florian Zeitz | -5/+5
2013-09-09 | rename `std::iterator` to `std::iter` | Daniel Micay | -1/+1
The trait will keep the `Iterator` naming, but a more concise module name makes using the free functions less verbose. The module will define iterables in addition to iterators, as it deals with iteration in general.
2013-09-04 | stop treating char as an integer type | Daniel Micay | -0/+1
Closes #7609
2013-08-21 | Add canonical combining class to std::unicode | Florian Zeitz | -4/+53
2013-08-21 | Add Unicode decomposition mappings to std::unicode | Florian Zeitz | -31/+99
2013-07-01 | rustc: add a lint to enforce uppercase statics. | Huon Wilson | -0/+1
2013-06-30 | Convert vec::{bsearch, bsearch_elem} to methods. | Huon Wilson | -2/+2
2013-06-30 | etc: update etc/unicode.py for the changes made to std::unicode. | Huon Wilson | -10/+24
2013-05-02 | Explain that the source code was generated by this script | kud1ing | -0/+4
2013-04-18 | core: replace unicode match exprs with bsearch in const arrays, minor perf win. | Graydon Hoare | -1/+44
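The general technique named in the entry above, a binary search over a constant, sorted table of code point ranges, might look roughly like this in Python. The table `ASCII_ALPHA_RANGES` and the function `in_ranges` are hypothetical and hand-written for illustration; the real tables are generated from Unicode data and are far larger.

```python
from typing import Sequence, Tuple

# Hypothetical table: sorted, non-overlapping, inclusive (start, end) ranges.
ASCII_ALPHA_RANGES: Tuple[Tuple[int, int], ...] = (
    (0x41, 0x5A),  # A-Z
    (0x61, 0x7A),  # a-z
)


def in_ranges(code_point: int, ranges: Sequence[Tuple[int, int]]) -> bool:
    """Binary-search the range table for a range containing code_point."""
    lo, hi = 0, len(ranges)
    while lo < hi:
        mid = (lo + hi) // 2
        start, end = ranges[mid]
        if code_point < start:
            hi = mid          # the match, if any, lies in the lower half
        elif code_point > end:
            lo = mid + 1      # the match, if any, lies in the upper half
        else:
            return True       # start <= code_point <= end
    return False
```

Compared with one long chain of match arms, a table lookup like this takes a number of comparisons logarithmic in the number of ranges, which is consistent with the performance motivation stated in the commit title.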
2013-01-17 | Add a license check to tidy. #4018 | Brian Anderson | -0/+1
2011-12-29 | Of course there were overlong lines. | Graydon Hoare | -4/+6
2011-12-29 | Teach unicode script to emit canonical and compat decomp mappings. Annoyingly large encoding. | Graydon Hoare | -46/+71
2011-12-23 | Add support to libcore for encoded-in-rust unicode character properties, at least. Add script to compute them from unicode.org. | Graydon Hoare | -0/+172