rust - https://github.com/rust-lang/rust

Age	Commit message	Author	Lines
2019-09-06	it's more pythonic to use 'is not None' in python files	Guanqun Lu	-1/+1
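A minimal sketch of why `is not None` is the preferred spelling, assuming the alternative is an equality comparison against `None` (the log does not show the replaced expression). The class `AlwaysEqual` is hypothetical and exists only to show that `!=` can be overridden while `is not` cannot.

```python
class AlwaysEqual:
    """Hypothetical class that claims to be equal to everything."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


def has_value_equality(obj):
    # Equality test: delegates to obj.__ne__, so a class can change the answer.
    return obj != None  # noqa: E711 (deliberately the discouraged form)


def has_value_identity(obj):
    # Identity test: compares against the None singleton, cannot be overridden.
    return obj is not None


odd = AlwaysEqual()
print(has_value_equality(odd))   # the overridden __ne__ decides the result
print(has_value_identity(odd))   # identity is unaffected by __eq__/__ne__
```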

2019-09-04	remove XID and Pattern_White_Space unicode tables from libcore	Aleksey Kladov	-385/+4
	They are only used by rustc_lexer, and are not needed elsewhere. So we move the relevant definitions into rustc_lexer (while the actual unicode data comes from the unicode-xid crate) and make the rest of the compiler use it.
2019-08-05	Make some items in core::unicode private	Matthew Jasper	-20/+20
	They were reachable through opaque macros defined in `core`
2019-07-26	Rollup merge of #62084 - euclio:unicode-table-tweak, r=kennytm	Mazdak Farrokhzad	-2/+2
	allow clippy::unreadable_literal in unicode tables. Also modifies the generation script to emit 2018 edition paths.
2019-07-12	allow clippy::unreadable_literal in unicode tables	Andy Russell	-4/+4
	Also modifies the generation script to emit 2018 edition paths.
2019-07-12	Regenerate character tables for Unicode 12.1	Josh Stone	-734/+765

2019-07-12	Update unicode scripts for the current coding style	Josh Stone	-5/+5

2019-07-06	Rollup merge of #60081 - pawroman:cleanup_unicode_script, r=varkor	Mazdak Farrokhzad	-352/+740
	Refactor unicode.py script
	Hi, I noticed that the `unicode.py` script used some deprecated escapes in regular expressions. E.g. `\d`, `\w`, `\.` will be illegal in the future without "raw strings". This is now fixed. I have also cleaned up the script quite a bit.
	## Escape deprecation
	OK (note the `r`): `re.compile(r"\d")`
	Deprecated (from Python 3.6 onwards, see [here][link1] and [here][link2]): `re.compile("\d")`.
	[link1]: https://docs.python.org/3.6/whatsnew/3.6.html#deprecated-python-behavior
	[link2]: https://bugs.python.org/issue27364
	This was evident running the script using Python 3.7 like so:
	```
	$ python3 -Wall unicode.py
	unicode.py:227: DeprecationWarning: invalid escape sequence \w
	  re1 = re.compile("^ ([0-9A-F]+) ; (\w+)")
	unicode.py:228: DeprecationWarning: invalid escape sequence \.
	  re2 = re.compile("^ ([0-9A-F]+)\.\.([0-9A-F]+) ; (\w+)")
	unicode.py:453: DeprecationWarning: invalid escape sequence \d
	  pattern = "for Version (\d+)\.(\d+)\.(\d+) of the Unicode"
	```
	The documentation states that
	> A backslash-character pair that is not a valid escape sequence now generates a DeprecationWarning. Although this will eventually become a SyntaxError, that will not be for several Python releases.
	## Testing
	To test my changes, I had to add support for choosing the Unicode version to use. The script will default to latest release (which is 12.0.0 at the moment, repo has 11.0.0 checked in).
	The script generates the exact same output for version 11.0.0 with Python 2.7 and 3.7 and no longer generates any deprecation warnings:
	```
	$ python3 -Wall unicode.py -v 11.0.0
	Using Unicode version: 11.0.0
	Regenerated tables.rs.
	$ git diff tables.rs
	$ python2 -Wall unicode.py -v 11.0.0
	Using Unicode version: 11.0.0
	Regenerated tables.rs.
	$ git diff tables.rs
	$ python2 --version
	Python 2.7.16
	$ python3 --version
	Python 3.7.3
	```
	## Extra functionality
	Furthermore, the script will check and download the latest Unicode version by default (without the `-v` argument).
	The `--help` is below:
	```
	$ ./unicode.py --help
	usage: unicode.py [-h] [-v VERSION]
	Regenerate Unicode tables (tables.rs).
	optional arguments:
	  -h, --help            show this help message and exit
	  -v VERSION, --version VERSION
	                        Unicode version to use (if not specified, defaults to
	                        latest available final release).
	```
	## Cleanups
	I have cleaned up the code quite a bit, with Python best practices and code style in mind. I'm happy to provide more details and rationale for all my changes if the reviewers so desire.
	One externally visible change is that the Unicode data will now be downloaded into `src/libcore/unicode/downloaded` directory suffixed by Unicode version:
	```
	$ pwd
	.../rust/src/libcore/unicode
	$ exa -T downloaded/
	downloaded
	├── 11.0.0
	│   ├── DerivedCoreProperties.txt
	│   ├── DerivedNormalizationProps.txt
	│   ├── PropList.txt
	│   ├── ReadMe.txt
	│   ├── Scripts.txt
	│   ├── SpecialCasing.txt
	│   └── UnicodeData.txt
	└── 12.0.0
	    ├── DerivedCoreProperties.txt
	    ├── DerivedNormalizationProps.txt
	    ├── PropList.txt
	    ├── ReadMe.txt
	    ├── Scripts.txt
	    ├── SpecialCasing.txt
	    └── UnicodeData.txt
	```
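The escape deprecation described in this pull request can be reproduced in isolation. The sketch below is illustrative only and is not code from `unicode.py`: the helper names `parse_version` and `escape_warnings` and the sample ReadMe line are assumptions, while the version pattern is the one quoted in the warning output. Newer Python releases may report a `SyntaxWarning` instead of a `DeprecationWarning`, so the helper returns whatever category was raised.

```python
import re
import warnings

# Raw string: the backslashes reach the regex engine untouched.
VERSION_RE = re.compile(r"for Version (\d+)\.(\d+)\.(\d+) of the Unicode")


def parse_version(readme_text):
    """Return (major, minor, micro) from a ReadMe-style line, or None."""
    match = VERSION_RE.search(readme_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def escape_warnings(source):
    """Compile a snippet and list the warning categories it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


# Hypothetical sample line, shaped like the text the pattern expects.
sample = "# Unicode Character Database for Version 12.1.0 of the Unicode Standard."
print(parse_version(sample))

# The first snippet contains "\d" in a normal string, the second in a raw string.
print(escape_warnings('pattern = "\\d"'))
print(escape_warnings('pattern = r"\\d"'))
```

Only the non-raw snippet should produce a warning, which is the behaviour `python3 -Wall` surfaced for the original script.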
2019-07-01	Address review remarks in unicode.py	Paweł Romanowski	-55/+61

2019-06-10	Apply suggestions from code review	Paweł Romanowski	-4/+5
	Co-Authored-By: varkor <github@varkor.com>
2019-04-19	Refactor and document unicode.py script	Paweł Romanowski	-302/+518

2019-04-18	Fix tidy errors	Paweł Romanowski	-2/+3

2019-04-18	More cleanups for unicode.py	Paweł Romanowski	-25/+23

2019-04-18	Clean up unicode.py script	Paweł Romanowski	-103/+269

2019-04-18	libcore => 2018	Taiki Endo	-5/+5

2018-12-25	Remove licenses	Mark Rousskov	-90/+1

2018-12-04	cleanup: remove static lifetimes from consts	ljedrz	-6/+6

2018-11-10	revert making internal APIs const fn.	Mazdak Farrokhzad	-1/+1

2018-11-10	constify parts of libcore.	Mazdak Farrokhzad	-2/+1

2018-08-01	Auto merge of #51609 - dscorbett:is_numeric, r=alexcrichton	bors	-30/+44
	Treat gc=No characters as numeric. [`char::is_numeric`](https://doc.rust-lang.org/std/primitive.char.html#method.is_numeric) and [`char::is_alphanumeric`](https://doc.rust-lang.org/std/primitive.char.html#method.is_alphanumeric) are documented to be defined “in terms of the Unicode General Categories 'Nd', 'Nl', 'No'”, but unicode.py does not group 'No' with the other 'N' categories. These functions therefore currently return `false` for characters like ⟨¾⟩ and ⟨①⟩.
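The grouping issue can be illustrated with Python's standard `unicodedata` module. This is a sketch of the category check only, not the table-generation code in `unicode.py`; the function and constant names are hypothetical, and results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# The three General Categories named in the char::is_numeric documentation.
NUMERIC_CATEGORIES = ("Nd", "Nl", "No")
# The grouping the commit message describes as the bug: 'No' left out.
WITHOUT_NO = ("Nd", "Nl")


def in_categories(ch, categories):
    """True if the character's General Category is one of `categories`."""
    return unicodedata.category(ch) in categories


for ch in ("7", "\u2167", "\u00be", "\u2460", "x"):
    print(
        repr(ch),
        unicodedata.category(ch),
        in_categories(ch, WITHOUT_NO),
        in_categories(ch, NUMERIC_CATEGORIES),
    )
```

With 'No' omitted, the vulgar fraction ¾ (U+00BE) and the circled digit ① (U+2460) fall outside the numeric set; including it brings them back in.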
2018-07-06	Handle array manually in string case conversion methods	Pazzaz	-0/+3

2018-06-17	Treat gc=No characters as numeric	David Corbett	-30/+44

2018-06-11	Regenerate character tables for Unicode 11	Josh Stone	-1120/+1214

2018-05-21	Fix tables.rs	varkor	-6/+45

2018-05-21	Avoid counting characters and add explanatory comment to test	varkor	-1/+1

2018-05-21	Use Grapheme_Extend instead of Mn	varkor	-166/+129

2018-05-21	Use the correct output directory for downloading Unicode files	varkor	-2/+1

2018-05-21	Escape combining characters in escape_debug	varkor	-1/+1

2018-05-21	Keep tables.rs copyright notice up to date	varkor	-5/+5

2018-05-21	Download unicode data files in directory of unicode.py	varkor	-7/+11

2018-05-21	Update unicode/tables.rs with Mn	varkor	-1/+121

2018-05-01	Fix a warning in libcore on 16bit targets.	Vadzim Dambrouski	-8/+8
	This code is assuming that usize >= 32 bits, but that is not the case on 16-bit targets. It is producing a warning that will fail the compilation on MSP430 if deny(warnings) is enabled. It is very unlikely that someone would actually use this code on a microcontroller, but since unicode was merged into libcore we have to compile it on 16-bit targets.
2018-04-12	Mark the rest of the `unicode` feature flag as perma-unstable.	Simon Sapin	-1/+1

2018-04-12	Dedicated tracking issue for UnicodeVersion and UNICODE_VERSION.	Simon Sapin	-0/+3

2018-04-12	Move core::char::printable to core::unicode::printable	Simon Sapin	-0/+786

2018-04-12	Merge unstable Utf16Encoder into EncodeUtf16	Simon Sapin	-58/+0

2018-04-12	Merge core::unicode::str into core::str	Simon Sapin	-188/+58
	And the UnicodeStr trait into StrExt
2018-04-12	Remove the CharExt trait, now that libcore has inherent methods for char	Simon Sapin	-6/+3

2018-04-12	Move the rest of core::unicode::char to core::unicode	Simon Sapin	-1438/+0

2018-04-12	Move char decoding iterators into a separate private module.	Simon Sapin	-129/+0

2018-04-12	Reexport from core::unicode::char in core::char rather than vice versa	Simon Sapin	-23/+4

2018-04-12	Move contents of libstd_unicode into libcore	Simon Sapin	-0/+4782