<feed xmlns='http://www.w3.org/2005/Atom'>
<title>rust/src/etc/unicode.py, branch 1.3.0</title>
<subtitle>https://github.com/rust-lang/rust</subtitle>
<id>http://git.dreamy.place/mirrors/rust/atom?h=1.3.0</id>
<link rel='self' href='http://git.dreamy.place/mirrors/rust/atom?h=1.3.0'/>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/'/>
<updated>2015-06-25T05:16:25+00:00</updated>
<entry>
<title>Remove char::to_titlecase. Fix #26555</title>
<updated>2015-06-25T05:16:25+00:00</updated>
<author>
<name>Simon Sapin</name>
<email>simon.sapin@exyr.org</email>
</author>
<published>2015-06-25T05:14:27+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=32b7b50bafd0d40fda1caa93cf7f7068cc5052e3'/>
<id>urn:sha1:32b7b50bafd0d40fda1caa93cf7f7068cc5052e3</id>
<content type='text'>
I added it because it was easy (same as `char::to_lowercase`,
just a different table), but it doesn’t make sense to have this
in std but not str::to_titlecase, which would require
https://github.com/unicode-rs/unicode-segmentation

At some point in the future this feature will be available
(both on char and str) in a crates.io crate.
</content>
</entry>
<entry>
<title>Correctly map upper-case Sigma to lower-case in word-final position. Fix #26035.</title>
<updated>2015-06-06T10:37:11+00:00</updated>
<author>
<name>Simon Sapin</name>
<email>simon.sapin@exyr.org</email>
</author>
<published>2015-06-06T10:34:24+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=f901086b0db092f331b9555199298c58d685f668'/>
<id>urn:sha1:f901086b0db092f331b9555199298c58d685f668</id>
<content type='text'>
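A minimal sketch of the behaviour named in the title, assuming a recent
stable Rust where `str::to_lowercase` applies the word-final Sigma rule.
The helper name `lowercase_word` is hypothetical and exists only for
illustration:

```rust
/// Lowercase a whole string with the standard library.
/// Takes an owned String purely to keep this sketch simple.
fn lowercase_word(word: String) -> String {
    // A capital Sigma that follows a cased letter and ends the word
    // is expected to become the final form (U+03C2); elsewhere it
    // becomes the regular small sigma (U+03C3).
    word.to_lowercase()
}
```

With this helper, `lowercase_word(String::from("ΣΑΣ"))` should give "σας",
while a lone "Σ" should still give "σ".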
</content>
</entry>
<entry>
<title>Add char::to_titlecase</title>
<updated>2015-06-06T10:37:11+00:00</updated>
<author>
<name>Simon Sapin</name>
<email>simon.sapin@exyr.org</email>
</author>
<published>2015-06-05T17:20:09+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=d316487ec1870956b5ba13468c39b61577e6858f'/>
<id>urn:sha1:d316487ec1870956b5ba13468c39b61577e6858f</id>
<content type='text'>
But not str::to_titlecase, which would require UAX #29 Unicode Text Segmentation,
which we decided not to include in `std`:
https://github.com/rust-lang/rfcs/pull/1054
</content>
</entry>
<entry>
<title>Add complex (but unconditional) Unicode case mapping. Fix #25800</title>
<updated>2015-06-06T10:37:10+00:00</updated>
<author>
<name>Simon Sapin</name>
<email>simon.sapin@exyr.org</email>
</author>
<published>2015-06-05T15:40:09+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=addaa5b1ff0d611b6568ce5fb0c6469a8e1a6ee4'/>
<id>urn:sha1:addaa5b1ff0d611b6568ce5fb0c6469a8e1a6ee4</id>
<content type='text'>
As a result, the iterator returned by `char::to_uppercase` sometimes
yields two or three `char`s instead of just one.
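A minimal sketch of what this looks like from calling code, assuming a
recent stable Rust. The helper name `uppercase_of` is hypothetical; only
`char::to_uppercase` comes from the standard library:

```rust
/// Collect the iterator returned by `char::to_uppercase` into a String,
/// so that one-to-many mappings are not truncated to a single char.
fn uppercase_of(c: char) -> String {
    c.to_uppercase().collect()
}
```

For example, `uppercase_of('ß')` should give "SS" (two chars) and
`uppercase_of('\u{FB03}')` (the ffi ligature) should give "FFI" (three
chars), while `uppercase_of('a')` still gives the single char "A".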
</content>
</entry>
<entry>
<title>to_lowercase/to_uppercase: also map chars not in Lu/Ll categories.</title>
<updated>2015-06-06T10:37:10+00:00</updated>
<author>
<name>Simon Sapin</name>
<email>simon.sapin@exyr.org</email>
</author>
<published>2015-06-05T14:23:51+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=66af12721a3200f872adf38e0015e22db88cd86e'/>
<id>urn:sha1:66af12721a3200f872adf38e0015e22db88cd86e</id>
<content type='text'>
This adds 120 mappings:

ǅ ǆ
ǅ Ǆ
ǈ ǉ
ǈ Ǉ
ǋ ǌ
ǋ Ǌ
ǲ ǳ
ǲ Ǳ
 Ι
ᾈ ᾀ
ᾉ ᾁ
ᾊ ᾂ
ᾋ ᾃ
ᾌ ᾄ
ᾍ ᾅ
ᾎ ᾆ
ᾏ ᾇ
ᾘ ᾐ
ᾙ ᾑ
ᾚ ᾒ
ᾛ ᾓ
ᾜ ᾔ
ᾝ ᾕ
ᾞ ᾖ
ᾟ ᾗ
ᾨ ᾠ
ᾩ ᾡ
ᾪ ᾢ
ᾫ ᾣ
ᾬ ᾤ
ᾭ ᾥ
ᾮ ᾦ
ᾯ ᾧ
ᾼ ᾳ
ῌ ῃ
ῼ ῳ
Ⅰ ⅰ
Ⅱ ⅱ
Ⅲ ⅲ
Ⅳ ⅳ
Ⅴ ⅴ
Ⅵ ⅵ
Ⅶ ⅶ
Ⅷ ⅷ
Ⅸ ⅸ
Ⅹ ⅹ
Ⅺ ⅺ
Ⅻ ⅻ
Ⅼ ⅼ
Ⅽ ⅽ
Ⅾ ⅾ
Ⅿ ⅿ
ⅰ Ⅰ
ⅱ Ⅱ
ⅲ Ⅲ
ⅳ Ⅳ
ⅴ Ⅴ
ⅵ Ⅵ
ⅶ Ⅶ
ⅷ Ⅷ
ⅸ Ⅸ
ⅹ Ⅹ
ⅺ Ⅺ
ⅻ Ⅻ
ⅼ Ⅼ
ⅽ Ⅽ
ⅾ Ⅾ
ⅿ Ⅿ
Ⓐ ⓐ
Ⓑ ⓑ
Ⓒ ⓒ
Ⓓ ⓓ
Ⓔ ⓔ
Ⓕ ⓕ
Ⓖ ⓖ
Ⓗ ⓗ
Ⓘ ⓘ
Ⓙ ⓙ
Ⓚ ⓚ
Ⓛ ⓛ
Ⓜ ⓜ
Ⓝ ⓝ
Ⓞ ⓞ
Ⓟ ⓟ
Ⓠ ⓠ
Ⓡ ⓡ
Ⓢ ⓢ
Ⓣ ⓣ
Ⓤ ⓤ
Ⓥ ⓥ
Ⓦ ⓦ
Ⓧ ⓧ
Ⓨ ⓨ
Ⓩ ⓩ
ⓐ Ⓐ
ⓑ Ⓑ
ⓒ Ⓒ
ⓓ Ⓓ
ⓔ Ⓔ
ⓕ Ⓕ
ⓖ Ⓖ
ⓗ Ⓗ
ⓘ Ⓘ
ⓙ Ⓙ
ⓚ Ⓚ
ⓛ Ⓛ
ⓜ Ⓜ
ⓝ Ⓝ
ⓞ Ⓞ
ⓟ Ⓟ
ⓠ Ⓠ
ⓡ Ⓡ
ⓢ Ⓢ
ⓣ Ⓣ
ⓤ Ⓤ
ⓥ Ⓥ
ⓦ Ⓦ
ⓧ Ⓧ
ⓨ Ⓨ
ⓩ Ⓩ
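
A few of the mappings listed above can be spot-checked with a sketch like
the following, assuming a recent stable Rust. The helper names
`lowercase_of` and `uppercase_of` are hypothetical:

```rust
/// Lowercase a single char and collect the result.
fn lowercase_of(c: char) -> String {
    c.to_lowercase().collect()
}

/// Uppercase a single char and collect the result.
fn uppercase_of(c: char) -> String {
    c.to_uppercase().collect()
}
```

With these helpers, the Roman numeral U+2167 should lowercase to U+2177,
the circled letter U+24D0 should uppercase to U+24B6, and the titlecase
digraph U+01C5 should map to U+01C6 (lower) and U+01C4 (upper), even
though none of these characters are in the Lu or Ll categories.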
</content>
</entry>
<entry>
<title>optimize Unicode tables</title>
<updated>2015-04-18T17:20:57+00:00</updated>
<author>
<name>kwantam</name>
<email>kwantam@gmail.com</email>
</author>
<published>2015-04-16T19:38:35+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=f14d289d71fd8e4956e7214bda3af15cd50898fe'/>
<id>urn:sha1:f14d289d71fd8e4956e7214bda3af15cd50898fe</id>
<content type='text'>
Apply optimization described in
https://github.com/rust-lang/regex/pull/73#issuecomment-93777126
to Rust's copy of `unicode.py`.

This shrinks librustc_unicode's tables.rs from 479kB to 456kB,
and should improve performance slightly for related operations
(e.g., is_alphabetic(), is_xid_start(), etc).

In addition, pull in fix from @dscorbett's commit
d25c39f86568a147f9b7080c25711fb1f98f056a in regex, which
makes `load_properties()` more tolerant of whitespace
in the Unicode tables. (This fix does not result in any
changes to tables.rs, but could if the Unicode tables
change in the future.)
</content>
</entry>
<entry>
<title>deprecate Unicode functions that will be moved to crates.io</title>
<updated>2015-04-16T21:03:05+00:00</updated>
<author>
<name>kwantam</name>
<email>kwantam@gmail.com</email>
</author>
<published>2015-04-14T19:52:37+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=29d1252e4d2126318d7f622505ed76dd1e8e4edc'/>
<id>urn:sha1:29d1252e4d2126318d7f622505ed76dd1e8e4edc</id>
<content type='text'>
This patch
1. renames libunicode to librustc_unicode,
2. deprecates several pieces of libunicode (see below), and
3. removes references to deprecated functions from
   librustc_driver and libsyntax. This may change pretty-printed
   output from these modules in cases involving wide or combining
   characters used in filenames, identifiers, etc.

The following functions are marked deprecated:

1. char.width() and str.width():
   --&gt; use unicode-width crate

2. str.graphemes() and str.grapheme_indices():
   --&gt; use unicode-segmentation crate

3. str.nfd_chars(), str.nfkd_chars(), str.nfc_chars(), str.nfkc_chars(),
   char.compose(), char.decompose_canonical(), char.decompose_compatible(),
   char.canonical_combining_class():
   --&gt; use unicode-normalization crate
</content>
</entry>
<entry>
<title>Remove regex module from libunicode</title>
<updated>2015-04-12T22:30:10+00:00</updated>
<author>
<name>Chris Wong</name>
<email>lambda.fairy@gmail.com</email>
</author>
<published>2015-04-12T01:24:19+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=5308ac939a330b74540bea5920b0086a2d954648'/>
<id>urn:sha1:5308ac939a330b74540bea5920b0086a2d954648</id>
<content type='text'>
The regex crate keeps its own tables now (rust-lang/regex#41) so we
don't need them here.

[breaking-change]
</content>
</entry>
<entry>
<title>use normative source for Grapheme class data</title>
<updated>2015-04-06T23:46:48+00:00</updated>
<author>
<name>kwantam</name>
<email>kwantam@gmail.com</email>
</author>
<published>2015-04-06T23:42:18+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=bef00ab2b82f75e267a3bf19e511f21e41e41b9a'/>
<id>urn:sha1:bef00ab2b82f75e267a3bf19e511f21e41e41b9a</id>
<content type='text'>
@mahkoh points out in #15628 that unicode.py does not use
normative data for Grapheme classes. This PR fixes that issue.

In addition, GC_RegionalIndicator is renamed GC_Regional_Indicator
in order to stay in line with the Unicode class name definitions.
I have updated refs in u_str.rs, and verified that there are no
refs elsewhere in the codebase. However, in principle someone
using the unicode tables for their own purposes might see breakage
from this.
</content>
</entry>
<entry>
<title>unicode: Properly parse ranges in UnicodeData.txt</title>
<updated>2015-03-03T19:04:55+00:00</updated>
<author>
<name>Florian Zeitz</name>
<email>florob@babelmonkeys.de</email>
</author>
<published>2015-03-03T17:35:41+00:00</published>
<link rel='alternate' type='text/html' href='http://git.dreamy.place/mirrors/rust/commit/?id=c9e2de42b590c6d294afd1db44334c5168a694bb'/>
<id>urn:sha1:c9e2de42b590c6d294afd1db44334c5168a694bb</id>
<content type='text'>
This handles the ranges contained in UnicodeData.txt.
Counterintuitively, this actually makes the tables shorter.
</content>
</entry>
</feed>
