| author | Simon Sapin <simon.sapin@exyr.org> | 2014-03-18 12:19:18 +0800 |
|---|---|---|
| committer | Alex Crichton <alex@alexcrichton.com> | 2014-03-18 13:48:06 -0700 |
| commit | 4ab95bcf20e8ecee4bef874bcf552829cfc9567c (patch) | |
| tree | 8fde8b90d2aeebe215e46e5a35edb2099aa29644 /src | |
| parent | 87c7c03f4585a35f1bc6d3e607a08e3beea48041 (diff) | |
`char`: s/character/Unicode scalar value/
Tweak the definition of `char` to use the appropriate Unicode terminology.
Diffstat (limited to 'src')
| -rw-r--r-- | src/doc/rust.md | 8 |
1 files changed, 6 insertions, 2 deletions
diff --git a/src/doc/rust.md b/src/doc/rust.md
index 7233288a813..39b62615536 100644
--- a/src/doc/rust.md
+++ b/src/doc/rust.md
@@ -3136,8 +3136,12 @@ machine.
 
 The types `char` and `str` hold textual data.
 
-A value of type `char` is a Unicode character,
-represented as a 32-bit unsigned word holding a UCS-4 codepoint.
+A value of type `char` is a [Unicode scalar value](
+http://www.unicode.org/glossary/#unicode_scalar_value)
+(ie. a code point that is not a surrogate),
+represented as a 32-bit unsigned word in the 0x0000 to 0xD7FF
+or 0xE000 to 0x10FFFF range.
+A `[char]` vector is effectively an UCS-4 / UTF-32 string.
 
 A value of type `str` is a Unicode string,
 represented as a vector of 8-bit unsigned bytes holding a sequence of UTF-8 codepoints.
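
The new wording can be checked against the standard library. The following is a minimal sketch in present-day Rust syntax (not the 2014 syntax this commit was written against). The helper `is_scalar_value` is a hypothetical name introduced here for illustration; it encodes the two ranges quoted in the patch and is compared with `char::from_u32`, which returns `None` for anything that is not a valid `char`.

```rust
/// Hypothetical helper (not part of the commit): true if `cp` lies in one of
/// the two ranges the patched text gives for `char`.
fn is_scalar_value(cp: u32) -> bool {
    matches!(cp, 0x0000..=0xD7FF | 0xE000..=0x10FFFF)
}

fn main() {
    // Probe both edges of the surrogate gap and the upper limit.
    let probes: [u32; 7] = [0x0041, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF, 0x110000];
    for cp in probes {
        // `char::from_u32` rejects surrogates and out-of-range values.
        let accepted = char::from_u32(cp).is_some();
        assert_eq!(accepted, is_scalar_value(cp));
        println!("U+{:04X}: valid char = {}", cp, accepted);
    }

    // Each `char` occupies 32 bits, so a sequence of `char` holds one
    // element per scalar value, as in a UTF-32 string.
    let chars: Vec<char> = "h\u{e9}llo".chars().collect();
    assert_eq!(std::mem::size_of::<char>(), 4);
    assert_eq!(chars.len(), 5); // five scalar values
    assert_eq!("h\u{e9}llo".len(), 6); // six UTF-8 bytes in the `str`
}
```

The last three assertions illustrate the contrast the patched paragraph draws between `char` sequences and `str`: the same text is five 32-bit elements in the former and six 8-bit bytes in the latter.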
