author: Alex Crichton <alex@alexcrichton.com> 2015-02-02 11:01:15 -0800
committer: Alex Crichton <alex@alexcrichton.com> 2015-02-02 11:01:15 -0800
commit: 747e6b53e4b0e71b18c9941409b52144c514ac4e
tree: d05345d7363b4655fbc834bfecfc6f395ef1efa3 /src
parent: 7335c7dd63cafe70ffca76677f9e33bc6eccefaa
parent: b796c1d6141b0677b2d2401cf65215b5438901ef
rollup merge of #21832: genbattle/doc-unicode-escapes
Unicode escapes were changed in [this RFC](https://github.com/rust-lang/rfcs/blob/28aeb3c391c9afd344f124d3a69bdc2a420638b2/text/0446-es6-unicode-escapes.md) to use the ES6 `\u{00FFFF}` syntax with a variable number of hex digits (1 to 6), eliminating the need for two separate syntaxes for Unicode literals.

I have updated The Reference and grammar.md to reflect these changes.
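
For illustration, the braced form described above might be used as in the following sketch. The function names `escape_samples` and `greeting` are hypothetical and are not part of this commit; the sample code points are chosen only to show different digit counts.

```rust
/// Pairs of (character written with a `\u{...}` escape, its code point).
/// Hypothetical helper, used only to illustrate the syntax.
pub fn escape_samples() -> Vec<(char, u32)> {
    vec![
        ('\u{41}', 0x41),         // two digits
        ('\u{0041}', 0x41),       // same code point, padded with leading zeros
        ('\u{2603}', 0x2603),     // four digits, previously written \u2603
        ('\u{1F980}', 0x1F980),   // five digits, previously needed the \U form
        ('\u{10FFFF}', 0x10FFFF), // six digits, the highest Unicode code point
    ]
}

/// The same escape works inside string literals.
pub fn greeting() -> &'static str {
    "\u{48}\u{69}"
}
```

Each character in `escape_samples()` should cast to the code point paired with it, and `greeting()` should compare equal to `"Hi"`.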
Diffstat (limited to 'src')
-rw-r--r-- src/doc/grammar.md | 3
-rw-r--r-- src/doc/reference.md | 16
2 files changed, 7 insertions, 12 deletions
diff --git a/src/doc/grammar.md b/src/doc/grammar.md
index c2cbb3ae3fb..59a1c8f828b 100644
--- a/src/doc/grammar.md
+++ b/src/doc/grammar.md
@@ -196,8 +196,7 @@ raw_string : '"' raw_string_body '"' | '#' raw_string '#' ;
 common_escape : '\x5c'
               | 'n' | 'r' | 't' | '0'
               | 'x' hex_digit 2
-unicode_escape : 'u' hex_digit 4
-               | 'U' hex_digit 8 ;
+unicode_escape : 'u' '{' hex_digit+ 6 '}';
 
 hex_digit : 'a' | 'b' | 'c' | 'd' | 'e' | 'f'
           | 'A' | 'B' | 'C' | 'D' | 'E' | 'F'
diff --git a/src/doc/reference.md b/src/doc/reference.md
index d7cc826d10b..64ddb3ffdd3 100644
--- a/src/doc/reference.md
+++ b/src/doc/reference.md
@@ -250,8 +250,7 @@ cases mentioned in [Number literals](#number-literals) below.
 ##### Unicode escapes
 |   | Name |
 |---|------|
-| `\u7FFF` | 16-bit character code (exactly 4 digits) |
-| `\U7EEEFFFF` | 32-bit character code (exactly 8 digits) |
+| `\u{7FFF}` | 24-bit Unicode character code (up to 6 digits) |
 
 ##### Numbers
 
@@ -286,8 +285,8 @@ raw_string : '"' raw_string_body '"' | '#' raw_string '#' ;
 common_escape : '\x5c'
               | 'n' | 'r' | 't' | '0'
               | 'x' hex_digit 2
-unicode_escape : 'u' hex_digit 4
-               | 'U' hex_digit 8 ;
+
+unicode_escape : 'u' '{' hex_digit+ 6 '}';
 
 hex_digit : 'a' | 'b' | 'c' | 'd' | 'e' | 'f'
           | 'A' | 'B' | 'C' | 'D' | 'E' | 'F'
@@ -320,12 +319,9 @@ following forms:
 * An _8-bit codepoint escape_ escape starts with `U+0078` (`x`) and is
   followed by exactly two _hex digits_. It denotes the Unicode codepoint
   equal to the provided hex value.
-* A _16-bit codepoint escape_ starts with `U+0075` (`u`) and is followed
-  by exactly four _hex digits_. It denotes the Unicode codepoint equal to
-  the provided hex value.
-* A _32-bit codepoint escape_ starts with `U+0055` (`U`) and is followed
-  by exactly eight _hex digits_. It denotes the Unicode codepoint equal to
-  the provided hex value.
+* A _24-bit codepoint escape_ starts with `U+0075` (`u`) and is followed
+  by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D`
+  (`}`). It denotes the Unicode codepoint equal to the provided hex value.
 * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
   (`r`), or `U+0074` (`t`), denoting the unicode values `U+000A` (LF),
   `U+000D` (CR) or `U+0009` (HT) respectively.
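
The new `unicode_escape` rule (`u`, an opening brace, one to six hex digits, a closing brace) can be sketched as a small checker. This is a minimal sketch and not the compiler's lexer: `parse_unicode_escape` is a hypothetical name, the input is assumed to be the text following the backslash, and rejecting surrogates and values above `U+10FFFF` is an assumption taken from the rules for Rust's `char` type rather than from this diff.

```rust
/// Parses the part of an escape that follows the backslash, e.g. "u{7FFF}".
/// Returns `None` if the text does not match `'u' '{' hex_digit+ '}'`
/// with at most six digits, or if the value is not a valid `char`.
pub fn parse_unicode_escape(s: &str) -> Option<char> {
    // Require the `u{` prefix and the `}` suffix.
    let digits = s.strip_prefix("u{")?.strip_suffix('}')?;

    // Between one and six digits are allowed.
    if digits.is_empty() || digits.len() > 6 {
        return None;
    }

    // Every character must be an ASCII hex digit (this also rules out signs).
    if !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }

    let value = u32::from_str_radix(digits, 16).ok()?;

    // Assumption: surrogates and values above U+10FFFF are rejected,
    // matching what `char::from_u32` accepts.
    char::from_u32(value)
}
```

Under these assumptions, `parse_unicode_escape("u{7FFF}")` yields `Some('\u{7FFF}')`, while `"u{}"`, `"u{1234567}"`, and the old unbraced form `"u7FFF"` all yield `None`.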