about summary refs log tree commit diff
path: root/src/libsyntax/parse/lexer
AgeCommit message (Collapse)AuthorLines
2016-05-05Auto merge of #33128 - xen0n:more-confusing-unicode-chars, r=nagisabors-6/+53
Add more aliases for Unicode confusable chars Building upon #29837, this PR: * added aliases for space characters, * distinguished square brackets from parens, and * added common CJK punctuation characters as aliases. This will especially help CJK users who may have forgotten to switch off IME when coding.
2016-05-02lexer: do not display char confusingly in error messageGeorg Brandl-5/+4
Current code leads to messages like "... use a \xHH escape: \u{e4}" which is confusing. The printed span already points to the offending character, which should be enough to identify the non-ASCII problem. Fixes: #29088
2016-04-27Make some fatal lexer errors recoverablemitaa-66/+103
2016-04-24syntax: Check paths in visibilities for type parametersVadim Petrochenkov-10/+4
syntax: Merge PathParsingMode::NoTypesAllowed and PathParsingMode::ImportPrefix syntax: Rename PathParsingMode and its variants to better express their purpose syntax: Remove obsolete error message about 'self lifetime syntax: Remove ALLOW_MODULE_PATHS workaround syntax/resolve: Adjust some error messages resolve: Compare unhygienic (not renamed) names with keywords::Invalid, invalid identifiers may appear to be valid after renaming
2016-04-24syntax: Get rid of token::IdentStyleVadim Petrochenkov-25/+10
2016-04-21add more confusable CJK square bracket aliasesWang Xuerui-0/+12
2016-04-21correct aliases for square bracketsWang Xuerui-6/+8
2016-04-21add confusable space charactersWang Xuerui-0/+17
2016-04-21add more characters easily inputtable with CJK IMEsWang Xuerui-0/+16
2016-03-07Auto merge of #29734 - Ryman:whitespace_consistency, r=Aatchbors-12/+12
libsyntax: be more accepting of whitespace in lexer Fixes #29590. Perhaps this may need more thorough testing? r? @Aatch
2016-01-16libsyntax: accept only whitespace with the PATTERN_WHITE_SPACE propertyKevin Butler-9/+12
This aligns with unicode recommendations and should be stable for all future unicode releases. See http://unicode.org/reports/tr31/#R3. This renames `libsyntax::lexer::is_whitespace` to `is_pattern_whitespace` so potentially breaks users of libsyntax.
2016-01-14libsyntax: use char::is_whitespace instead of custom implementationsKevin Butler-4/+1
Fixes #29590.
2016-01-14Display better snippet for invalid char literalGreg Chapple-10/+15
Given this code: fn main() { let _ = 'abcd'; } The compiler would give a message like: error: character literal may only contain one codepoint: '; let _ = 'abcd'; ^~ With this change, the message now displays: error: character literal may only contain one codepoint: 'abcd' let _ = 'abcd' ^~~~~~ Fixes #30033
2016-01-12re-instate comment that was mysteriously disappearedTshepang Lekhonkhobe-0/+1
2016-01-04address review commentTshepang Lekhonkhobe-10/+2
2016-01-03fix "make tidy" failureTshepang Lekhonkhobe-1/+2
2016-01-03run rustfmt on syntax::parse::lexerTshepang Lekhonkhobe-513/+764
2015-12-30use structured errorsNick Cameron-23/+47
2015-12-17test errorsNick Cameron-27/+51
2015-12-17move error handling from libsyntax/diagnostics.rs to libsyntax/errors/*Nick Cameron-11/+10
Also split out emitters into their own module.
2015-11-18Add some unicode aliases for ".Huon Wilson-0/+17
2015-11-17Detect confusing unicode characters and show the alternativeRavi Shankar-1/+189
2015-11-05Improve error message for char literalsSteve Klabnik-5/+6
If you try to put something that's bigger than a char into a char literal, you get an error: fn main() { let c = 'ஶ்ரீ'; } error: unterminated character constant: This is a very compiler-centric message. Yes, it's technically 'unterminated', but that's not what you, the user did wrong. Instead, this commit changes it to error: character literal may only contain one codepoint As this actually tells you what went wrong. Fixes #28851
2015-10-27Start pushing panics outward in lexer.Eli Friedman-34/+38
2015-10-10Prevent `/**/` from being parsed as a doc commentBarosl Lee-2/+3
Previously, `/**/` was incorrectly regarded as a doc comment because it starts with `/**` and ends with `*/`. However, this caused an ICE because some code assumed that the length of a doc comment is at least 5. This commit adds an additional check to `is_block_doc_comment` that tests the length of the input. Fixes #28844.
2015-10-01Stop re-exporting AttrStyle's variants and rename them.Ms2ger-2/+2
2015-09-03std: Account for CRLF in {str, BufRead}::linesAlex Crichton-1/+1
This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of the `str::lines` and `BufRead::lines` iterators. Both iterators now account for `\r\n` sequences in addition to `\n`, allowing for less surprising behavior across platforms (especially in the `BufRead` case). Splitting *only* on the `\n` character can still be achieved with `split('\n')` in both cases. The `str::lines_any` function is also now deprecated as `str::lines` is a drop-in replacement for it. [rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.md Closes #28032
2015-09-03Use consistent terminology for byte string literalsVadim Petrochenkov-4/+4
Avoid confusion with binary integer literals and binary operator expressions in libsyntax
2015-07-29Replace illegal with invalid in most diagnosticsSimonas Kazlauskas-8/+8
2015-07-21Use a span from the correct file for the inner span of a moduleNick Cameron-0/+1
This basically only affects modules which are empty (or only contain comments). Closes #26755
2015-07-13Auto merge of #26947 - nagisa:unicode-escape-error, r=nrcbors-17/+27
Inspired by the now-mysteriously-closed https://github.com/rust-lang/rust/pull/26782. This PR introduces better error messages when unicode escapes have invalid format (e.g. `\uFFFF`). It also makes rustc always tell the user that escape may not be used in byte-strings and bytes and fixes some spans to not include unecessary characters and include escape backslash in some others.
2015-07-13Tell unicode escapes can’t be used as bytes earlier/moreSimonas Kazlauskas-16/+14
2015-07-10Change some instances of .connect() to .join()Wesley Wiser-1/+1
2015-07-10Improve some of the string escape diagnostic spansSimonas Kazlauskas-6/+4
2015-07-10Improve incomplete unicode escape reportingSimonas Kazlauskas-5/+19
This improves diagnostic messages when \u escape is used incorrectly and { is missing. Instead of saying “unknown character escape: u”, it will now report that unicode escape sequence is incomplete and suggest what the correct syntax is.
2015-07-01Make the unused_mut lint smarter with respect to locals.Ariel Ben-Yehuda-1/+1
Fixes #26332
2015-06-22fix minor indentation issuesYongqian Li-13/+13
2015-05-17Auto merge of #25387 - eddyb:syn-file-loader, r=nikomatsakisbors-2/+2
This allows compiling entire crates from memory or preprocessing source files before they are tokenized. Minor API refactoring included, which is a [breaking-change] for libsyntax users: * `ParseSess::{next_node_id, reserve_node_ids}` moved to rustc's `Session` * `new_parse_sess` -> `ParseSess::new` * `new_parse_sess_special_handler` -> `ParseSess::with_span_handler` * `mk_span_handler` -> `SpanHandler::new` * `default_handler` -> `Handler::new` * `mk_handler` -> `Handler::with_emitter` * `string_to_filemap(sess source, path)` -> `sess.codemap().new_filemap(path, source)`
2015-05-14Fix stupid mistake from previous commitLee Jeffery-2/+5
2015-05-14Make BytePos calculation same as originalLee Jeffery-1/+2
2015-05-14syntax: refactor (Span)Handler and ParseSess constructors to be methods.Eduard Burtescu-2/+2
2015-05-13Added test to check that newlines are stripped from commentsLee Jeffery-0/+9
2015-05-13Fix byte offset and error message inconsistenciesLee Jeffery-3/+3
2015-05-08Fix CRLF line-ending parsing for comments.Lee Jeffery-24/+26
2015-04-25Interpolate AST nodes in quasiquote.Geoffry Song-107/+0
This changes the `ToTokens` implementations for expressions, statements, etc. with almost-trivial ones that produce `Interpolated(*Nt(...))` pseudo-tokens. In this way, quasiquote now works the same way as macros do: already-parsed AST fragments are used as-is, not reparsed. The `ToSource` trait is removed. Quasiquote no longer involves pretty-printing at all, which removes the need for the `encode_with_hygiene` hack. All associated machinery is removed. A new `Nonterminal` is added, NtArm, which the parser now interpolates. This is just for quasiquote, not macros (although it could be in the future). `ToTokens` is no longer implemented for `Arg` (although this could be added again) and `Generics` (which I don't think makes sense). This breaks any compiler extensions that relied on the ability of `ToTokens` to turn AST fragments back into inspectable token trees. For this reason, this closes #16987. As such, this is a [breaking-change]. Fixes #16472. Fixes #15962. Fixes #17397. Fixes #16617.
2015-04-24Change name of unit test sub-module to "tests".Johannes Oertel-2/+2
Changes the style guidelines regarding unit tests to recommend using a sub-module named "tests" instead of "test" for unit tests as "test" might clash with imports of libtest.
2015-04-21syntax: Copy unstable str::char_at into libsyntaxErick Tryzelaar-12/+14
2015-04-21syntax: Replace String::from_str with the stable String::fromErick Tryzelaar-1/+1
2015-04-21syntax: remove uses of `.into_cow()`Erick Tryzelaar-4/+4
2015-04-14Negative case of `len()` -> `is_empty()`Tamir Duberstein-2/+2
`s/([^\(\s]+\.)len\(\) [(?:!=)>] 0/!$1is_empty()/g`