| Age | Commit message (Collapse) | Author | Lines |
|
Add more aliases for Unicode confusable chars
Building upon #29837, this PR:
* added aliases for space characters,
* distinguished square brackets from parens, and
* added common CJK punctuation characters as aliases.
This will especially help CJK users who may have forgotten to switch off IME when coding.
|
|
Current code leads to messages like "... use a \xHH escape: \u{e4}"
which is confusing.
The printed span already points to the offending character, which
should be enough to identify the non-ASCII problem.
Fixes: #29088
|
|
|
|
syntax: Merge PathParsingMode::NoTypesAllowed and PathParsingMode::ImportPrefix
syntax: Rename PathParsingMode and its variants to better express their purpose
syntax: Remove obsolete error message about 'self lifetime
syntax: Remove ALLOW_MODULE_PATHS workaround
syntax/resolve: Adjust some error messages
resolve: Compare unhygienic (not renamed) names with keywords::Invalid, invalid identifiers may appear to be valid after renaming
|
|
|
|
|
|
|
|
|
|
|
|
libsyntax: be more accepting of whitespace in lexer
Fixes #29590.
Perhaps this may need more thorough testing?
r? @Aatch
|
|
This aligns with unicode recommendations and should be stable for all future
unicode releases. See http://unicode.org/reports/tr31/#R3.
This renames `libsyntax::lexer::is_whitespace` to `is_pattern_whitespace`
so potentially breaks users of libsyntax.
|
|
Fixes #29590.
|
|
Given this code:
fn main() {
let _ = 'abcd';
}
The compiler would give a message like:
error: character literal may only contain one codepoint: ';
let _ = 'abcd';
^~
With this change, the message now displays:
error: character literal may only contain one codepoint: 'abcd'
let _ = 'abcd'
^~~~~~
Fixes #30033
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Also split out emitters into their own module.
|
|
|
|
|
|
If you try to put something that's bigger than a char into a char
literal, you get an error:
fn main() {
let c = 'ஶ்ரீ';
}
error: unterminated character constant:
This is a very compiler-centric message. Yes, it's technically
'unterminated', but that's not what you, the user did wrong.
Instead, this commit changes it to
error: character literal may only contain one codepoint
As this actually tells you what went wrong.
Fixes #28851
|
|
|
|
Previously, `/**/` was incorrectly regarded as a doc comment because it
starts with `/**` and ends with `*/`. However, this caused an ICE
because some code assumed that the length of a doc comment is at least
5. This commit adds an additional check to `is_block_doc_comment` that
tests the length of the input.
Fixes #28844.
|
|
|
|
This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of
the `str::lines` and `BufRead::lines` iterators. Both iterators now account for
`\r\n` sequences in addition to `\n`, allowing for less surprising behavior
across platforms (especially in the `BufRead` case). Splitting *only* on the
`\n` character can still be achieved with `split('\n')` in both cases.
The `str::lines_any` function is also now deprecated as `str::lines` is a
drop-in replacement for it.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.md
Closes #28032
|
|
Avoid confusion with binary integer literals and binary operator expressions in libsyntax
|
|
|
|
This basically only affects modules which are empty (or only contain comments).
Closes #26755
|
|
Inspired by the now-mysteriously-closed https://github.com/rust-lang/rust/pull/26782.
This PR introduces better error messages when unicode escapes have invalid format (e.g. `\uFFFF`). It also makes rustc always tell the user that escape may not be used in byte-strings and bytes and fixes some spans to not include unecessary characters and include escape backslash in some others.
|
|
|
|
|
|
|
|
This improves diagnostic messages when \u escape is used incorrectly and { is
missing. Instead of saying “unknown character escape: u”, it will now report
that unicode escape sequence is incomplete and suggest what the correct syntax
is.
|
|
Fixes #26332
|
|
|
|
This allows compiling entire crates from memory or preprocessing source files before they are tokenized.
Minor API refactoring included, which is a [breaking-change] for libsyntax users:
* `ParseSess::{next_node_id, reserve_node_ids}` moved to rustc's `Session`
* `new_parse_sess` -> `ParseSess::new`
* `new_parse_sess_special_handler` -> `ParseSess::with_span_handler`
* `mk_span_handler` -> `SpanHandler::new`
* `default_handler` -> `Handler::new`
* `mk_handler` -> `Handler::with_emitter`
* `string_to_filemap(sess source, path)` -> `sess.codemap().new_filemap(path, source)`
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This changes the `ToTokens` implementations for expressions, statements,
etc. with almost-trivial ones that produce `Interpolated(*Nt(...))`
pseudo-tokens. In this way, quasiquote now works the same way as macros
do: already-parsed AST fragments are used as-is, not reparsed.
The `ToSource` trait is removed. Quasiquote no longer involves
pretty-printing at all, which removes the need for the
`encode_with_hygiene` hack. All associated machinery is removed.
A new `Nonterminal` is added, NtArm, which the parser now interpolates.
This is just for quasiquote, not macros (although it could be in the
future).
`ToTokens` is no longer implemented for `Arg` (although this could be
added again) and `Generics` (which I don't think makes sense).
This breaks any compiler extensions that relied on the ability of
`ToTokens` to turn AST fragments back into inspectable token trees. For
this reason, this closes #16987.
As such, this is a [breaking-change].
Fixes #16472.
Fixes #15962.
Fixes #17397.
Fixes #16617.
|
|
Changes the style guidelines regarding unit tests to recommend using a
sub-module named "tests" instead of "test" for unit tests as "test"
might clash with imports of libtest.
|
|
|
|
|
|
|
|
`s/([^\(\s]+\.)len\(\) [(?:!=)>] 0/!$1is_empty()/g`
|