| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
In writing up the reference for frontmatter, I realized that we probably
shouldn't be accepting Unicode Line Ending characters between the code
fence and infostring or trailing after the infostring or a code fence.
In digging into the unicode specification we use for Whitespace, it
divides it up into categories, so I'm deferring to what it says for
horizontal whitespace for what should be used within a line.
Note, I am leaving out support for Unicde Default Ignorable characters.
I figure that can be discussed outside of this change within the
reference and tracking issue.
|
|
directly
|
|
|
|
1. Rename `make_unclosed_delims_error` and return `Vec<Diag>`
2. change magic number `unclosed_delimiter_show_limit` to const
3. move `eof_err` below parsing logic
4. Add `calculate_spacing` for `bump` and `bump_minimal`
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
|
|
faster string parsing
|
|
- The lint is now reported in code that gets removed/modified/duplicated
by macro expansion.
- Spans are more accurate
- Fixes #140281
|
|
Supercedes #137193
|
|
`rc""` more clear error message
here is small fix that provides better error message when user is trying to use `rc""` the same way it was made for `rb""`
example of it's work
```rust
|
2 | rc"\n";
| ^^ unknown prefix
|
= note: prefixed identifiers and literals are reserved since Rust 2021
help: use `cr` for a raw C-string
|
2 - rc"\n";
2 + cr"\n";
|
```
**related issue**
fixes #140170
cc `@cyrgani` (issue author)
|
|
|
|
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
|
|
By replacing them with `{Open,Close}{Param,Brace,Bracket,Invisible}`.
PR #137902 made `ast::TokenKind` more like `lexer::TokenKind` by
replacing the compound `BinOp{,Eq}(BinOpToken)` variants with fieldless
variants `Plus`, `Minus`, `Star`, etc. This commit does a similar thing
with delimiters. It also makes `ast::TokenKind` more similar to
`parser::TokenType`.
This requires a few new methods:
- `TokenKind::is_{,open_,close_}delim()` replace various kinds of
pattern matches.
- `Delimiter::as_{open,close}_token_kind` are used to convert
`Delimiter` values to `TokenKind`.
Despite these additions, it's a net reduction in lines of code. This is
because e.g. `token::OpenParen` is so much shorter than
`token::OpenDelim(Delimiter::Parenthesis)` that many multi-line forms
reduce to single line forms. And many places where the number of lines
doesn't change are still easier to read, just because the names are
shorter, e.g.:
```
- } else if self.token != token::CloseDelim(Delimiter::Brace) {
+ } else if self.token != token::CloseBrace {
```
|
|
|
|
GuillaumeGomez:proc-macro_add_value_retrieval_methods, r=Amanieu"
This reverts commit 08dfbf49e30d917c89e49eb14cb3f1e8b8a1c9ef, reversing
changes made to 10bcdad7df0de3cfb95c7bdb7b16908e73cafc09.
|
|
GuillaumeGomez:proc-macro_add_value_retrieval_methods, r=Amanieu
Add `*_value` methods to proc_macro lib
This is the implementation of https://github.com/rust-lang/libs-team/issues/459.
It allows to get the actual value (unescaped) of the different string literals.
Part of https://github.com/rust-lang/rust/issues/136652.
r? libs-api
|
|
For consistency with `rustc_lexer::TokenKind::Bang`, and because other
`ast::TokenKind` variants generally have syntactic names instead of
semantic names (e.g. `Star` and `DotDot` instead of `Mul` and `Range`).
|
|
`BinOpToken` is badly named, because it only covers the assignable
binary ops and excludes comparisons and `&&`/`||`. Its use in
`ast::TokenKind` does allow a small amount of code sharing, but it's a
clumsy factoring.
This commit removes `ast::TokenKind::BinOp{,Eq}`, replacing each one
with 10 individual variants. This makes `ast::TokenKind` more similar to
`rustc_lexer::TokenKind`, which has individual variants for all
operators.
Although the number of lines of code increases, the number of chars
decreases due to the frequent use of shorter names like `token::Plus`
instead of `token::BinOp(BinOpToken::Plus)`.
|
|
|
|
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
|
|
Lexing precedes parsing, as you'd expect: `Lexer` creates a
`TokenStream` and `Parser` then parses that `TokenStream`.
But, in a horrendous violation of layering abstractions and common
sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err`
method does some error recovery that relies on creating a `Parser` to do
some post-processing of the `TokenStream` that the `Lexer` just created.
This commit just removes `unclosed_delim_err`. This change removes
`Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s
return value can have a more typical form.
The cost is slightly worse error messages in two obscure cases, as shown
in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less
explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff
marker detection is no longer supported (because that detection is
implemented in the parser).
In my opinion this cost is outweighed by the magnitude of the code
cleanup.
|
|
https://github.com/rust-lang/rfcs/blob/master/text/3593-unprefixed-guarded-strings.md
|
|
|
|
fix confusing diagnostic for reserved `##`
Closes #131615
|
|
|
|
- Use iterators instead of `for` loops.
- Use `if`/`else` instead of `match`.
|
|
Must be one of those cases where the function is too long and rustfmt
bails out.
|
|
There is a not-very-useful layering in the lexer, where
`TokenTreesReader` contains a `StringReader`. This commit combines them
and names the result `Lexer`, which is a more obvious name for it.
The methods of `Lexer` are now split across `mod.rs` and `tokentrees.rs`
which isn't ideal, but it doesn't seem worth moving a bunch of code to
avoid it.
|
|
It was added in #123752 to handle some cases involving emoji, but it
isn't necessary because it's always treated the same as
`TokenKind::InvalidIdent`. This commit removes it, which makes things a
little simpler.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
|
|
This paves the way for tracking more state (e.g. error tainting) in the diagnostic context handle
|
|
|
|
Lexing converts source text into a token stream. Parsing converts a
token stream into AST fragments. This commit renames several lexing
operations that have "parse" in the name. I think these names have been
subtly confusing me for years.
This is just a `s/parse/lex/` on function names, with one exception:
`parse_stream_from_source_str` becomes `source_str_to_stream`, to make
it consistent with the existing `source_file_to_stream`. The commit also
moves that function's location in the file to be just above
`source_file_to_stream`.
The commit also cleans up a few comments along the way.
|
|
|
|
|
|
Translation of the lint message happens when the actual diagnostic is
created, not when the lint is buffered. Generating the message from
BuiltinLintDiag ensures that all required data to construct the message
is preserved in the LintBuffer, eventually allowing the messages to be
moved to fluent.
Remove the `msg` field from BufferedEarlyLint, it is either generated
from the data in the BuiltinLintDiag or stored inside
BuiltinLintDiag::Normal.
|
|
|
|
Properly handle emojis as literal prefix in macros
Do not accept the following
```rust
macro_rules! lexes {($($_:tt)*) => {}}
lexes!(🐛"foo");
```
Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro pre-expansion of literal prefixes.
Fix #123696.
|
|
We want to run them on all 64-bit platforms.
|
|
Fix invalid silencing of parsing error
Given
```rust
macro_rules! a {
( ) => {
impl<'b> c for d {
e::<f'g>
}
};
}
```
ensure an error is emitted.
Fix #123079.
|
|
Do not accept the following
```rust
macro_rules! lexes {($($_:tt)*) => {}}
lexes!(🐛"foo");
```
Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro expansion of literal prefixes.
Fix #123696.
|
|
`lexer::UnmatchedDelim` struct in `rustc_parse` is unnecessary public
outside of the crate. This commit reduces the visibility to
`pub(crate)`.
Beside, this removes unnecessary field `expected_delim` that causes
warnings after changing the visibility.
|