about summary refs log tree commit diff
path: root/compiler/rustc_parse/src
AgeCommit message (Collapse)AuthorLines
2025-01-30Rollup merge of #135882 - hkBst:master, r=estebankMatthias Krüger-7/+4
simplify `similar_tokens` from `Option<Vec<_>>` to `&[_]` All uses immediately invoke contains, so maybe a further simplification is possible.
2025-01-28Rollup merge of #133151 - tyrone-wu:trim-fn-ptr-whitespace, r=compiler-errorsMatthias Krüger-6/+50
Trim extra whitespace in fn ptr suggestion span Trim extra whitespace when suggesting removal of invalid qualifiers when parsing function pointer type. Fixes: #133083 --- I made a comment about the format of the diagnostic error message in https://github.com/rust-lang/rust/issues/133083#issuecomment-2480047875. I think the `.label` may be a little redundant if the diagnostic only highlights the bad qualifier instead of the entire `TyKind::BareFn` span. If it makes sense, I can include it in this PR.
2025-01-27Trim extra whitespace in fn ptr suggestion spanTyrone Wu-6/+50
Trim extra whitespace when suggesting removal of invalid qualifiers when parsing function pointer type. Fixes: #133083 Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
2025-01-27Use identifiers in diagnostics more oftenMichael Goulet-2/+2
2025-01-24use `fmt::from_fn` in more places, instead of using structs that impl ↵Yotam Ofek-35/+25
formatting traits
2025-01-24Rollup merge of #135855 - cuviper:parser-size, r=wesleywiserMatthias Krüger-2/+3
Only assert the `Parser` size on specific arches The size of this struct depends on the alignment of `u128`, for example powerpc64le and s390x have align-8 and end up with only 280 bytes. Our 64-bit tier-1 arches are the same though, so let's just assert on those. r? nnethercote
2025-01-23simplify similar_tokens from Vec<_> to &[_]Marijn Schouten-2/+2
2025-01-23simplify similar_tokens from Option<Vec<_>> to Vec<_>Marijn Schouten-6/+3
2025-01-22Rollup merge of #135557 - estebank:wtf8, r=fee1-deadMatthias Krüger-4/+63
Point at invalid utf-8 span on user's source code ``` error: couldn't read `$DIR/not-utf8-bin-file.rs`: stream did not contain valid UTF-8 --> $DIR/not-utf8-2.rs:6:5 | LL | include!("not-utf8-bin-file.rs"); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | note: byte `193` is not valid utf-8 --> $DIR/not-utf8-bin-file.rs:2:14 | LL | let _ = "�|�␂!5�cc␕␂��"; | ^ = note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info) ``` When we attempt to load a Rust source code file, if there is a OS file failure we try reading the file as bytes. If that succeeds we try to turn it into UTF-8. If *that* fails, we provide additional context about *where* the file has the first invalid UTF-8 character. Fix #76869.
2025-01-22Auto merge of #134478 - compiler-errors:attr-span, r=oli-obkbors-6/+2
Properly record metavar spans for other expansions other than TT This properly records metavar spans for nonterminals other than tokentree. This means that we operations like `span.to(other_span)` work correctly for macros. As you can see, other diagnostics involving metavars have improved as a result. Fixes #132908 Alternative to #133270 cc `@ehuss` cc `@petrochenkov`
2025-01-21Only assert the `Parser` size on specific archesJosh Stone-2/+3
The size of this struct depends on the alignment of `u128`, for example powerpc64le and s390x have align-8 and end up with only 280 bytes. Our 64-bit tier-1 arches are the same though, so let's just assert on those.
2025-01-22Point at invalid utf-8 span on user's source codeEsteban Küber-4/+63
``` error: couldn't read `$DIR/not-utf8-bin-file.rs`: stream did not contain valid UTF-8 --> $DIR/not-utf8-2.rs:6:5 | LL | include!("not-utf8-bin-file.rs"); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | note: `[193]` is not valid utf-8 --> $DIR/not-utf8-bin-file.rs:2:14 | LL | let _ = "�|�␂!5�cc␕␂��"; | ^ = note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info) ``` When we attempt to load a Rust source code file, if there is a OS file failure we try reading the file as bytes. If that succeeds we try to turn it into UTF-8. If *that* fails, we provide additional context about *where* the file has the first invalid UTF-8 character. Fix #76869.
2025-01-19Run `clippy --fix` for `unnecessary_map_or` lintYotam Ofek-1/+1
2025-01-11Remove allocations from case-insensitive comparison to keywordsMark Rousskov-2/+3
2025-01-09Auto merge of #135268 - pietroalbini:pa-bump-stage0, r=Mark-Simulacrumbors-2/+2
Master bootstrap update Part of the release process. r? `@Mark-Simulacrum`
2025-01-09Rollup merge of #135269 - estebank:unneeded-into, r=compiler-errorsMatthias Krüger-4/+2
Remove some unnecessary `.into()` calls
2025-01-08Remove some unnecessary `.into()` callsEsteban Küber-4/+2
2025-01-08update cfg(bootstrap)Pietro Albini-2/+2
2025-01-08Rename PatKind::Lit to ExprOli Scherer-6/+6
2024-12-29Rollup merge of #134884 - calciumbe:patch1, r=jieyouxuMatthias Krüger-1/+1
Fix typos Hello, I fix some typos in docs and comments. Thank you very much.
2024-12-29fix: typoscalciumbe-1/+1
Signed-off-by: calciumbe <192480234+calciumbe@users.noreply.github.com>
2024-12-27Skip parenthesis around tuple struct field callsDavid Tolnay-1/+1
2024-12-21Properly record metavar spans for other expansions other than TTMichael Goulet-6/+2
2024-12-21Rollup merge of #134600 - dtolnay:chainedcomparison, r=oli-obkJacob Pratt-10/+2
Fix parenthesization of chained comparisons by pretty-printer Example: ```rust macro_rules! repro { () => { 1 < 2 }; } fn main() { let _ = repro!() == false; } ``` Previously `-Zunpretty=expanded` would pretty-print this syntactically invalid output: `fn main() { let _ = 1 < 2 == false; }` ```console error: comparison operators cannot be chained --> <anon>:8:23 | 8 | fn main() { let _ = 1 < 2 == false; } | ^ ^^ | help: parenthesize the comparison | 8 | fn main() { let _ = (1 < 2) == false; } | + + ``` With the fix, it will print `fn main() { let _ = (1 < 2) == false; }`. Making `-Zunpretty=expanded` consistently produce syntactically valid Rust output is important because that is what makes it possible for `cargo expand` to format and perform filtering on the expanded code. ## Review notes According to `rg '\.fixity\(\)' compiler/` the `fixity` function is called only 3 places: - https://github.com/rust-lang/rust/blob/13170cd787cb733ed24842ee825bcbd98dc01476/compiler/rustc_ast_pretty/src/pprust/state/expr.rs#L283-L287 - https://github.com/rust-lang/rust/blob/13170cd787cb733ed24842ee825bcbd98dc01476/compiler/rustc_hir_pretty/src/lib.rs#L1295-L1299 - https://github.com/rust-lang/rust/blob/13170cd787cb733ed24842ee825bcbd98dc01476/compiler/rustc_parse/src/parser/expr.rs#L282-L289 The 2 pretty printers definitely want to treat comparisons using `Fixity::None`. That's the whole bug being fixed. Meanwhile, the parser's `Fixity::None` codepath is previously unreachable as indicated by the comment, so as long as `Fixity::None` here behaves exactly the way that `Fixity::Left` used to behave, you can tell that this PR definitely does not constitute any behavior change for the parser. My guess for why comparison operators were set to `Fixity::Left` instead of `Fixity::None` is that it's a very old workaround for giving a good chained comparisons diagnostic (like what I pasted above). Nowadays that is handled by a different dedicated codepath.
2024-12-21Rollup merge of #133087 - estebank:stmt-misparse, r=chenyukangJacob Pratt-2/+53
Detect missing `.` in method chain in `let` bindings and statements On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access. ``` error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map` --> $DIR/missing-dot-on-statement-expression.rs:7:29 | LL | let _ = [1, 2, 3].iter()map(|x| x); | ^^^ expected one of `.`, `;`, `?`, `else`, or an operator | help: you might have meant to write a method call | LL | let _ = [1, 2, 3].iter().map(|x| x); | + ```
2024-12-20Change comparison operators to have Fixity::NoneDavid Tolnay-10/+2
2024-12-21Do not suggest `foo.Bar`Esteban Küber-1/+6
2024-12-21Account for missing `.` in macros to avoid incorrect suggestionEsteban Küber-0/+4
2024-12-21Detect missing `.` in method chain in let bindings and statementsEsteban Küber-2/+44
On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access. ``` error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map` --> $DIR/missing-dot-on-statement-expression.rs:7:29 | LL | let _ = [1, 2, 3].iter()map(|x| x); | ^^^ expected one of `.`, `;`, `?`, `else`, or an operator | help: you might have meant to write a method call | LL | let _ = [1, 2, 3].iter().map(|x| x); | + ```
2024-12-20Reduce the amount of explicit FatalError.raise()bjorn3-14/+6
Instead use dcx.abort_if_error() or guar.raise_fatal() instead. These guarantee that an error actually happened previously and thus we don't silently abort.
2024-12-19Fix `Parser` size assertion on s390x.Nicholas Nethercote-3/+3
For some reason the memory layout is different on s390x.
2024-12-19Make `TokenType::from_u32` foolproof.Nicholas Nethercote-115/+126
Currently it relies on having the right integer for every variant, and if you add a variant you need to adjust the integers for all subsequent variants, which is a pain. This commit introduces a match guard formulation that takes advantage of the enum-to-integer conversion to avoid specifying the integer for each variant. And it does this via a macro to avoid lots of boilerplate.
2024-12-19Speed up `Parser::expected_token_types`.Nicholas Nethercote-704/+1263
The parser pushes a `TokenType` to `Parser::expected_token_types` on every call to the various `check`/`eat` methods, and clears it on every call to `bump`. Some of those `TokenType` values are full tokens that require cloning and dropping. This is a *lot* of work for something that is only used in error messages and it accounts for a significant fraction of parsing execution time. This commit overhauls `TokenType` so that `Parser::expected_token_types` can be implemented as a bitset. This requires changing `TokenType` to a C-style parameterless enum, and adding `TokenTypeSet` which uses a `u128` for the bits. (The new `TokenType` has 105 variants.) The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to the `check`/`eat` methods. This is for maximum speed. The elements in the pairs are always statically known; e.g. a `token::BinOp(token::Star)` is always paired with a `TokenType::Star`. So we now compute `TokenType`s in advance and pass them in to `check`/`eat` rather than the current approach of constructing them on insertion into `expected_token_types`. Values of these pair types can be produced by the new `exp!` macro, which is used at every `check`/`eat` call site. The macro is for convenience, allowing any pair to be generated from a single identifier. The ident/keyword filtering in `expected_one_of_not_found` is no longer necessary. It was there to account for some sloppiness in `TokenKind`/`TokenType` comparisons. The existing `TokenType` is moved to a new file `token_type.rs`, and all its new infrastructure is added to that file. There is more boilerplate code than I would like, but I can't see how to make it shorter.
2024-12-19Remove `bra`/`ket` naming.Nicholas Nethercote-24/+24
This is a naming convention used in a handful of spots in the parser for delimiters. It confused me when I first saw it a long time ago, and I've never liked it. A web search says "Bra-ket notation" exists in linear algebra but the terminology has zero prior use in a programming context, as far as I can tell. This commit changes it to `open`/`close`, which is consistent with the rest of the compiler.
2024-12-19Tweak some parser `check`/`eat` methods.Nicholas Nethercote-25/+20
The most significant is `check_keyword`: it now only pushes to `expected_token_types` if the keyword check fails, which matches how all the other `check` methods work. The remainder are just tweaks to make these methods more consistent with each other.
2024-12-19Rename `Parser::expected_tokens` as `Parser::expected_token_types`.Nicholas Nethercote-24/+25
Because the `Token` type is similar to but different to the `TokenType` type, and the difference is important, so we want to avoid confusion.
2024-12-18Auto merge of #134443 - joshtriplett:use-field-init-shorthand, ↵bors-1/+1
r=lqd,tgross35,nnethercote Use field init shorthand where possible Field init shorthand allows writing initializers like `tcx: tcx` as `tcx`. The compiler already uses it extensively. Fix the last few places where it isn't yet used. EDIT: this PR also updates `rustfmt.toml` to set `use_field_init_shorthand = true`.
2024-12-18Rollup merge of #134253 - nnethercote:overhaul-keywords, r=petrochenkov许杰友 Jieyou Xu (Joe)-4/+4
Overhaul keyword handling The compiler's list of keywords has some problems. - It contains several items that aren't keywords. - The order isn't quite right in a couple of places. - Some of the names of predicates relating to keywords are confusing. - rustdoc and rustfmt have their own (incorrect) versions of the keyword list. - `AllKeywords` is unnecessarily complex. r? ```@jieyouxu```
2024-12-18Rollup merge of #134161 - nnethercote:overhaul-token-cursors, r=spastorino许杰友 Jieyou Xu (Joe)-42/+72
Overhaul token cursors Some nice cleanups here. r? `````@davidtwco`````
2024-12-18Only have one source of truth for keywords.Nicholas Nethercote-2/+2
`rustc_symbol` is the source of truth for keywords. rustdoc has its own implicit definition of keywords, via the `is_doc_keyword`. It (presumably) intends to include all keywords, but it omits `yeet`. rustfmt has its own explicit list of Rust keywords. It also (presumably) intends to include all keywords, but it omits `await`, `builtin`, `gen`, `macro_rules`, `raw`, `reuse`, `safe`, and `yeet`. Also, it does linear searches through this list, which is inefficient. This commit fixes all of the above problems by introducing a new predicate `is_any_keyword` in rustc and using it in rustdoc and rustfmt. It documents that it's not the right predicate in most cases.
2024-12-18Simplify `AllKeywords`.Nicholas Nethercote-2/+2
It's a verbose reinvention of a range type, and can be cut down a lot.
2024-12-18Re-export more `rustc_span::symbol` things from `rustc_span`.Nicholas Nethercote-31/+19
`rustc_span::symbol` defines some things that are re-exported from `rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some closely related things such as `Ident` and `kw`. So you can do `use rustc_span::{Symbol, sym}` but you have to do `use rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good reason. This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`, and changes many `rustc_span::symbol::` qualifiers in `compiler/` to `rustc_span::`. This is a 200+ net line of code reduction, mostly because many files with two `use rustc_span` items can be reduced to one.
2024-12-18Overhaul `TokenTreeCursor`.Nicholas Nethercote-33/+63
- Move it to `rustc_parse`, which is the only crate that uses it. This lets us remove all the `pub` markers from it. - Change `next_ref` and `look_ahead` to `get` and `bump`, which work better for the `rustc_parse` uses. - This requires adding a `TokenStream::get` method, which is simple. - In `TokenCursor`, we currently duplicate the `DelimSpan`/`DelimSpacing`/`Delimiter` from the surrounding `TokenTree::Delimited` in the stack. This isn't necessary so long as we don't prematurely move past the `Delimited`, and is a small perf win on a very hot code path. - In `parse_token_tree`, we clone the relevant `TokenTree::Delimited` instead of constructing an identical one from pieces.
2024-12-18Rename `RefTokenTreeCursor`.Nicholas Nethercote-9/+9
Because `TokenStreamIter` is a much better name for a `TokenStream` iterator. Also rename the `TokenStream::trees` method as `TokenStream::iter`, and some local variables.
2024-12-17Use field init shorthand where possibleJosh Triplett-1/+1
Field init shorthand allows writing initializers like `tcx: tcx` as `tcx`. The compiler already uses it extensively. Fix the last few places where it isn't yet used.
2024-12-16Rollup merge of #134284 - estebank:issue-74863, r=lcnrMatthias Krüger-2/+2
Keep track of patterns that could have introduced a binding, but didn't When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error. ``` error[E0425]: cannot find value `title` in this scope --> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:18:30 | LL | if let Website { url, .. } = website { | ------------------- this pattern doesn't include `title`, which is available in `Website` LL | println!("[{}]({})", title, url); | ^^^^^ not found in this scope ``` Fix #74863.
2024-12-15Add hir::AttributeJonathan Dönszelmann-11/+8
2024-12-15Rename `value` field to `expr` to simplify later commits' diffsOli Scherer-5/+3
2024-12-14Rollup merge of #134192 - nnethercote:rm-Lexer-Parser-dep, r=compiler-errorsMatthias Krüger-97/+37
Remove `Lexer`'s dependency on `Parser`. Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`. But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created. This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form. The cost is slightly worse error messages in two obscure cases, as shown in these tests: - tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`. - tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser). In my opinion this cost is outweighed by the magnitude of the code cleanup. r? ```````@chenyukang```````
2024-12-13Keep track of patterns that could have introduced a binding, but didn'tEsteban Küber-2/+2
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error. ``` error[E0425]: cannot find value `title` in this scope --> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:19:30 | LL | println!("[{}]({})", title, url); | ^^^^^ not found in this scope | note: `Website` has a field `title` which could have been included in this pattern, but it wasn't --> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:17:12 | LL | / struct Website { LL | | url: String, LL | | title: Option<String> , | | ----- defined here LL | | } | |_- ... LL | if let Website { url, .. } = website { | ^^^^^^^^^^^^^^^^^^^ this pattern doesn't include `title`, which is available in `Website` ``` Fix #74863.