about summary refs log tree commit diff
path: root/compiler/rustc_parse/src/parser/mod.rs
AgeCommit message (Collapse)AuthorLines
2025-09-09Rollup merge of #145463 - jieyouxu:error-suffix, r=fmeaseMatthias Krüger-1/+4
Reject invalid literal suffixes in tuple indexing, tuple struct indexing, and struct field name position Tracking issue: rust-lang/rust#60210 Closes rust-lang/rust#60210 ## Summary Bump the ["suffixes on a tuple index are invalid" non-lint pseudo future-incompatibility warning (#60210)][issue-60210][^non-lint] to a **hard error** across all editions, rejecting the remaining carve outs from accidentally accepted invalid suffixes since Rust **1.27**. - We accidentally accepted invalid suffixes in tuple indexing positions in Rust **1.27**. Originally reported at https://github.com/rust-lang/rust/issues/59418. - We tried to hard reject all invalid suffixes in https://github.com/rust-lang/rust/pull/59421, but unfortunately it turns out there were proc macros accidentally relying on it: https://github.com/rust-lang/rust/issues/60138. - We temporarily accepted `{i,u}{32,size}` in https://github.com/rust-lang/rust/pull/60186 (the "*carve outs*") to mitigate *immediate* ecosystem impact, but it came with an FCW warning indicating that we wanted to reject it after a few Rust releases. - Now (1.89.0) is a few Rust releases later (1.35.0), thus I'm proposing to **also reject the carve outs**. - `std::mem::offset_of!` stabilized in Rust **1.77.0** happens to use the same "don't expect suffix" code path which has the carve outs, so it also accepted the carve out suffixes. I'm proposing to **reject this case as well**. ## What specifically breaks? Code that still relied on invalid `{i,u}{32,size}` suffixes being temporarily accepted by rust-lang/rust#60186 as an ecosystem impact mitigation measure (cf. rust-lang/rust#60138). Specifically, the following cases (particularly the construction of these forms in proc macros like reported in rust-lang/rust#60138): ### Position 1: Invalid `{i,u}{32,size}` suffixes in tuple indexing ```rs fn main() { let _x = (42,).0invalid; // Already error, already rejected by #59421 let _x = (42,).0i8; // Already error, not one of the #60186 carve outs. let _x = (42,).0usize; // warning: suffixes on a tuple index are invalid } ``` ### Position 2: Invalid `{i,u}{32,size}` suffixes in tuple struct indexing ```rs fn main() { struct X(i32); let _x = X(42); let _x = _x.0invalid; // Already error, already rejected by #59421 let _x = _x.0i8; // Already error, not one of the #60186 carve outs. let _x = _x.0usize; // warning: suffixes on a tuple index are invalid } ``` ### Position 3: Invalid `{i,u}{32,size}` suffixes in numeric struct field names ```rs fn main() { struct X(i32, i32, i32); let _x = X(1, 2, 3); let _y = X { 0usize: 42, 1: 42, 2: 42 }; // warning: suffixes on a tuple index are invalid match _x { X { 0usize: 1, 1: 2, 2: 3 } => todo!(), // warning: suffixes on a tuple index are invalid _ => {} } } ``` ### Position 4: Invalid `{i,u}{32,size}` suffixes in `std::mem::offset_of!` While investigating the warning, unfortunately I noticed `std::mem::offset_of!` also happens to use the "expect no suffix" code path which had the carve outs. So this was accepted since Rust **1.77.0** with the same FCW: ```rs fn main() { #[repr(C)] pub struct Struct<T>(u8, T); assert_eq!(std::mem::offset_of!(Struct<u32>, 0usize), 0); //~^ WARN suffixes on a tuple index are invalid } ``` ### The above forms in proc macros For instance, constructions like (see tracking issue rust-lang/rust#60210): ```rs let i = 0; quote! { foo.$i } ``` where the user needs to actually write ```rs let i = syn::Index::from(0); quote! { foo.$i } ``` ### Crater results Conducted a crater run (https://github.com/rust-lang/rust/pull/145463#issuecomment-3194920383). - https://github.com/AmlingPalantir/r4/tree/256af3c72f094b298cd442097ef7c571d8001f29: genuine regression; "invalid suffix `usize`" in derive macro. Has a ton of other build warnings, last updated 6 years ago. - Exactly the kind of intended breakage. Minimized down to https://github.com/AmlingPalantir/r4/blob/256af3c72f094b298cd442097ef7c571d8001f29/validates_derive/src/lib.rs#L71-L75, where when interpolation uses `quote`'s `ToTokens` on a `usize` index (i.e. on tuple struct `Tup(())`), the generated suffix becomes `.0usize` (cf. Position 2). - Notified crate author of breakage in https://github.com/AmlingPalantir/r4/issues/1. - Other failures are unrelated or spurious. ## Review remarks - Commits 1-3 expands the test coverage to better reflect the current situation before doing any functional changes. - Commit 4 is an intentional **breaking change**. We bump the non-lint "suffixes on a tuple index are invalid" warning into a hard error. Thus, this will need a crater run and a T-lang FCP. ## Tasks - [x] Run crater to check if anyone is still relying on this being not a hard error. Determine degree of ecosystem breakage. - [x] If degree of breakage seems acceptable, draft nomination report for T-lang for FCP. - [x] Determine hard error on Edition 2024+, or on all editions. ## Accompanying Reference update - https://github.com/rust-lang/reference/pull/1966 [^non-lint]: The FCW was implemented as a *non-lint* warning (meaning it has no associated lint name, and you can't `#![deny(..)]` it) because spans coming from proc macros could not be distinguished from regular field access. This warning was also intentionally impossible to silence. See https://github.com/rust-lang/rust/pull/60186#issuecomment-485581694. [issue-60210]: https://github.com/rust-lang/rust/issues/60210
2025-09-06Make `LetChainsPolicy` public for rustfmt usageMoritz Hedtke-0/+2
2025-08-25Remove the lifetime from `ExpTokenPair`/`SeqSep`.Nicholas Nethercote-31/+31
`TokenKind` now impls `Copy`, so we can store it directly rather than a reference.
2025-08-22Rewrite the new attribute parserJonathan Brouwer-6/+7
2025-08-20Auto merge of #145348 - nnethercote:parse_token_tree-speedup-for-uom, ↵bors-6/+17
r=petrochenkov Sometimes skip over tokens in `parse_token_tree`. r? `@petrochenkov`
2025-08-18Turn invalid index suffixes into hard errorsJieyou Xu-1/+4
2025-08-14Add FnContext in parser for diagnosticxizheyin-1/+1
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
2025-08-14Sometimes skip over tokens in `parse_token_tree`.Nicholas Nethercote-6/+17
This sometimes avoids a lot of `bump` calls.
2025-08-09remove `P`Deadbeef-16/+15
2025-07-13Auto merge of #143461 - folkertdev:cfg-select-builtin-macro, r=petrochenkovbors-1/+5
make `cfg_select` a builtin macro tracking issue: https://github.com/rust-lang/rust/issues/115585 This parses mostly the same as the `macro cfg_select` version, except: 1. wrapping in double brackets is no longer supported (or needed): `cfg_select {{ /* ... */ }}` is now rejected. 2. in an expression context, the rhs is no longer wrapped in a block, so that this now works: ```rust fn main() { println!(cfg_select! { unix => { "foo" } _ => { "bar" } }); } ``` 3. a single wildcard rule is now supported: `cfg_select { _ => 1 }` now works I've also added an error if none of the rules evaluate to true, and warnings for any arms that follow the `_` wildcard rule. cc `@traviscross` if I'm missing any feature that should/should not be included r? `@petrochenkov` for the macro logic details
2025-07-13make `cfg_select` a builtin macroFolkert de Vries-1/+5
2025-07-07const-block-as-pattern: do not refer to no-longer-existing nightly featureRalf Jung-2/+4
2025-06-26Add Ident::is_non_reserved_identMichael Goulet-1/+1
2025-05-27move asm parsing code into `rustc_parse`Folkert de Vries-0/+1
2025-04-30Rollup merge of #140494 - ehuss:document-restrictions, r=traviscross,SparrowLiiMatthias Krüger-0/+49
Parser: Document restrictions I had trouble easily understanding what these various flags do. This is my attempt to try to explain what these do.
2025-04-30ast: Remove token visiting from AST visitorVadim Petrochenkov-4/+0
It's no longer necessary after the removal of nonterminal tokens in #124141.
2025-04-29Parser: Document restrictionsEric Huss-0/+49
I had trouble easily understanding what these various flags do. This is my attempt to try to explain what these do.
2025-04-29Move various token stream things from `rustc_parse` to `rustc_ast`.Nicholas Nethercote-172/+4
Specifically: `TokenCursor`, `TokenTreeCursor`, `LazyAttrTokenStreamImpl`, `FlatToken`, `make_attr_token_stream`, `ParserRange`, `NodeRange`. `ParserReplacement`, and `NodeReplacement`. These are all related to token streams, rather than actual parsing. This will facilitate the simplifications in the next commit.
2025-04-22Move make_unclosed_delims_error to lexer/diagonostics.rsxizheyin-25/+1
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
2025-04-21Remove `token::{Open,Close}Delim`.Nicholas Nethercote-72/+54
By replacing them with `{Open,Close}{Param,Brace,Bracket,Invisible}`. PR #137902 made `ast::TokenKind` more like `lexer::TokenKind` by replacing the compound `BinOp{,Eq}(BinOpToken)` variants with fieldless variants `Plus`, `Minus`, `Star`, etc. This commit does a similar thing with delimiters. It also makes `ast::TokenKind` more similar to `parser::TokenType`. This requires a few new methods: - `TokenKind::is_{,open_,close_}delim()` replace various kinds of pattern matches. - `Delimiter::as_{open,close}_token_kind` are used to convert `Delimiter` values to `TokenKind`. Despite these additions, it's a net reduction in lines of code. This is because e.g. `token::OpenParen` is so much shorter than `token::OpenDelim(Delimiter::Parenthesis)` that many multi-line forms reduce to single line forms. And many places where the number of lines doesn't change are still easier to read, just because the names are shorter, e.g.: ``` - } else if self.token != token::CloseDelim(Delimiter::Brace) { + } else if self.token != token::CloseBrace { ```
2025-04-14Auto merge of #124141 - ↵bors-30/+11
nnethercote:rm-Nonterminal-and-TokenKind-Interpolated, r=petrochenkov Remove `Nonterminal` and `TokenKind::Interpolated` A third attempt at this; the first attempt was #96724 and the second was #114647. r? `@ghost`
2025-04-08Allow for reparsing failure when reparsing a pasted metavar.Nicholas Nethercote-3/+10
Fixes #139445. The additional errors aren't great but the first one is still good and it's the most important, and imperfect errors are better than ICEing.
2025-04-08Allow for missing invisible close delim when reparsing an expression.Nicholas Nethercote-1/+6
This can happen when invalid syntax is passed to a declarative macro. We shouldn't be too strict about the token stream position once the parser has rejected the invalid syntax. Fixes #139248.
2025-04-04Apply `Recovery::Forbidden` when reparsing pasted macro fragments.Nicholas Nethercote-1/+16
Fixes #137874. Removes `tests/crashes/137874.rs`; the new test is simpler (defines its own macro) but tests the same thing. The changes to the output of `tests/ui/associated-consts/issue-93835.rs` partly undo the changes seen when `NtTy` was removed in #133436, which is good.
2025-04-02Impl `Copy` for `Token` and `TokenKind`.Nicholas Nethercote-4/+4
2025-04-02Remove `NtBlock`, `Nonterminal`, and `TokenKind::Interpolated`.Nicholas Nethercote-26/+7
`NtBlock` is the last remaining variant of `Nonterminal`, so once it is gone then `Nonterminal` can be removed as well.
2025-04-02Remove `Token::uninterpolated_span`.Nicholas Nethercote-8/+27
In favour of the similar method on `Parser`, which works on things other than identifiers and lifetimes.
2025-04-02Remove `NtExpr` and `NtLiteral`.Nicholas Nethercote-2/+30
Notes about tests: - tests/ui/rfcs/rfc-2294-if-let-guard/feature-gate.rs: some messages are now duplicated due to repeated parsing. - tests/ui/rfcs/rfc-2497-if-let-chains/disallowed-positions.rs: ditto. - `tests/ui/proc-macro/macro-rules-derive-cfg.rs`: the diff looks large but the only difference is the insertion of a single invisible-delimited group around a metavar. - `tests/ui/attributes/nonterminal-expansion.rs`: a slight span degradation, somehow related to the recent massive attr parsing rewrite (#135726). I couldn't work out exactly what is going wrong, but I don't think it's worth holding things up for a single slightly suboptimal error message.
2025-03-21remove `feature(inline_const_pat)`lcnr-4/+11
2025-03-17If a label is placed on the block of a loop instead of the header, suggest ↵Zachary S-1/+1
moving it to the header.
2025-03-12Auto merge of #138083 - nnethercote:rm-NtItem-NtStmt, r=petrochenkovbors-4/+8
Remove `NtItem` and `NtStmt` Another piece of #124141. r? `@petrochenkov`
2025-03-07Remove `NtItem` and `NtStmt`.Nicholas Nethercote-4/+8
This involves replacing `nt_pretty_printing_compatibility_hack` with `stream_pretty_printing_compatibility_hack`. The handling of statements in `transcribe` is slightly different to other nonterminal kinds, due to the lack of `from_ast` implementation for empty statements. Notable test changes: - `tests/ui/proc-macro/expand-to-derive.rs`: the diff looks large but the only difference is the insertion of a single invisible-delimited group around a metavar.
2025-03-06Parse and allow const use closuresSantiago Pastorino-3/+3
2025-03-03Replace `ast::TokenKind::BinOp{,Eq}` and remove `BinOpToken`.Nicholas Nethercote-4/+4
`BinOpToken` is badly named, because it only covers the assignable binary ops and excludes comparisons and `&&`/`||`. Its use in `ast::TokenKind` does allow a small amount of code sharing, but it's a clumsy factoring. This commit removes `ast::TokenKind::BinOp{,Eq}`, replacing each one with 10 individual variants. This makes `ast::TokenKind` more similar to `rustc_lexer::TokenKind`, which has individual variants for all operators. Although the number of lines of code increases, the number of chars decreases due to the frequent use of shorter names like `token::Plus` instead of `token::BinOp(BinOpToken::Plus)`.
2025-02-28Remove `NtPath`.Nicholas Nethercote-0/+1
2025-02-28Remove `NtMeta`.Nicholas Nethercote-0/+1
Note: there was an existing code path involving `Interpolated` in `MetaItem::from_tokens` that was dead. This commit transfers that to the new form, but puts an `unreachable!` call inside it.
2025-02-28Remove `NtPat`.Nicholas Nethercote-1/+3
The one notable test change is `tests/ui/macros/trace_faulty_macros.rs`. This commit removes the complicated `Interpolated` handling in `expected_expression_found` that results in a longer error message. But I think the new, shorter message is actually an improvement. The original complaint was in #71039, when the error message started with "error: expected expression, found `1 + 1`". That was confusing because `1 + 1` is an expression. Other than that, the reporter said "the whole error message is not too bad if you ignore the first line". Subsequently, extra complexity and wording was added to the error message. But I don't think the extra wording actually helps all that much. In particular, it still says of the `1+1` that "this is expected to be expression". This repeats the problem from the original complaint! This commit removes the extra complexity, reverting to a simpler error message. This is primarily because the traversal is a pain without `Interpolated` tokens. Nonetheless, I think the error message is *improved*. It now starts with "expected expression, found `pat` metavariable", which is much clearer and the real problem. It also doesn't say anything specific about `1+1`, which is good, because the `1+1` isn't really relevant to the error -- it's the `$e:pat` that's important.
2025-02-22Greatly simplify lifetime captures in edition 2024Michael Goulet-1/+1
2025-02-21Avoid snapshotting the parser in `parse_path_inner`.Nicholas Nethercote-4/+3
2025-02-21Remove `NtTy`.Nicholas Nethercote-5/+28
Notes about tests: - tests/ui/parser/macro/trait-object-macro-matcher.rs: the syntax error is duplicated, because it occurs now when parsing the decl macro input, and also when parsing the expanded decl macro. But this won't show up for normal users due to error de-duplication. - tests/ui/associated-consts/issue-93835.rs: similar, plus there are some additional errors about this very broken code. - The changes to metavariable descriptions in #132629 are now visible in error message for several tests.
2025-02-21Remove `NtVis`.Nicholas Nethercote-2/+44
We now use invisible delimiters for expanded `vis` fragments, instead of `Token::Interpolated`.
2025-02-16Fix const items not being allowed to be called `r#move` or `r#static`Noratrieb-3/+3
Because of an ambiguity with const closures, the parser needs to ensure that for a const item, the `const` keyword isn't followed by a `move` or `static` keyword, as that would indicate a const closure: ```rust fn main() { const move // ... } ``` This check did not take raw identifiers into account, therefore being unable to distinguish between `const move` and `const r#move`. The latter is obviously not a const closure, so it should be allowed as a const item. This fixes the check in the parser to only treat `const ...` as a const closure if it's followed by the *proper keyword*, and not a raw identifier. Additionally, this adds a large test that tests for all raw identifiers in all kinds of positions, including `const`, to prevent issues like this one from occurring again.
2025-02-03tree-wide: parallel: Fully removed all `Lrc`, replaced with `Arc`Askar Safin-2/+2
2025-01-30Rollup merge of #135882 - hkBst:master, r=estebankMatthias Krüger-4/+2
simplify `similar_tokens` from `Option<Vec<_>>` to `&[_]` All uses immediately invoke contains, so maybe a further simplification is possible.
2025-01-24use `fmt::from_fn` in more places, instead of using structs that impl ↵Yotam Ofek-35/+25
formatting traits
2025-01-23simplify similar_tokens from Option<Vec<_>> to Vec<_>Marijn Schouten-4/+2
2025-01-21Only assert the `Parser` size on specific archesJosh Stone-2/+3
The size of this struct depends on the alignment of `u128`, for example powerpc64le and s390x have align-8 and end up with only 280 bytes. Our 64-bit tier-1 arches are the same though, so let's just assert on those.
2025-01-11Remove allocations from case-insensitive comparison to keywordsMark Rousskov-2/+3
2024-12-19Fix `Parser` size assertion on s390x.Nicholas Nethercote-3/+3
For some reason the memory layout is different on s390x.
2024-12-19Speed up `Parser::expected_token_types`.Nicholas Nethercote-127/+99
The parser pushes a `TokenType` to `Parser::expected_token_types` on every call to the various `check`/`eat` methods, and clears it on every call to `bump`. Some of those `TokenType` values are full tokens that require cloning and dropping. This is a *lot* of work for something that is only used in error messages and it accounts for a significant fraction of parsing execution time. This commit overhauls `TokenType` so that `Parser::expected_token_types` can be implemented as a bitset. This requires changing `TokenType` to a C-style parameterless enum, and adding `TokenTypeSet` which uses a `u128` for the bits. (The new `TokenType` has 105 variants.) The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to the `check`/`eat` methods. This is for maximum speed. The elements in the pairs are always statically known; e.g. a `token::BinOp(token::Star)` is always paired with a `TokenType::Star`. So we now compute `TokenType`s in advance and pass them in to `check`/`eat` rather than the current approach of constructing them on insertion into `expected_token_types`. Values of these pair types can be produced by the new `exp!` macro, which is used at every `check`/`eat` call site. The macro is for convenience, allowing any pair to be generated from a single identifier. The ident/keyword filtering in `expected_one_of_not_found` is no longer necessary. It was there to account for some sloppiness in `TokenKind`/`TokenType` comparisons. The existing `TokenType` is moved to a new file `token_type.rs`, and all its new infrastructure is added to that file. There is more boilerplate code than I would like, but I can't see how to make it shorter.