about summary refs log tree commit diff
path: root/compiler/rustc_parse/src/parser/mod.rs
AgeCommit message (Collapse)AuthorLines
2024-07-12Remove the bogus special case from `Parser::look_ahead`.Nicholas Nethercote-35/+2
The general case at the bottom of `look_ahead` is slow, because it clones the token cursor. Above it there is a special case for performance that is hit most of the time and avoids the cloning. Unfortunately, its behaviour differs from the general case in two ways. - When within a pair of delimiters, if you look any distance past the closing delimiter you get the closing delimiter instead of what comes after the closing delimiter. - It uses `tree_cursor.look_ahead(dist - 1)` which totally confuses tokens with token trees. This means that only the first token in a token tree will be seen. E.g. in a sequence like `{ a }` the `a` and `}` will be skipped over. Bad! It's likely that these differences weren't noticed before now because the use of `look_ahead` in the parser is limited to small distances and relatively few contexts. Removing the special case causes slowdowns up of to 2% on a range of benchmarks. The next commit will add a new, correct special case to regain that lost performance.
2024-07-09Move `Spacing` into `FlatToken`.Nicholas Nethercote-1/+1
It's only needed for the `FlatToken::Token` variant. This makes things a little more concise.
2024-07-07Rename some attribute types for consistency.Nicholas Nethercote-9/+8
- `AttributesData` -> `AttrsTarget` - `AttrTokenTree::Attributes` -> `AttrTokenTree::AttrsTarget` - `FlatToken::AttrTarget` -> `FlatToken::AttrsTarget`
2024-07-07Simplify `ReplaceRange`.Nicholas Nethercote-2/+2
Currently the second element is a `Vec<(FlatToken, Spacing)>`. But the vector always has zero or one elements, and the `FlatToken` is always `FlatToken::AttrTarget` (which contains an `AttributesData`), and the spacing is always `Alone`. So we can simplify it to `Option<AttributesData>`. An assertion in `to_attr_token_stream` can can also be removed, because `new_tokens.len()` was always 0 or 1, which means than `range.len()` is always greater than or equal to it, because `range.is_empty()` is always false (as per the earlier assertion).
2024-07-02Shrink parser positions from `usize` to `u32`.Nicholas Nethercote-3/+3
The number of source code bytes can't exceed a `u32`'s range, so a token position also can't. This reduces the size of `Parser` and `LazyAttrTokenStreamImpl` by eight bytes each.
2024-06-20Properly gate `safe` keyword in pre-expansionMichael Goulet-0/+3
2024-06-19Auto merge of #126678 - nnethercote:fix-duplicated-attrs-on-nt-expr, ↵bors-11/+0
r=petrochenkov Fix duplicated attributes on nonterminal expressions This PR fixes a long-standing bug (#86055) whereby expression attributes can be duplicated when expanded through declarative macros. First, consider how items are parsed in declarative macros: ``` Items: - parse_nonterminal - parse_item(ForceCollect::Yes) - parse_item_ - attrs = parse_outer_attributes - parse_item_common(attrs) - maybe_whole! - collect_tokens_trailing_token ``` The important thing is that the parsing of outer attributes is outside token collection, so the item's tokens don't include the attributes. This is how it's supposed to be. Now consider how expression are parsed in declarative macros: ``` Exprs: - parse_nonterminal - parse_expr_force_collect - collect_tokens_no_attrs - collect_tokens_trailing_token - parse_expr - parse_expr_res(None) - parse_expr_assoc_with - parse_expr_prefix - parse_or_use_outer_attributes - parse_expr_dot_or_call ``` The important thing is that the parsing of outer attributes is inside token collection, so the the expr's tokens do include the attributes, i.e. in `AttributesData::tokens`. This PR fixes the bug by rearranging expression parsing to that outer attribute parsing happens outside of token collection. This requires a number of small refactorings because expression parsing is somewhat complicated. While doing so the PR makes the code a bit cleaner and simpler, by eliminating `parse_or_use_outer_attributes` and `Option<AttrWrapper>` arguments (in favour of the simpler `parse_outer_attributes` and `AttrWrapper` arguments), and simplifying `LhsExpr`. r? `@petrochenkov`
2024-06-19Refactor `parse_expr_res`.Nicholas Nethercote-11/+0
This removes the final `Option<AttrWrapper>` argument.
2024-06-18Prefer `dcx` methods over fields or fields' methodsOli Scherer-1/+1
2024-06-17Make parse_seq_to_before_tokens take expected/nonexpected tokens, use in ↵Michael Goulet-23/+16
parse_precise_capturing_syntax
2024-06-07Rollup merge of #126052 - nnethercote:rustc_parse-more-cleanups, r=spastorinoMatthias Krüger-21/+22
More `rustc_parse` cleanups Following on from #125815. r? `@spastorino`
2024-06-07Revert "Create const block DefIds in typeck instead of ast lowering"Oli Scherer-5/+8
This reverts commit ddc5f9b6c1f21da5d4596bf7980185a00984ac42.
2024-06-06Reduce `pub` exposure.Nicholas Nethercote-21/+22
2024-06-04Handle safety keyword for extern block inner itemsSantiago Pastorino-0/+2
2024-05-28Create const block DefIds in typeck instead of ast loweringOli Scherer-8/+5
2024-05-17Rename Unsafe to SafetySantiago Pastorino-5/+5
2024-05-14Remove `NtIdent` and `NtLifetime`.Nicholas Nethercote-0/+6
The extra span is now recorded in the new `TokenKind::NtIdent` and `TokenKind::NtLifetime`. These both consist of a single token, and so there's no operator precedence problems with inserting them directly into the token stream. The other way to do this would be to wrap the ident/lifetime in invisible delimiters, but there's a lot of code that assumes an interpolated ident/lifetime fits in a single token, and changing all that code to work with invisible delimiters would have been a pain. (Maybe it could be done in a follow-up.) This change might not seem like much of a win, but it's a first step toward the much bigger and long-desired removal of `Nonterminal` and `TokenKind::Interpolated`. That change is big and complex enough that it's worth doing this piece separately. (Indeed, this commit is based on part of a late commit in #114647, a prior attempt at that big and complex change.)
2024-05-13Remove a `Span` from `TokenKind::Interpolated`.Nicholas Nethercote-20/+8
This span records the declaration of the metavariable in the LHS of the macro. It's used in a couple of error messages. Unfortunately, it gets in the way of the long-term goal of removing `TokenKind::Interpolated`. So this commit removes it, which degrades a couple of (obscure) error messages but makes things simpler and enables the next commit.
2024-05-09Add `ErrorGuaranteed` to `Recovered::Yes` and use it more.Nicholas Nethercote-18/+4
The starting point for this was identical comments on two different fields, in `ast::VariantData::Struct` and `hir::VariantData::Struct`: ``` // FIXME: investigate making this a `Option<ErrorGuaranteed>` recovered: bool ``` I tried that, and then found that I needed to add an `ErrorGuaranteed` to `Recovered::Yes`. Then I ended up using `Recovered` instead of `Option<ErrorGuaranteed>` for these two places and elsewhere, which required moving `ErrorGuaranteed` from `rustc_parse` to `rustc_ast`. This makes things more consistent, because `Recovered` is used in more places, and there are fewer uses of `bool` and `Option<ErrorGuaranteed>`. And safer, because it's difficult/impossible to set `recovered` to `Recovered::Yes` without having emitted an error.
2024-05-08Auto merge of #124779 - workingjubilee:debug-formatting-my-beloved, ↵bors-9/+53
r=compiler-errors Improve `rustc_parse::Parser`'s debuggability The main event is the final commit where I add `Parser::debug_lookahead`. Everything else was basically cleaning up things that bugged me (debugging, as it were) until I felt comfortable enough to actually work on it. The motivation is that it's annoying as hell to try to figure out how the debug infra works in rustc without having basic queries like `debug!(?parser);` come up "empty". However, Parser has a lot of fields that are mostly irrelevant for most debugging, like the entire ParseSess. I think `Parser::debug_lookahead` with a capped lookahead might be fine as a general-purpose Debug impl, but this adapter version was suggested to allow more choice, and admittedly, it's a refined version of what I was already handrolling just to get some insight going.
2024-05-07compiler: add `Parser::debug_lookahead`Jubilee Young-0/+41
I tried debugging a parser-related issue but found it annoying to not be able to easily peek into the Parser's token stream. Add a convenience fn that offers an opinionated view into the parser, but one that is useful for answering basic questions about parser state.
2024-05-07compiler: derive Debug in parserJubilee Young-8/+11
It's annoying to debug the parser if you have to stop every five seconds to add a Debug impl.
2024-05-05compiler: Privatize `Parser::current_closure`Jubilee Young-1/+1
This was added as pub in 2021 and remains only privately used in 2024!
2024-05-06Move some tests from `rustc_expand` to `rustc_parse`.Nicholas Nethercote-0/+14
There are some test cases involving `parse` and `tokenstream` and `mut_visit` that are located in `rustc_expand`. Because it used to be the case that constructing a `ParseSess` required the involvement of `rustc_expand`. However, since #64197 merged (a long time ago) `rust_expand` no longer needs to be involved. This commit moves the tests into `rustc_parse`. This is the optimal place for the `parse` tests. It's not ideal for the `tokenstream` and `mut_visit` tests -- they would be better in `rustc_ast` -- but they still rely on parsing, which is not available in `rustc_ast`. But `rustc_parse` is lower down in the crate graph and closer to `rustc_ast` than `rust_expand`, so it's still an improvement for them. The exact renaming is as follows: - rustc_expand/src/mut_visit/tests.rs -> rustc_parse/src/parser/mut_visit/tests.rs - rustc_expand/src/tokenstream/tests.rs -> rustc_parse/src/parser/tokenstream/tests.rs - rustc_expand/src/tests.rs + rustc_expand/src/parse/tests.rs -> compiler/rustc_parse/src/parser/tests.rs The latter two test files are combined because there's no need for them to be separate, and having a `rustc_parse::parser::parse` module would be weird. This also means some `pub(crate)`s can be removed.
2024-04-24Stabilise `inline_const`Gary Guo-2/+0
2024-04-23Rollup merge of #124169 - compiler-errors:parser-fatal, r=oli-obkMatthias Krüger-0/+1
Don't fatal when calling `expect_one_of` when recovering arg in `parse_seq` In `parse_seq`, when parsing a sequence of token-separated items, if we don't see a separator, we try to parse another item eagerly in order to give a good diagnostic and recover from a missing separator: https://github.com/rust-lang/rust/blob/d1a0fa5ed3ffe52d72f761d3c95cbeb0a9cdfe66/compiler/rustc_parse/src/parser/mod.rs#L900-L901 If parsing the item itself calls `expect_one_of`, then we will fatal because of #58903: https://github.com/rust-lang/rust/blob/d1a0fa5ed3ffe52d72f761d3c95cbeb0a9cdfe66/compiler/rustc_parse/src/parser/mod.rs#L513-L516 For `precise_capturing` feature I implemented, we do end up calling `expected_one_of`: https://github.com/rust-lang/rust/blob/d1a0fa5ed3ffe52d72f761d3c95cbeb0a9cdfe66/compiler/rustc_parse/src/parser/ty.rs#L712-L714 This leads the compiler to fatal *before* having emitted the first error, leading to absolutely no useful information for the user about what happened in the parser. This PR makes it so that we stop doing that. Fixes #124195
2024-04-23parser: remove ununsed(no reads) max_angle_bracket_count fieldklensy-3/+0
2024-04-19Don't fatal when calling expect_one_of when recovering arg in parse_seqMichael Goulet-0/+1
2024-04-18Simplify `static_assert_size`s.Nicholas Nethercote-1/+1
We want to run them on all 64-bit platforms.
2024-04-16Rollup merge of #123462 - fmease:rn-mod-sep-to-path-sep, r=nnethercoteLeón Orell Valerian Liehr-2/+2
Cleanup: Rename `ModSep` to `PathSep` `::` is usually referred to as the *path separator* (citation needed). The existing name `ModSep` for *module separator* is a bit misleading since it in fact separates the segments of arbitrary path segments, not only ones resolving to modules. Let me just give a shout-out to associated items (`T::Assoc`, `<Ty as Trait>::function`) and enum variants (`Option::None`). Motivation: Reduce friction for new contributors, prevent potential confusion. cc `@petrochenkov` r? nnethercote or compiler
2024-04-04Rename ModSep to PathSepLeón Orell Valerian Liehr-2/+2
2024-04-03Check `x86_64` size assertions on `aarch64`, tooZalathar-1/+1
This makes it easier for contributors on aarch64 workstations (e.g. Macs) to notice when these assertions have been violated.
2024-03-27Implement `mut ref`/`mut ref mut`Jules Bertholet-6/+10
2024-03-21Rollup merge of #122752 - nnethercote:Interpolated-cleanups, r=petrochenkovMatthias Krüger-11/+25
Interpolated cleanups Various cleanups I made while working on attempts to remove `Interpolated`, that are worth merging now. Best reviewed one commit at a time. r? `@petrochenkov`
2024-03-21Auto merge of #122718 - workingjubilee:eyeliner-for-contrast, r=lcnrbors-0/+13
Inline a bunch of trivial conditions in parser It is often the case that these small, conditional functions, when inlined, reveal notable optimization opportunities to LLVM. While saethlin has done a lot of good work on making these kinds of small functions not need `#[inline]` tags as much, being clearer about what we want inlined will get both the MIR opts and LLVM to pursue it more aggressively. On local perf runs, this seems fruitful. Let's see what rust-timer says. r? `@ghost`
2024-03-21Streamline `NamedMatch`.Nicholas Nethercote-4/+17
This commit combines `MatchedTokenTree` and `MatchedNonterminal`, which are often considered together, into a single `MatchedSingle`. It shares a representation with the newly-parameterized `ParseNtResult`. This will also make things much simpler if/when variants from `Interpolated` start being moved to `ParseNtResult`.
2024-03-21Use better variable names in some `maybe_whole!` calls.Nicholas Nethercote-1/+1
2024-03-21Use `maybe_whole!` to streamline `parse_item_common`.Nicholas Nethercote-6/+7
This requires changing `maybe_whole!` so it allows the value to be modified.
2024-03-19Inline conditionals in the parserJubilee Young-0/+13
There are a bunch of small helper conditionals we use. Inline them to get slightly better perf in a few cases, especially when rustc is compiled without PGO.
2024-03-15Make `unexpected` always "return" `PResult<()>` & add `unexpected_any`Maybe Waffle-2/+12
This prevents breakage when `?` no longer skews inference.
2024-03-05Rename all `ParseSess` variables/fields/lifetimes as `psess`.Nicholas Nethercote-9/+9
Existing names for values of this type are `sess`, `parse_sess`, `parse_session`, and `ps`. `sess` is particularly annoying because that's also used for `Session` values, which are often co-located, and it can be difficult to know which type a value named `sess` refers to. (That annoyance is the main motivation for this change.) `psess` is nice and short, which is good for a name used this much. The commit also renames some `parse_sess_created` values as `psess_created`.
2024-02-28Rename `DiagnosticBuilder` as `Diag`.Nicholas Nethercote-3/+3
Much better! Note that this involves renaming (and updating the value of) `DIAGNOSTIC_BUILDER` in clippy.
2024-02-20Add newtype for trailing in parserclubby789-8/+14
2024-02-20Add newtype for parser recoveryclubby789-12/+25
2024-02-20Add newtype for raw identsclubby789-4/+5
2024-02-15Add `ErrorGuaranteed` to `ast::LitKind::Err`, `token::LitKind::Err`.Nicholas Nethercote-1/+1
This mostly works well, and eliminates a couple of delayed bugs. One annoying thing is that we should really also add an `ErrorGuaranteed` to `proc_macro::bridge::LitKind::Err`. But that's difficult because `proc_macro` doesn't have access to `ErrorGuaranteed`, so we have to fake it.
2024-01-28Handle methodcalls & operators in patternsLieselotte-0/+1
2024-01-10Rename consuming chaining methods on `DiagnosticBuilder`.Nicholas Nethercote-2/+2
In #119606 I added them and used a `_mv` suffix, but that wasn't great. A `with_` prefix has three different existing uses. - Constructors, e.g. `Vec::with_capacity`. - Wrappers that provide an environment to execute some code, e.g. `with_session_globals`. - Consuming chaining methods, e.g. `Span::with_{lo,hi,ctxt}`. The third case is exactly what we want, so this commit changes `DiagnosticBuilder::foo_mv` to `DiagnosticBuilder::with_foo`. Thanks to @compiler-errors for the suggestion.
2024-01-08Make `DiagnosticBuilder::emit` consuming.Nicholas Nethercote-3/+3
This works for most of its call sites. This is nice, because `emit` very much makes sense as a consuming operation -- indeed, `DiagnosticBuilderState` exists to ensure no diagnostic is emitted twice, but it uses runtime checks. For the small number of call sites where a consuming emit doesn't work, the commit adds `DiagnosticBuilder::emit_without_consuming`. (This will be removed in subsequent commits.) Likewise, `emit_unless` becomes consuming. And `delay_as_bug` becomes consuming, while `delay_as_bug_without_consuming` is added (which will also be removed in subsequent commits.) All this requires significant changes to `DiagnosticBuilder`'s chaining methods. Currently `DiagnosticBuilder` method chaining uses a non-consuming `&mut self -> &mut Self` style, which allows chaining to be used when the chain ends in `emit()`, like so: ``` struct_err(msg).span(span).emit(); ``` But it doesn't work when producing a `DiagnosticBuilder` value, requiring this: ``` let mut err = self.struct_err(msg); err.span(span); err ``` This style of chaining won't work with consuming `emit` though. For that, we need to use to a `self -> Self` style. That also would allow `DiagnosticBuilder` production to be chained, e.g.: ``` self.struct_err(msg).span(span) ``` However, removing the `&mut self -> &mut Self` style would require that individual modifications of a `DiagnosticBuilder` go from this: ``` err.span(span); ``` to this: ``` err = err.span(span); ``` There are *many* such places. I have a high tolerance for tedious refactorings, but even I gave up after a long time trying to convert them all. Instead, this commit has it both ways: the existing `&mut self -> Self` chaining methods are kept, and new `self -> Self` chaining methods are added, all of which have a `_mv` suffix (short for "move"). Changes to the existing `forward!` macro lets this happen with very little additional boilerplate code. I chose to add the suffix to the new chaining methods rather than the existing ones, because the number of changes required is much smaller that way. This doubled chainging is a bit clumsy, but I think it is worthwhile because it allows a *lot* of good things to subsequently happen. In this commit, there are many `mut` qualifiers removed in places where diagnostics are emitted without being modified. In subsequent commits: - chaining can be used more, making the code more concise; - more use of chaining also permits the removal of redundant diagnostic APIs like `struct_err_with_code`, which can be replaced easily with `struct_err` + `code_mv`; - `emit_without_diagnostic` can be removed, which simplifies a lot of machinery, removing the need for `DiagnosticBuilderState`.
2024-01-03Rename some `Diagnostic` setters.Nicholas Nethercote-4/+3
`Diagnostic` has 40 methods that return `&mut Self` and could be considered setters. Four of them have a `set_` prefix. This doesn't seem necessary for a type that implements the builder pattern. This commit removes the `set_` prefixes on those four methods.