summary refs log tree commit diff
path: root/compiler/rustc_parse/src/parser/mod.rs
AgeCommit message (Collapse)AuthorLines
2024-08-16Overhaul token collection.Nicholas Nethercote-9/+7
This commit does the following. - Renames `collect_tokens_trailing_token` as `collect_tokens`, because (a) it's annoying long, and (b) the `_trailing_token` bit is less accurate now that its types have changed. - In `collect_tokens`, adds a `Option<CollectPos>` argument and a `UsePreAttrPos` in the return type of `f`. These are used in `parse_expr_force_collect` (for vanilla expressions) and in `parse_stmt_without_recovery` (for two different cases of expression statements). Together these ensure are enough to fix all the problems with token collection and assoc expressions. The changes to the `stringify.rs` test demonstrate some of these. - Adds a new test. The code in this test was causing an assertion failure prior to this commit, due to an invalid `NodeRange`. The extra complexity is annoying, but necessary to fix the existing problems.
2024-08-16Add an assertion to `NodeRange::new`.Nicholas Nethercote-0/+1
2024-08-16Convert a bool to `Trailing`.Nicholas Nethercote-1/+7
This pre-existing type is suitable for use with the return value of the `f` parameter in `collect_tokens_trailing_token`. The more descriptive name will be useful because the next commit will add another boolean value to the return value of `f`.
2024-08-14Use `impl PartialEq<TokenKind> for Token` more.Nicholas Nethercote-4/+4
This lets us compare a `Token` with a `TokenKind`. It's used a lot, but can be used even more, avoiding the need for some `.kind` uses.
2024-08-12Rollup merge of #128994 - nnethercote:fix-Parser-look_ahead-more, ↵Guillaume Gomez-4/+6
r=compiler-errors Fix bug in `Parser::look_ahead`. The special case was failing to handle invisible delimiters on one path. Fixes (but doesn't close until beta backported) #128895. r? `@davidtwco`
2024-08-12Fix bug in `Parser::look_ahead`.Nicholas Nethercote-4/+6
The special case was failing to handle invisible delimiters on one path. Fixes #128895.
2024-08-11Use assert_matches around the compilerMichael Goulet-1/+2
2024-08-03Rollup merge of #127921 - spastorino:stabilize-unsafe-extern-blocks, ↵Matthias Krüger-3/+0
r=compiler-errors Stabilize unsafe extern blocks (RFC 3484) # Stabilization report ## Summary This is a tracking issue for the RFC 3484: Unsafe Extern Blocks We are stabilizing `#![feature(unsafe_extern_blocks)]`, as described in [Unsafe Extern Blocks RFC 3484](https://github.com/rust-lang/rfcs/pull/3484). This feature makes explicit that declaring an extern block is unsafe. Starting in Rust 2024, all extern blocks must be marked as unsafe. In all editions, items within unsafe extern blocks may be marked as safe to use. RFC: https://github.com/rust-lang/rfcs/pull/3484 Tracking issue: #123743 ## What is stabilized ### Summary of stabilization We now need extern blocks to be marked as unsafe and items inside can also have safety modifiers (unsafe or safe), by default items with no modifiers are unsafe to offer easy migration without surprising results. ```rust unsafe extern { // sqrt (from libm) may be called with any `f64` pub safe fn sqrt(x: f64) -> f64; // strlen (from libc) requires a valid pointer, // so we mark it as being an unsafe fn pub unsafe fn strlen(p: *const c_char) -> usize; // this function doesn't say safe or unsafe, so it defaults to unsafe pub fn free(p: *mut core::ffi::c_void); pub safe static IMPORTANT_BYTES: [u8; 256]; pub safe static LINES: SyncUnsafeCell<i32>; } ``` ## Tests The relevant tests are in `tests/ui/rust-2024/unsafe-extern-blocks`. ## History - https://github.com/rust-lang/rust/pull/124482 - https://github.com/rust-lang/rust/pull/124455 - https://github.com/rust-lang/rust/pull/125077 - https://github.com/rust-lang/rust/pull/125522 - https://github.com/rust-lang/rust/issues/126738 - https://github.com/rust-lang/rust/issues/126749 - https://github.com/rust-lang/rust/issues/126755 - https://github.com/rust-lang/rust/pull/126757 - https://github.com/rust-lang/rust/pull/126758 - https://github.com/rust-lang/rust/issues/126756 - https://github.com/rust-lang/rust/pull/126973 - https://github.com/rust-lang/rust/pull/127535 - https://github.com/rust-lang/rustfmt/pull/6204 ## Unresolved questions I am not aware of any unresolved questions.
2024-08-03Rollup merge of #128483 - nnethercote:still-more-cfg-cleanups, r=petrochenkovMatthias Krüger-20/+50
Still more `cfg` cleanups Found while looking closely at `cfg`/`cfg_attr` processing code. r? `````````@petrochenkov`````````
2024-08-01Distinguish the two kinds of token range.Nicholas Nethercote-20/+50
When collecting tokens there are two kinds of range: - a range relative to the parser's full token stream (which we get when we are parsing); - a range relative to a single AST node's token stream (which we use within `LazyAttrTokenStreamImpl` when replacing tokens). These are currently both represented with `Range<u32>` and it's easy to mix them up -- until now I hadn't properly understood the difference. This commit introduces `ParserRange` and `NodeRange` to distinguish them. This also requires splitting `ReplaceRange` in two, giving the new types `ParserReplacement` and `NodeReplacement`. (These latter two names reduce the overloading of the word "range".) The commit also rewrites some comments to be clearer. The end result is a little more verbose, but much clearer.
2024-07-29Mark Parser::eat/check methods as must_useMichael Goulet-4/+15
2024-07-29Reformat `use` declarations.Nicholas Nethercote-6/+7
The previous commit updated `rustfmt.toml` appropriately. This commit is the outcome of running `x fmt --all` with the new formatting options.
2024-07-25improve error message when `global_asm!` uses `asm!` optionsFolkert-2/+5
2024-07-23Stabilize unsafe extern blocks (RFC 3484)Santiago Pastorino-3/+0
2024-07-19Auto merge of #127957 - matthiaskrgr:rollup-1u5ivck, r=matthiaskrgrbors-5/+5
Rollup of 6 pull requests Successful merges: - #127350 (Parser: Suggest Placing the Return Type After Function Parameters) - #127621 (Rewrite and rename `issue-22131` and `issue-26006` `run-make` tests to rmake) - #127662 (When finding item gated behind a `cfg` flag, point at it) - #127903 (`force_collect` improvements) - #127932 (rustdoc: fix `current` class on sidebar modnav) - #127943 (Don't allow unsafe statics outside of extern blocks) r? `@ghost` `@rustbot` modify labels: rollup
2024-07-19Rollup merge of #127903 - nnethercote:force_collect-improvements, r=petrochenkovMatthias Krüger-5/+4
`force_collect` improvements Yet more cleanups relating to `cfg_attr` processing. r? ````@petrochenkov````
2024-07-19Rollup merge of #127350 - veera-sivarajan:bugfix-126311, r=lcnrMatthias Krüger-0/+1
Parser: Suggest Placing the Return Type After Function Parameters Fixes #126311 This PR suggests placing the return type after the function parameters when it's misplaced after a `where` clause. This also tangentially improves diagnostics for cases like [this](https://github.com/veera-sivarajan/rust/blob/86d6f1312a77997ef994240e716288d61a343a6d/tests/ui/parser/issues/misplaced-return-type-without-where-issue-126311.rs#L1C1-L1C28) and adds doc comments for `parser::AllowPlus`.
2024-07-19Overhaul comments in `collect_tokens_trailing_token`.Nicholas Nethercote-0/+1
Adding details, clarifying lots of little things, etc. In particular, the commit adds details of an example. I find this very helpful, because it's taken me a long time to understand how this code works.
2024-07-19Make `Parser::num_bump_calls` 0-indexed.Nicholas Nethercote-0/+5
Currently in `collect_tokens_trailing_token`, `start_pos` and `end_pos` are 1-indexed by `replace_ranges` is 0-indexed, which is really confusing. Making them both 0-indexed makes debugging much easier.
2024-07-19Simplify `CaptureState::inner_attr_ranges`.Nicholas Nethercote-1/+1
The `Option`s within the `ReplaceRange`s within the hashmap are always `None`. This PR omits them and inserts them when they are extracted from the hashmap.
2024-07-19Remove an unnecessary `ForceCollect::Yes`.Nicholas Nethercote-5/+4
No need to collect tokens on this recovery path, because the parsed statement isn't even looked at.
2024-07-18Parser: Suggest Placing the Return Type After Function ParametersVeera-0/+1
2024-07-18Remove `TrailingToken`.Nicholas Nethercote-11/+1
It's used in `Parser::collect_tokens_trailing_token` to decide whether to capture a trailing token. But the callers actually know whether to capture a trailing token, so it's simpler for them to just pass in a bool. Also, the `TrailingToken::Gt` case was weird, because it didn't result in a trailing token being captured. It could have been subsumed by the `TrailingToken::MaybeComma` case, and it effectively is in the new code.
2024-07-16Remove references to `maybe_whole_expr`.Nicholas Nethercote-1/+0
It was removed in #126571.
2024-07-14Rollup merge of #127273 - nnethercote:fix-DebugParser, r=workingjubileeMatthias Krüger-8/+10
Fix `DebugParser`. I tried using this and it didn't work at all. `prev_token` is never eof, so the accumulator is always false, which means the `then_some` always returns `None`, which means `scan` always returns `None`, and `tokens` always ends up an empty vec. I'm not sure how this code was supposed to work. (An aside: I find `Iterator::scan` to be a pretty wretched function, that produces code which is very hard to understand. Probably why this is just one of two uses of it in the entire compiler.) This commit changes it to a simpler imperative style that produces a valid `tokens` vec. r? `@workingjubilee`
2024-07-13Rollup merge of #127558 - nnethercote:more-Attribute-cleanups, r=petrochenkovJubilee-1/+1
More attribute cleanups A follow-up to #127308. r? ```@petrochenkov```
2024-07-13Fix `DebugParser`.Nicholas Nethercote-8/+10
It currently doesn't work at all. This commit changes it to a simpler imperative style that produces a valid `tokens` vec. (An aside: I find `Iterator::scan` to be a pretty wretched function, that produces code which is very hard to understand. Probably why this is just one of two uses of it in the entire compiler.)
2024-07-12Add a new special case to `Parser::look_ahead`.Nicholas Nethercote-0/+29
This new special case is simpler than the old special case because it only is used when `dist == 1`. But that's still enough to cover ~98% of cases. This results in equivalent performance to the old special case, and identical behaviour as the general case.
2024-07-12Remove the bogus special case from `Parser::look_ahead`.Nicholas Nethercote-35/+2
The general case at the bottom of `look_ahead` is slow, because it clones the token cursor. Above it there is a special case for performance that is hit most of the time and avoids the cloning. Unfortunately, its behaviour differs from the general case in two ways. - When within a pair of delimiters, if you look any distance past the closing delimiter you get the closing delimiter instead of what comes after the closing delimiter. - It uses `tree_cursor.look_ahead(dist - 1)` which totally confuses tokens with token trees. This means that only the first token in a token tree will be seen. E.g. in a sequence like `{ a }` the `a` and `}` will be skipped over. Bad! It's likely that these differences weren't noticed before now because the use of `look_ahead` in the parser is limited to small distances and relatively few contexts. Removing the special case causes slowdowns up of to 2% on a range of benchmarks. The next commit will add a new, correct special case to regain that lost performance.
2024-07-09Move `Spacing` into `FlatToken`.Nicholas Nethercote-1/+1
It's only needed for the `FlatToken::Token` variant. This makes things a little more concise.
2024-07-07Rename some attribute types for consistency.Nicholas Nethercote-9/+8
- `AttributesData` -> `AttrsTarget` - `AttrTokenTree::Attributes` -> `AttrTokenTree::AttrsTarget` - `FlatToken::AttrTarget` -> `FlatToken::AttrsTarget`
2024-07-07Simplify `ReplaceRange`.Nicholas Nethercote-2/+2
Currently the second element is a `Vec<(FlatToken, Spacing)>`. But the vector always has zero or one elements, and the `FlatToken` is always `FlatToken::AttrTarget` (which contains an `AttributesData`), and the spacing is always `Alone`. So we can simplify it to `Option<AttributesData>`. An assertion in `to_attr_token_stream` can can also be removed, because `new_tokens.len()` was always 0 or 1, which means than `range.len()` is always greater than or equal to it, because `range.is_empty()` is always false (as per the earlier assertion).
2024-07-02Shrink parser positions from `usize` to `u32`.Nicholas Nethercote-3/+3
The number of source code bytes can't exceed a `u32`'s range, so a token position also can't. This reduces the size of `Parser` and `LazyAttrTokenStreamImpl` by eight bytes each.
2024-06-20Properly gate `safe` keyword in pre-expansionMichael Goulet-0/+3
2024-06-19Auto merge of #126678 - nnethercote:fix-duplicated-attrs-on-nt-expr, ↵bors-11/+0
r=petrochenkov Fix duplicated attributes on nonterminal expressions This PR fixes a long-standing bug (#86055) whereby expression attributes can be duplicated when expanded through declarative macros. First, consider how items are parsed in declarative macros: ``` Items: - parse_nonterminal - parse_item(ForceCollect::Yes) - parse_item_ - attrs = parse_outer_attributes - parse_item_common(attrs) - maybe_whole! - collect_tokens_trailing_token ``` The important thing is that the parsing of outer attributes is outside token collection, so the item's tokens don't include the attributes. This is how it's supposed to be. Now consider how expression are parsed in declarative macros: ``` Exprs: - parse_nonterminal - parse_expr_force_collect - collect_tokens_no_attrs - collect_tokens_trailing_token - parse_expr - parse_expr_res(None) - parse_expr_assoc_with - parse_expr_prefix - parse_or_use_outer_attributes - parse_expr_dot_or_call ``` The important thing is that the parsing of outer attributes is inside token collection, so the the expr's tokens do include the attributes, i.e. in `AttributesData::tokens`. This PR fixes the bug by rearranging expression parsing to that outer attribute parsing happens outside of token collection. This requires a number of small refactorings because expression parsing is somewhat complicated. While doing so the PR makes the code a bit cleaner and simpler, by eliminating `parse_or_use_outer_attributes` and `Option<AttrWrapper>` arguments (in favour of the simpler `parse_outer_attributes` and `AttrWrapper` arguments), and simplifying `LhsExpr`. r? `@petrochenkov`
2024-06-19Refactor `parse_expr_res`.Nicholas Nethercote-11/+0
This removes the final `Option<AttrWrapper>` argument.
2024-06-18Prefer `dcx` methods over fields or fields' methodsOli Scherer-1/+1
2024-06-17Make parse_seq_to_before_tokens take expected/nonexpected tokens, use in ↵Michael Goulet-23/+16
parse_precise_capturing_syntax
2024-06-07Rollup merge of #126052 - nnethercote:rustc_parse-more-cleanups, r=spastorinoMatthias Krüger-21/+22
More `rustc_parse` cleanups Following on from #125815. r? `@spastorino`
2024-06-07Revert "Create const block DefIds in typeck instead of ast lowering"Oli Scherer-5/+8
This reverts commit ddc5f9b6c1f21da5d4596bf7980185a00984ac42.
2024-06-06Reduce `pub` exposure.Nicholas Nethercote-21/+22
2024-06-04Handle safety keyword for extern block inner itemsSantiago Pastorino-0/+2
2024-05-28Create const block DefIds in typeck instead of ast loweringOli Scherer-8/+5
2024-05-17Rename Unsafe to SafetySantiago Pastorino-5/+5
2024-05-14Remove `NtIdent` and `NtLifetime`.Nicholas Nethercote-0/+6
The extra span is now recorded in the new `TokenKind::NtIdent` and `TokenKind::NtLifetime`. These both consist of a single token, and so there's no operator precedence problems with inserting them directly into the token stream. The other way to do this would be to wrap the ident/lifetime in invisible delimiters, but there's a lot of code that assumes an interpolated ident/lifetime fits in a single token, and changing all that code to work with invisible delimiters would have been a pain. (Maybe it could be done in a follow-up.) This change might not seem like much of a win, but it's a first step toward the much bigger and long-desired removal of `Nonterminal` and `TokenKind::Interpolated`. That change is big and complex enough that it's worth doing this piece separately. (Indeed, this commit is based on part of a late commit in #114647, a prior attempt at that big and complex change.)
2024-05-13Remove a `Span` from `TokenKind::Interpolated`.Nicholas Nethercote-20/+8
This span records the declaration of the metavariable in the LHS of the macro. It's used in a couple of error messages. Unfortunately, it gets in the way of the long-term goal of removing `TokenKind::Interpolated`. So this commit removes it, which degrades a couple of (obscure) error messages but makes things simpler and enables the next commit.
2024-05-09Add `ErrorGuaranteed` to `Recovered::Yes` and use it more.Nicholas Nethercote-18/+4
The starting point for this was identical comments on two different fields, in `ast::VariantData::Struct` and `hir::VariantData::Struct`: ``` // FIXME: investigate making this a `Option<ErrorGuaranteed>` recovered: bool ``` I tried that, and then found that I needed to add an `ErrorGuaranteed` to `Recovered::Yes`. Then I ended up using `Recovered` instead of `Option<ErrorGuaranteed>` for these two places and elsewhere, which required moving `ErrorGuaranteed` from `rustc_parse` to `rustc_ast`. This makes things more consistent, because `Recovered` is used in more places, and there are fewer uses of `bool` and `Option<ErrorGuaranteed>`. And safer, because it's difficult/impossible to set `recovered` to `Recovered::Yes` without having emitted an error.
2024-05-08Auto merge of #124779 - workingjubilee:debug-formatting-my-beloved, ↵bors-9/+53
r=compiler-errors Improve `rustc_parse::Parser`'s debuggability The main event is the final commit where I add `Parser::debug_lookahead`. Everything else was basically cleaning up things that bugged me (debugging, as it were) until I felt comfortable enough to actually work on it. The motivation is that it's annoying as hell to try to figure out how the debug infra works in rustc without having basic queries like `debug!(?parser);` come up "empty". However, Parser has a lot of fields that are mostly irrelevant for most debugging, like the entire ParseSess. I think `Parser::debug_lookahead` with a capped lookahead might be fine as a general-purpose Debug impl, but this adapter version was suggested to allow more choice, and admittedly, it's a refined version of what I was already handrolling just to get some insight going.
2024-05-07compiler: add `Parser::debug_lookahead`Jubilee Young-0/+41
I tried debugging a parser-related issue but found it annoying to not be able to easily peek into the Parser's token stream. Add a convenience fn that offers an opinionated view into the parser, but one that is useful for answering basic questions about parser state.
2024-05-07compiler: derive Debug in parserJubilee Young-8/+11
It's annoying to debug the parser if you have to stop every five seconds to add a Debug impl.