summary refs log tree commit diff
path: root/compiler/rustc_parse/src/lexer/tokentrees.rs
AgeCommit message (Collapse)AuthorLines
2024-01-11Fix lifetimes in `StringReader`.Nicholas Nethercote-10/+14
Two different lifetimes are conflated. This doesn't matter right now, but needs to be fixed for the next commit to work. And the more descriptive lifetime names make the code easier to read.
2024-01-08Remove `DiagnosticBuilder::delay_as_bug_without_consuming`.Nicholas Nethercote-3/+3
The existing uses are replaced in one of three ways. - In a function that also has calls to `emit`, just rearrange the code so that exactly one of `delay_as_bug` or `emit` is called on every path. - In a function returning a `DiagnosticBuilder`, use `downgrade_to_delayed_bug`. That's good enough because it will get emitted later anyway. - In `unclosed_delim_err`, one set of errors is being replaced with another set, so just cancel the original errors.
2024-01-08Make `DiagnosticBuilder::emit` consuming.Nicholas Nethercote-1/+1
This works for most of its call sites. This is nice, because `emit` very much makes sense as a consuming operation -- indeed, `DiagnosticBuilderState` exists to ensure no diagnostic is emitted twice, but it uses runtime checks. For the small number of call sites where a consuming emit doesn't work, the commit adds `DiagnosticBuilder::emit_without_consuming`. (This will be removed in subsequent commits.) Likewise, `emit_unless` becomes consuming. And `delay_as_bug` becomes consuming, while `delay_as_bug_without_consuming` is added (which will also be removed in subsequent commits.) All this requires significant changes to `DiagnosticBuilder`'s chaining methods. Currently `DiagnosticBuilder` method chaining uses a non-consuming `&mut self -> &mut Self` style, which allows chaining to be used when the chain ends in `emit()`, like so: ``` struct_err(msg).span(span).emit(); ``` But it doesn't work when producing a `DiagnosticBuilder` value, requiring this: ``` let mut err = self.struct_err(msg); err.span(span); err ``` This style of chaining won't work with consuming `emit` though. For that, we need to use to a `self -> Self` style. That also would allow `DiagnosticBuilder` production to be chained, e.g.: ``` self.struct_err(msg).span(span) ``` However, removing the `&mut self -> &mut Self` style would require that individual modifications of a `DiagnosticBuilder` go from this: ``` err.span(span); ``` to this: ``` err = err.span(span); ``` There are *many* such places. I have a high tolerance for tedious refactorings, but even I gave up after a long time trying to convert them all. Instead, this commit has it both ways: the existing `&mut self -> Self` chaining methods are kept, and new `self -> Self` chaining methods are added, all of which have a `_mv` suffix (short for "move"). Changes to the existing `forward!` macro lets this happen with very little additional boilerplate code. I chose to add the suffix to the new chaining methods rather than the existing ones, because the number of changes required is much smaller that way. This doubled chainging is a bit clumsy, but I think it is worthwhile because it allows a *lot* of good things to subsequently happen. In this commit, there are many `mut` qualifiers removed in places where diagnostics are emitted without being modified. In subsequent commits: - chaining can be used more, making the code more concise; - more use of chaining also permits the removal of redundant diagnostic APIs like `struct_err_with_code`, which can be replaced easily with `struct_err` + `code_mv`; - `emit_without_diagnostic` can be removed, which simplifies a lot of machinery, removing the need for `DiagnosticBuilderState`.
2023-12-24Remove `Parser` methods that duplicate `DiagCtxt` methods.Nicholas Nethercote-1/+1
2023-12-18Rename `ParseSess::span_diagnostic` as `ParseSess::dcx`.Nicholas Nethercote-2/+2
2023-12-11Add spacing information to delimiters.Nicholas Nethercote-34/+53
This is an extension of the previous commit. It means the output of something like this: ``` stringify!(let a: Vec<u32> = vec![];) ``` goes from this: ``` let a: Vec<u32> = vec![] ; ``` With this PR, it now produces this string: ``` let a: Vec<u32> = vec![]; ```
2023-12-11Improve `print_tts` by changing `tokenstream::Spacing`.Nicholas Nethercote-2/+7
`tokenstream::Spacing` appears on all `TokenTree::Token` instances, both punct and non-punct. Its current usage: - `Joint` means "can join with the next token *and* that token is a punct". - `Alone` means "cannot join with the next token *or* can join with the next token but that token is not a punct". The fact that `Alone` is used for two different cases is awkward. This commit augments `tokenstream::Spacing` with a new variant `JointHidden`, resulting in: - `Joint` means "can join with the next token *and* that token is a punct". - `JointHidden` means "can join with the next token *and* that token is a not a punct". - `Alone` means "cannot join with the next token". This *drastically* improves the output of `print_tts`. For example, this: ``` stringify!(let a: Vec<u32> = vec![];) ``` currently produces this string: ``` let a : Vec < u32 > = vec! [] ; ``` With this PR, it now produces this string: ``` let a: Vec<u32> = vec![] ; ``` (The space after the `]` is because `TokenTree::Delimited` currently doesn't have spacing information. The subsequent commit fixes this.) The new `print_tts` doesn't replicate original code perfectly. E.g. multiple space characters will be condensed into a single space character. But it's much improved. `print_tts` still produces the old, uglier output for code produced by proc macros. Because we have to translate the generated code from `proc_macro::Spacing` to the more expressive `token::Spacing`, which results in too much `proc_macro::Along` usage and no `proc_macro::JointHidden` usage. So `space_between` still exists and is used by `print_tts` in conjunction with the `Spacing` field. This change will also help with the removal of `Token::Interpolated`. Currently interpolated tokens are pretty-printed nicely via AST pretty printing. `Token::Interpolated` removal will mean they get printed with `print_tts`. Without this change, that would result in much uglier output for code produced by decl macro expansions. With this change, AST pretty printing and `print_tts` produce similar results. The commit also tweaks the comments on `proc_macro::Spacing`. In particular, it refers to "compound tokens" rather than "multi-char operators" because lifetimes aren't operators.
2023-11-21Fix `clippy::needless_borrow` in the compilerNilstrieb-3/+3
`x clippy compiler -Aclippy::all -Wclippy::needless_borrow --fix`. Then I had to remove a few unnecessary parens and muts that were exposed now.
2023-11-11Move unclosed delim errors to separate functionsjwang05-53/+58
2023-11-10Correctly handle while-let-chainssjwang05-1/+1
2023-11-09Catch an edge casesjwang05-1/+5
2023-11-09Catch stray { in let-chainssjwang05-1/+33
2023-10-30When encountering unclosed delimiters during parsing, check for diff markersEsteban Küber-18/+46
Fix #116252.
2023-10-12Reorder an expression to improve readability.Nicholas Nethercote-12/+7
2023-10-12Rename `Token::is_op` as `Token::is_punct`.Nicholas Nethercote-2/+5
For consistency with `proc_macro::Punct`.
2023-07-30inline format!() args up to and including rustc_middleMatthias Krüger-1/+1
2023-05-03Restrict `From<S>` for `{D,Subd}iagnosticMessage`.Nicholas Nethercote-2/+1
Currently a `{D,Subd}iagnosticMessage` can be created from any type that impls `Into<String>`. That includes `&str`, `String`, and `Cow<'static, str>`, which are reasonable. It also includes `&String`, which is pretty weird, and results in many places making unnecessary allocations for patterns like this: ``` self.fatal(&format!(...)) ``` This creates a string with `format!`, takes a reference, passes the reference to `fatal`, which does an `into()`, which clones the reference, doing a second allocation. Two allocations for a single string, bleh. This commit changes the `From` impls so that you can only create a `{D,Subd}iagnosticMessage` from `&str`, `String`, or `Cow<'static, str>`. This requires changing all the places that currently create one from a `&String`. Most of these are of the `&format!(...)` form described above; each one removes an unnecessary static `&`, plus an allocation when executed. There are also a few places where the existing use of `&String` was more reasonable; these now just use `clone()` at the call site. As well as making the code nicer and more efficient, this is a step towards possibly using `Cow<'static, str>` in `{D,Subd}iagnosticMessage::{Str,Eager}`. That would require changing the `From<&'a str>` impls to `From<&'static str>`, which is doable, but I'm not yet sure if it's worthwhile.
2023-04-10Fix typos in compilerDaniPopes-2/+2
2023-02-28refactor parse_token_trees to not return unmatched_delimsyukang-1/+0
2023-02-28rename unmatched_braces to unmatched_delimsyukang-5/+6
2023-02-28remove duplicated diagnostic for unclosed delimiteryukang-8/+9
2023-01-27Improve unexpected close and mismatch delimiter hint in TokenTreesReaderyukang-87/+42
2022-10-03Invert `is_top_level` to avoid negation.Nicholas Nethercote-5/+5
2022-10-03Remove `TokenStreamBuilder`.Nicholas Nethercote-37/+20
It's now only used in one function. Also, the "should we glue the tokens?" check is only necessary when pushing a `TokenTree::Token`, not when pushing a `TokenTree::Delimited`. As part of this, we now do the "should we glue the tokens?" check immediately, which avoids having look back at the previous token. It also puts all the logic dealing with token gluing in a single place.
2022-10-03Inline and remove `parse_token_tree_non_delim_non_eof`.Nicholas Nethercote-16/+14
It has a single call site.
2022-10-03Merge `parse_token_trees_until_close_delim` and `parse_all_token_trees`.Nicholas Nethercote-23/+16
Because they're very similar, and this will allow some follow-up changes.
2022-09-28Address review comments.Nicholas Nethercote-7/+7
2022-09-27Rename some variables.Nicholas Nethercote-10/+10
These make the delimiter processing clearer.
2022-09-27Minor improvements.Nicholas Nethercote-3/+5
Add some comments, and mark one path as unreachable.
2022-09-26[ui] Rearrange `StringReader`/`TokenTreesReader` creation.Nicholas Nethercote-19/+18
`TokenTreesReader` wraps a `StringReader`, but the `into_token_trees` function obscures this. This commit moves to a more straightforward control flow.
2022-09-26Remove `ast::Token::take`.Nicholas Nethercote-2/+1
Instead of replacing `TokenTreesReader::token` in two steps, we can just do it in one, which is both simpler and faster.
2022-09-26Remove `TokenTreesReader::bump`.Nicholas Nethercote-17/+9
It's an unnecessary layer that obfuscates when I am looking for optimizations.
2022-09-26Clarify spacing computation.Nicholas Nethercote-6/+12
The spacing computation is done in two parts. In the first part `next_token` and `bump` use `Spacing::Alone` to mean "preceded by whitespace" and `Spacing::Joint` to mean the opposite. In the second part `parse_token_tree_other` then adjusts the `spacing` value to mean the usual thing (i.e. "is the following token joinable punctuation?"). This shift in meaning is very confusing and it took me some time to understand what was going on. This commit changes the first part to use a bool, and adds some comments, which makes things much clearer.
2022-09-26Rearrange `TokenTreesReader::parse_token_tree`.Nicholas Nethercote-178/+168
`parse_token_tree` is basically a match with four arms: `Eof`, `OpenDelim`, `CloseDelim`, and "other". It has two call sites, and at each call site one of the arms is unreachable. It's also not inlined. This commit removes `parse_token_tree` by splitting it into four functions and inlining them. This avoids some repeated conditional tests and also some non-inlined function calls on the hot path.
2022-08-01Inline `TokenStreamBuilder::push`.Nicholas Nethercote-0/+1
Because it's small and hot.
2022-08-01Avoid an unnecessary `return`.Nicholas Nethercote-2/+2
2022-07-29Remove `TreeAndSpacing`.Nicholas Nethercote-16/+12
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is not quite right. `Spacing` makes sense for `TokenTree::Token`, but does not make sense for `TokenTree::Delimited`, because a `TokenTree::Delimited` cannot be joined with another `TokenTree`. This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`, changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the `TreeAndSpacing` typedef. The commit removes these two impls: - `impl From<TokenTree> for TokenStream` - `impl From<TokenTree> for TreeAndSpacing` These were useful, but also resulted in code with many `.into()` calls that was hard to read, particularly for anyone not highly familiar with the relevant types. This commit makes some other changes to compensate: - `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`. - `TokenStream::token_{alone,joint}()` are added. - `TokenStream::delimited` is added. This results in things like this: ```rust TokenTree::token(token::Semi, stmt.span).into() ``` changing to this: ```rust TokenStream::token_alone(token::Semi, stmt.span) ``` This makes the type of the result, and its spacing, clearer. These changes also simplifies `Cursor` and `CursorRef`, because they no longer need to distinguish between `next` and `next_with_spacing`.
2022-04-28rustc_ast: Harmonize delimiter naming with `proc_macro::Delimiter`Vadim Petrochenkov-6/+6
2022-02-284 - Make more use of `let_chains`Caio-8/+7
Continuation of #94376. cc #53667
2020-09-20use if let instead of single match arm expressions to compact code and ↵Matthias Krüger-6/+3
reduce nesting (clippy::single_match)
2020-09-03Rename IsJoint -> SpacingAleksey Kladov-11/+11
To match better naming from proc-macro
2020-09-03Condense StringReader's API to a single functionAleksey Kladov-1/+1
2020-09-01Simplify TokenTreesReaderAleksey Kladov-11/+11
This `joint_to_prev` bit of state is no longer needed.
2020-09-01Don't emit trivia tokensAleksey Kladov-13/+3
2020-08-30mv compiler to compiler/mark-0/+313