about summary refs log tree commit diff
path: root/compiler/rustc_parse/src/parser/mod.rs
AgeCommit message (Collapse)AuthorLines
2021-07-14Suggest a path separator if a stray colon is found in a match armFabian Wolff-1/+1
Co-authored-by: Esteban Kuber <estebank@users.noreply.github.com>
2021-06-06parser: Ensure that all nonterminals have tokens after parsingVadim Petrochenkov-0/+1
2021-05-18Stabilize extended_key_value_attributesJoshua Nelson-13/+0
# Stabilization report ## Summary This stabilizes using macro expansion in key-value attributes, like so: ```rust #[doc = include_str!("my_doc.md")] struct S; #[path = concat!(env!("OUT_DIR"), "/generated.rs")] mod m; ``` See the changes to the reference for details on what macros are allowed; see Petrochenkov's excellent blog post [on internals](https://internals.rust-lang.org/t/macro-expansion-points-in-attributes/11455) for alternatives that were considered and rejected ("why accept no more and no less?") This has been available on nightly since 1.50 with no major issues. ## Notes ### Accepted syntax The parser accepts arbitrary Rust expressions in this position, but any expression other than a macro invocation will ultimately lead to an error because it is not expected by the built-in expression forms (e.g., `#[doc]`). Note that decorators and the like may be able to observe other expression forms. ### Expansion ordering Expansion of macro expressions in "inert" attributes occurs after decorators have executed, analogously to macro expressions appearing in the function body or other parts of decorator input. There is currently no way for decorators to accept macros in key-value position if macro expansion must be performed before the decorator executes (if the macro can simply be copied into the output for later expansion, that can work). ## Test cases - https://github.com/rust-lang/rust/blob/master/src/test/ui/attributes/key-value-expansion-on-mac.rs - https://github.com/rust-lang/rust/blob/master/src/test/rustdoc/external-doc.rs The feature has also been dogfooded extensively in the compiler and standard library: - https://github.com/rust-lang/rust/pull/83329 - https://github.com/rust-lang/rust/pull/83230 - https://github.com/rust-lang/rust/pull/82641 - https://github.com/rust-lang/rust/pull/80534 ## Implementation history - Initial proposal: https://github.com/rust-lang/rust/issues/55414#issuecomment-554005412 - Experiment to see how much code it would break: https://github.com/rust-lang/rust/pull/67121 - Preliminary work to restrict expansion that would conflict with this feature: https://github.com/rust-lang/rust/pull/77271 - Initial implementation: https://github.com/rust-lang/rust/pull/78837 - Fix for an ICE: https://github.com/rust-lang/rust/pull/80563 ## Unresolved Questions ~~https://github.com/rust-lang/rust/pull/83366#issuecomment-805180738 listed some concerns, but they have been resolved as of this final report.~~ ## Additional Information There are two workarounds that have a similar effect for `#[doc]` attributes on nightly. One is to emulate this behavior by using a limited version of this feature that was stabilized for historical reasons: ```rust macro_rules! forward_inner_docs { ($e:expr => $i:item) => { #[doc = $e] $i }; } forward_inner_docs!(include_str!("lib.rs") => struct S {}); ``` This also works for other attributes (like `#[path = concat!(...)]`). The other is to use `doc(include)`: ```rust #![feature(external_doc)] #[doc(include = "lib.rs")] struct S {} ``` The first works, but is non-trivial for people to discover, and difficult to read and maintain. The second is a strange special-case for a particular use of the macro. This generalizes it to work for any use case, not just including files. I plan to remove `doc(include)` when this is stabilized. The `forward_inner_docs` workaround will still compile without warnings, but I expect it to be used less once it's no longer necessary.
2021-05-10Auto merge of #85104 - hi-rustin:rustin-patch-typo, r=jonas-schievinkbors-1/+1
Fix typo
2021-05-09Fix typohi-rustin-1/+1
2021-05-08Rename `Parser::span_fatal_err` -> `Parser::span_err`Joshua Nelson-1/+1
The name was misleading, it wasn't actually a fatal error.
2021-05-07Improve diagnostics for functions in `struct` definitionsLeSeulArtichaut-18/+18
2021-04-12Add fast path when None delimiters are not involvedAaron Hill-0/+19
2021-04-12Fix lookahead with None-delimited groupAaron Hill-9/+13
2021-04-11Implement token-based handling of attributes during expansionAaron Hill-40/+107
This PR modifies the macro expansion infrastructure to handle attributes in a fully token-based manner. As a result: * Derives macros no longer lose spans when their input is modified by eager cfg-expansion. This is accomplished by performing eager cfg-expansion on the token stream that we pass to the derive proc-macro * Inner attributes now preserve spans in all cases, including when we have multiple inner attributes in a row. This is accomplished through the following changes: * New structs `AttrAnnotatedTokenStream` and `AttrAnnotatedTokenTree` are introduced. These are very similar to a normal `TokenTree`, but they also track the position of attributes and attribute targets within the stream. They are built when we collect tokens during parsing. An `AttrAnnotatedTokenStream` is converted to a regular `TokenStream` when we invoke a macro. * Token capturing and `LazyTokenStream` are modified to work with `AttrAnnotatedTokenStream`. A new `ReplaceRange` type is introduced, which is created during the parsing of a nested AST node to make the 'outer' AST node aware of the attributes and attribute target stored deeper in the token stream. * When we need to perform eager cfg-expansion (either due to `#[derive]` or `#[cfg_eval]`), we tokenize and reparse our target, capturing additional information about the locations of `#[cfg]` and `#[cfg_attr]` attributes at any depth within the target. This is a performance optimization, allowing us to perform less work in the typical case where captured tokens never have eager cfg-expansion run.
2021-04-09Avoid `;` -> `,` recovery and unclosed `}` recovery from being too verboseEsteban Küber-1/+3
Those two recovery attempts have a very bad interaction that causes too unnecessary output. Add a simple gate to avoid interpreting a `;` as a `,` when there are unclosed braces.
2021-03-26Always preserve `None`-delimited groups in a captured `TokenStream`Aaron Hill-5/+20
Previously, we would silently remove any `None`-delimiters when capturing a `TokenStream`, 'flattenting' them to their inner tokens. This was not normally visible, since we usually have `TokenKind::Interpolated` (which gets converted to a `None`-delimited group during macro invocation) instead of an actual `None`-delimited group. However, there are a couple of cases where this becomes visible to proc-macros: 1. A cross-crate `macro_rules!` macro has a `None`-delimited group stored in its body (as a result of being produced by another `macro_rules!` macro). The cross-crate `macro_rules!` invocation can then expand to an attribute macro invocation, which needs to be able to see the `None`-delimited group. 2. A proc-macro can invoke an attribute proc-macro with its re-collected input. If there are any nonterminals present in the input, they will get re-collected to `None`-delimited groups, which will then get captured as part of the attribute macro invocation. Both of these cases are incredibly obscure, so there hopefully won't be any breakage. This change will allow more agressive 'flattenting' of nonterminals in #82608 without losing `None`-delimited groups.
2021-03-25Avoid double-collection for expression nonterminalsAaron Hill-1/+1
2021-03-19stabilize or_patternsmark-1/+1
2021-02-27Combine HasAttrs and HasTokens into AstLikeAaron Hill-2/+2
When token-based attribute handling is implemeneted in #80689, we will need to access tokens from `HasAttrs` (to perform cfg-stripping), and we will to access attributes from `HasTokens` (to construct a `PreexpTokenStream`). This PR merges the `HasAttrs` and `HasTokens` traits into a new `AstLike` trait. The previous `HasAttrs` impls from `Vec<Attribute>` and `AttrVec` are removed - they aren't attribute targets, so the impls never really made sense.
2021-02-23Rollup merge of #81235 - reese:rw-tuple-diagnostics, r=estebankDylan DPC-1/+1
Improve suggestion for tuple struct pattern matching errors. Closes #80174 This change allows numbers to be parsed as field names when pattern matching on structs, which allows us to provide better error messages when tuple structs are matched using a struct pattern. r? ``@estebank``
2021-02-21parser: remove unneccessary wrapping of return value in parse_extern()Matthias Krüger-6/+2
2021-02-15Simplify pattern grammar by allowing nested leading vertmark-0/+1
Along the way, we also implement a handful of diagnostics improvements and fixes, particularly with respect to the special handling of `||` in place of `|` and when there are leading verts in function params, which don't allow top-level or-patterns anyway.
2021-02-13Address review commentsAaron Hill-159/+6
2021-02-13Require passing an `AttrWrapper` to `collect_tokens_trailing_token`Aaron Hill-20/+41
This is a pure refactoring split out from #80689. It represents the most invasive part of that PR, requiring changes in every caller of `parse_outer_attributes` In order to eagerly expand `#[cfg]` attributes while preserving the original `TokenStream`, we need to know the range of tokens that corresponds to every attribute target. This is accomplished by making `parse_outer_attributes` return an opaque `AttrWrapper` struct. An `AttrWrapper` must be converted to a plain `AttrVec` by passing it to `collect_tokens_trailing_token`. This makes it difficult to accidentally construct an AST node with attributes without calling `collect_tokens_trailing_token`, since AST nodes store an `AttrVec`, not an `AttrWrapper`. As a result, we now call `collect_tokens_trailing_token` for attribute targets which only support inert attributes, such as generic arguments and struct fields. Currently, the constructed `LazyTokenStream` is simply discarded. Future PRs will record the token range corresponding to the attribute target, allowing those tokens to be removed from an enclosing `collect_tokens_trailing_token` call if necessary.
2021-01-28Clone entire `TokenCursor` when collecting tokensAaron Hill-9/+1
Reverts PR #80830 Fixes taiki-e/pin-project#312 We can have an arbitrary number of `None`-delimited group frames pushed on the stack due to proc-macro invocations, which can legally be exited. Attempting to account for this would add a lot of complexity for a tiny performance gain, so let's just use the original strategy.
2021-01-24parser: Collect tokens for values in key-value attributesVadim Petrochenkov-6/+2
2021-01-23Auto merge of #80065 - b-naber:parse-angle-arg-diagnostics, r=petrochenkovbors-1/+1
Improve diagnostics when parsing angle args https://github.com/rust-lang/rust/pull/79266 introduced parsing of generic arguments in associated type constraints, this however resulted in possibly very confusing error messages in cases in which closing angle brackets were missing such as in `Vec<(u32, _, _) = vec![]`, which outputs an incorrectly parsed equality constraint error, as noted by `@cynecx.` This PR tries to provide better error messages in such cases. r? `@petrochenkov`
2021-01-22improve diagnostics for angle argsb-naber-1/+1
2021-01-22Refactor token collection to capture trailing token immediatelyAaron Hill-27/+28
2021-01-20Improve suggestion for tuple struct pattern matching errors.Reese Williams-1/+1
Currently, when a user uses a struct pattern to pattern match on a tuple struct, the errors we emit generally suggest adding fields using their field names, which are numbers. However, numbers are not valid identifiers, so the suggestions, which use the shorthand notation, are not valid syntax. This commit changes those errors to suggest using the actual tuple struct pattern syntax instead, which is a more actionable suggestion.
2021-01-20Force token collection to run when parsing nonterminalsAaron Hill-0/+20
Fixes #81007 Previously, we would fail to collect tokens in the proper place when only builtin attributes were present. As a result, we would end up with attribute tokens in the collected `TokenStream`, leading to duplication when we attempted to prepend the attributes from the AST node. We now explicitly track when token collection must be performed due to nomterminal parsing.
2021-01-13Set tokens on AST node in `collect_tokens`Aaron Hill-6/+7
A new `HasTokens` trait is introduced, which is used to move logic from the callers of `collect_tokens` into the body of `collect_tokens`. In addition to reducing duplication, this paves the way for PR #80689, which needs to perform additional logic during token collection.
2021-01-09Auto merge of #80441 - petrochenkov:kwtok, r=Aaron1011bors-2/+2
ast: Remove some indirection layers from values in key-value attributes Trying to address some perf regressions from https://github.com/rust-lang/rust/pull/78837#issuecomment-745380762.
2021-01-09ast: Remove some indirection layers from values in key-value attributesVadim Petrochenkov-2/+2
2021-01-08Use an empty `TokenCursorFrame` stack when capturing tokensAaron Hill-1/+9
We will never need to pop past our starting frame during token capturing. Using an empty stack allows us to avoid pointless heap allocations/deallocations.
2020-12-31Auto merge of #80459 - mark-i-m:or-pat-reg, r=petrochenkovbors-1/+0
Implement edition-based macro :pat feature This PR does two things: 1. Fixes the perf regression from https://github.com/rust-lang/rust/pull/80100#issuecomment-750893149 2. Implements `:pat2018` and `:pat2021` matchers, as described by `@joshtriplett` in https://github.com/rust-lang/rust/issues/54883#issuecomment-745509090 behind the feature gate `edition_macro_pat`. r? `@petrochenkov` cc `@Mark-Simulacrum`
2020-12-30Implement edition-based macro pat featuremark-1/+0
2020-12-30Fix ICE when pointing at multi bytes characterYuki Okushi-5/+1
2020-12-19implement edition-specific :pat behavior for 2015/18mark-0/+1
2020-12-12Properly capture trailing 'unglued' tokenAaron Hill-9/+58
If we try to capture the `Vec<u8>` in `Option<Vec<u8>>`, we'll need to capture a `>` token which was 'unglued' from a `>>` token. The processing of unglueing a token for parsing purposes bypasses the usual capturing infrastructure, so we currently lose the trailing `>`. As a result, we fall back to the reparsed `TokenStream`, causing us to lose spans. This commit makes token capturing keep track of a trailing 'unglued' token. Note that we don't need to care about unglueing except at the end of the captured tokens - if we capture both the first and second unglued tokens, then we'll end up capturing the full 'glued' token, which already works correctly.
2020-12-09Accept arbitrary expressions in key-value attributes at parse timeVadim Petrochenkov-9/+18
2020-12-04A slightly clearer diagnostic when misusingRyan Levick-1/+1
2020-11-28Rollup merge of #78853 - calebcartwright:fix-const-block-expr-span, r=spastorinoJonas Schievink-1/+2
rustc_parse: fix ConstBlock expr span The span for a ConstBlock expression should presumably run through the end of the block it contains and not stop at the keyword, just like is done with similar block-containing expression kinds, such as a TryBlock
2020-11-26Properly handle attributes on statementsAaron Hill-2/+17
We now collect tokens for the underlying node wrapped by `StmtKind` instead of storing tokens directly in `Stmt`. `LazyTokenStream` now supports capturing a trailing semicolon after it is initially constructed. This allows us to avoid refactoring statement parsing to wrap the parsing of the semicolon in `parse_tokens`. Attributes on item statements (e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as item attributes, not statement attributes, which is consistent with how we handle attributes on other kinds of statements. The feature-gating code is adjusted so that proc-macro attributes are still allowed on item statements on stable. Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be adjusted to support being passed `Annotatable::Stmt`.
2020-11-12rustc_parse: Remove optimization for 0-length streams in `collect_tokens`Vadim Petrochenkov-9/+5
The optimization conflates empty token streams with unknown token stream, which is at least suspicious, and doesn't affect performance because 0-length token streams are very rare.
2020-11-07fix(rustc_parse): ConstBlock expr spanCaleb Cartwright-1/+2
2020-10-31parser: Cleanup `LazyTokenStream` and avoid some clonesVadim Petrochenkov-26/+37
by using a named struct instead of a closure.
2020-10-27Auto merge of #77502 - varkor:const-generics-suggest-enclosing-braces, ↵bors-0/+1
r=petrochenkov Suggest that expressions that look like const generic arguments should be enclosed in brackets I pulled out the changes for const expressions from https://github.com/rust-lang/rust/pull/71592 (without the trait object diagnostic changes) and made some small changes; the implementation is `@estebank's.` We're also going to want to make some changes separately to account for trait objects (they result in poor diagnostics, as is evident from one of the test cases here), such as an adaption of https://github.com/rust-lang/rust/pull/72273. Fixes https://github.com/rust-lang/rust/issues/70753. r? `@petrochenkov`
2020-10-26Suggest expressions that look like const generic arguments should be ↵varkor-0/+1
enclosed in brackets Co-Authored-By: Esteban Kuber <github@kuber.com.ar>
2020-10-24Auto merge of #77255 - Aaron1011:feature/collect-attr-tokens, r=petrochenkovbors-4/+10
Unconditionally capture tokens for attributes. This allows us to avoid synthesizing tokens in `prepend_attr`, since we have the original tokens available. We still need to synthesize tokens when expanding `cfg_attr`, but this is an unavoidable consequence of the syntax of `cfg_attr` - the user does not supply the `#` and `[]` tokens that a `cfg_attr` expands to. This is based on PR https://github.com/rust-lang/rust/pull/77250 - this PR exposes a bug in the current `collect_tokens` implementation, which is fixed by the rewrite.
2020-10-22Make inline const work for half open rangesSantiago Pastorino-3/+3
2020-10-22Rename parse_const_expr to parse_const_blockSantiago Pastorino-1/+1
2020-10-22Don't create an empty `LazyTokenStream`Aaron Hill-4/+10
2020-10-21Auto merge of #77250 - Aaron1011:feature/flat-token-collection, r=petrochenkovbors-134/+131
Rewrite `collect_tokens` implementations to use a flattened buffer Instead of trying to collect tokens at each depth, we 'flatten' the stream as we go allong, pushing open/close delimiters to our buffer just like regular tokens. One capturing is complete, we reconstruct a nested `TokenTree::Delimited` structure, producing a normal `TokenStream`. The reconstructed `TokenStream` is not created immediately - instead, it is produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This closure stores a clone of the original `TokenCursor`, plus a record of the number of calls to `next()/next_desugared()`. This is sufficient to reconstruct the tokenstream seen by the callback without storing any additional state. If the tokenstream is never used (e.g. when a captured `macro_rules!` argument is never passed to a proc macro), we never actually create a `TokenStream`. This implementation has a number of advantages over the previous one: * It is significantly simpler, with no edge cases around capturing the start/end of a delimited group. * It can be easily extended to allow replacing tokens an an arbitrary 'depth' by just using `Vec::splice` at the proper position. This is important for PR #76130, which requires us to track information about attributes along with tokens. * The lazy approach to `TokenStream` construction allows us to easily parse an AST struct, and then decide after the fact whether we need a `TokenStream`. This will be useful when we start collecting tokens for `Attribute` - we can discard the `LazyTokenStream` if the parsed attribute doesn't need tokens (e.g. is a builtin attribute). The performance impact seems to be neglibile (see https://github.com/rust-lang/rust/pull/77250#issuecomment-703960604). There is a small slowdown on a few benchmarks, but it only rises above 1% for incremental builds, where it represents a larger fraction of the much smaller instruction count. There a ~1% speedup on a few other incremental benchmarks - my guess is that the speedups and slowdowns will usually cancel out in practice.