| author | Aaron Hill <aa1ronham@gmail.com> | 2020-09-26 21:56:29 -0400 |
|---|---|---|
| committer | Aaron Hill <aa1ronham@gmail.com> | 2020-10-19 13:59:18 -0400 |
| commit | 593fdd3d45d7565e34dc429788fa81ca2e25a2d4 (patch) | |
| tree | 2d3ae4c7a1cb800273757906d1464db7333ee977 /compiler/rustc_parse/src/lib.rs | |
| parent | cb2462c53f2cc3f140c0f1ea0976261cab968a34 (diff) | |
Rewrite `collect_tokens` implementations to use a flattened buffer
Instead of trying to collect tokens at each depth, we 'flatten' the stream as we go along, pushing open/close delimiters to our buffer just like regular tokens. Once capturing is complete, we reconstruct a nested `TokenTree::Delimited` structure, producing a normal `TokenStream`.

The reconstructed `TokenStream` is not created immediately - instead, it is produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This closure stores a clone of the original `TokenCursor`, plus a record of the number of calls to `next()/next_desugared()`. This is sufficient to reconstruct the tokenstream seen by the callback without storing any additional state. If the tokenstream is never used (e.g. when a captured `macro_rules!` argument is never passed to a proc macro), we never actually create a `TokenStream`.

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the start/end of a delimited group.
* It can be easily extended to allow replacing tokens at an arbitrary 'depth' by just using `Vec::splice` at the proper position. This is important for PR #76130, which requires us to track information about attributes along with tokens.
* The lazy approach to `TokenStream` construction allows us to easily parse an AST struct, and then decide after the fact whether we need a `TokenStream`. This will be useful when we start collecting tokens for `Attribute` - we can discard the `LazyTokenStream` if the parsed attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be negligible (see https://github.com/rust-lang/rust/pull/77250#issuecomment-703960604). There is a small slowdown on a few benchmarks, but it only rises above 1% for incremental builds, where it represents a larger fraction of the much smaller instruction count. There is a ~1% speedup on a few other incremental benchmarks - my guess is that the speedups and slowdowns will usually cancel out in practice.
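The flatten-then-rebuild idea described in the commit message can be illustrated with a small standalone sketch. This is not the rustc implementation: `Delim`, `FlatToken`, `Tree`, `rebuild`, and `replace_tokens` are hypothetical toy names standing in for the compiler's real token types, and spans and spacing information are ignored. It only shows how a flat buffer with explicit open/close markers can be turned back into a nested structure, and how `Vec::splice` can replace tokens at any depth without walking a tree.

```rust
// Toy stand-ins (assumptions for illustration, not rustc's real types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Delim {
    Paren,
    Brace,
}

// A flat buffer entry: delimiters are stored just like ordinary tokens.
#[derive(Debug, Clone, PartialEq)]
enum FlatToken {
    Open(Delim),
    Close(Delim),
    Leaf(String),
}

// The nested form, loosely analogous to a tree with delimited groups.
#[derive(Debug, Clone, PartialEq)]
enum Tree {
    Leaf(String),
    Delimited(Delim, Vec<Tree>),
}

/// Rebuild nested trees from a flat buffer using an explicit stack of frames.
/// Returns `None` if the delimiters are unbalanced or mismatched.
fn rebuild(flat: &[FlatToken]) -> Option<Vec<Tree>> {
    // Each frame stores its opening delimiter (`None` for the top level)
    // and the trees collected inside it so far.
    let mut stack: Vec<(Option<Delim>, Vec<Tree>)> = vec![(None, Vec::new())];
    for tok in flat {
        match tok {
            FlatToken::Open(d) => stack.push((Some(*d), Vec::new())),
            FlatToken::Close(d) => {
                let (open, children) = stack.pop()?;
                if open != Some(*d) {
                    return None; // stray or mismatched closing delimiter
                }
                stack.last_mut()?.1.push(Tree::Delimited(*d, children));
            }
            FlatToken::Leaf(s) => stack.last_mut()?.1.push(Tree::Leaf(s.clone())),
        }
    }
    let (open, trees) = stack.pop()?;
    if open.is_some() {
        return None; // a group was opened but never closed
    }
    Some(trees)
}

/// Replace the flat tokens in `range`, whatever their nesting depth.
fn replace_tokens(flat: &mut Vec<FlatToken>, range: std::ops::Range<usize>, new: Vec<FlatToken>) {
    drop(flat.splice(range, new));
}
```

Because the buffer is flat, a replacement inside a delimited group is an index-based edit on the `Vec`; the nesting only reappears when `rebuild` is called afterwards.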
Diffstat (limited to 'compiler/rustc_parse/src/lib.rs')
| -rw-r--r-- | compiler/rustc_parse/src/lib.rs | 23 |
1 files changed, 13 insertions, 10 deletions
diff --git a/compiler/rustc_parse/src/lib.rs b/compiler/rustc_parse/src/lib.rs
index 9a187c6285e..e073f571088 100644
--- a/compiler/rustc_parse/src/lib.rs
+++ b/compiler/rustc_parse/src/lib.rs
@@ -8,7 +8,7 @@
 
 use rustc_ast as ast;
 use rustc_ast::token::{self, DelimToken, Nonterminal, Token, TokenKind};
-use rustc_ast::tokenstream::{self, TokenStream, TokenTree};
+use rustc_ast::tokenstream::{self, LazyTokenStream, TokenStream, TokenTree};
 use rustc_ast_pretty::pprust;
 use rustc_data_structures::sync::Lrc;
 use rustc_errors::{Diagnostic, FatalError, Level, PResult};
@@ -248,29 +248,32 @@ pub fn nt_to_tokenstream(nt: &Nonterminal, sess: &ParseSess, span: Span) -> Toke
     // As a result, some AST nodes are annotated with the token stream they
     // came from. Here we attempt to extract these lossless token streams
     // before we fall back to the stringification.
+
+    let convert_tokens = |tokens: Option<LazyTokenStream>| tokens.map(|t| t.into_token_stream());
+
     let tokens = match *nt {
         Nonterminal::NtItem(ref item) => {
             prepend_attrs(sess, &item.attrs, item.tokens.as_ref(), span)
         }
-        Nonterminal::NtBlock(ref block) => block.tokens.clone(),
+        Nonterminal::NtBlock(ref block) => convert_tokens(block.tokens.clone()),
         Nonterminal::NtStmt(ref stmt) => {
             // FIXME: We currently only collect tokens for `:stmt`
             // matchers in `macro_rules!` macros. When we start collecting
             // tokens for attributes on statements, we will need to prepend
             // attributes here
-            stmt.tokens.clone()
+            convert_tokens(stmt.tokens.clone())
         }
-        Nonterminal::NtPat(ref pat) => pat.tokens.clone(),
-        Nonterminal::NtTy(ref ty) => ty.tokens.clone(),
+        Nonterminal::NtPat(ref pat) => convert_tokens(pat.tokens.clone()),
+        Nonterminal::NtTy(ref ty) => convert_tokens(ty.tokens.clone()),
         Nonterminal::NtIdent(ident, is_raw) => {
             Some(tokenstream::TokenTree::token(token::Ident(ident.name, is_raw), ident.span).into())
         }
         Nonterminal::NtLifetime(ident) => {
             Some(tokenstream::TokenTree::token(token::Lifetime(ident.name), ident.span).into())
         }
-        Nonterminal::NtMeta(ref attr) => attr.tokens.clone(),
-        Nonterminal::NtPath(ref path) => path.tokens.clone(),
-        Nonterminal::NtVis(ref vis) => vis.tokens.clone(),
+        Nonterminal::NtMeta(ref attr) => convert_tokens(attr.tokens.clone()),
+        Nonterminal::NtPath(ref path) => convert_tokens(path.tokens.clone()),
+        Nonterminal::NtVis(ref vis) => convert_tokens(vis.tokens.clone()),
         Nonterminal::NtTT(ref tt) => Some(tt.clone().into()),
         Nonterminal::NtExpr(ref expr) | Nonterminal::NtLiteral(ref expr) => {
             if expr.tokens.is_none() {
@@ -602,10 +605,10 @@ fn token_probably_equal_for_proc_macro(first: &Token, other: &Token) -> bool {
 fn prepend_attrs(
     sess: &ParseSess,
     attrs: &[ast::Attribute],
-    tokens: Option<&tokenstream::TokenStream>,
+    tokens: Option<&tokenstream::LazyTokenStream>,
     span: rustc_span::Span,
 ) -> Option<tokenstream::TokenStream> {
-    let tokens = tokens?;
+    let tokens = tokens?.clone().into_token_stream();
     if attrs.is_empty() {
         return Some(tokens.clone());
     }
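The `convert_tokens` closure in the diff forces an optional lazy stream only at the point where a concrete stream is needed. The following standalone sketch mirrors that pattern under stated assumptions: `Cursor`, `LazyTokens`, `capture`, and `convert` are hypothetical toy names, tokens are plain `String`s, and the "number of calls" replay is a simplification of what the commit message describes for `TokenCursor`. It is meant to show the shape of the technique, not the compiler's actual `LazyTokenStream`.

```rust
use std::rc::Rc;

// Toy cursor over a shared token list (an assumption, not rustc's `TokenCursor`).
#[derive(Clone)]
struct Cursor {
    tokens: Rc<Vec<String>>,
    pos: usize,
}

impl Cursor {
    // Advance by one token, returning `None` once the input is exhausted.
    fn next_token(&mut self) -> Option<String> {
        let tok = self.tokens.get(self.pos).cloned();
        if tok.is_some() {
            self.pos += 1;
        }
        tok
    }
}

// A lazily produced token list: nothing is built until `into_tokens` runs.
#[derive(Clone)]
struct LazyTokens(Rc<dyn Fn() -> Vec<String>>);

impl LazyTokens {
    fn new(f: impl Fn() -> Vec<String> + 'static) -> Self {
        LazyTokens(Rc::new(f))
    }

    fn into_tokens(self) -> Vec<String> {
        (self.0)()
    }
}

/// Record a snapshot of the cursor and how many tokens were consumed;
/// the closure replays exactly that many `next_token` calls on demand.
fn capture(start: Cursor, num_calls: usize) -> LazyTokens {
    LazyTokens::new(move || {
        let mut cursor = start.clone();
        (0..num_calls).filter_map(|_| cursor.next_token()).collect()
    })
}

/// Same shape as `convert_tokens` in the diff: force the lazy value if present.
fn convert(tokens: Option<LazyTokens>) -> Option<Vec<String>> {
    tokens.map(|t| t.into_tokens())
}
```

If the `LazyTokens` value is dropped without calling `into_tokens`, the replay never runs, which is the cost saving the commit message attributes to the lazy approach.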
