Rewrite `collect_tokens` implementations to use a flattened buffer

Instead of trying to collect tokens at each depth, we 'flatten' the stream as we go along, pushing open/close delimiters to our buffer just like regular tokens. Once capturing is complete, we reconstruct a nested `TokenTree::Delimited` structure, producing a normal `TokenStream`.

The reconstructed `TokenStream` is not created immediately - instead, it is produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This closure stores a clone of the original `TokenCursor`, plus a record of the number of calls to `next()/next_desugared()`. This is sufficient to reconstruct the tokenstream seen by the callback without storing any additional state. If the tokenstream is never used (e.g. when a captured `macro_rules!` argument is never passed to a proc macro), we never actually create a `TokenStream`.

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the start/end of a delimited group.

* It can be easily extended to allow replacing tokens at an arbitrary 'depth' by just using `Vec::splice` at the proper position. This is important for PR #76130, which requires us to track information about attributes along with tokens.

* The lazy approach to `TokenStream` construction allows us to easily parse an AST struct, and then decide after the fact whether we need a `TokenStream`. This will be useful when we start collecting tokens for `Attribute` - we can discard the `LazyTokenStream` if the parsed attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be negligible (see https://github.com/rust-lang/rust/pull/77250#issuecomment-703960604). There is a small slowdown on a few benchmarks, but it only rises above 1% for incremental builds, where it represents a larger fraction of the much smaller instruction count. There is a ~1% speedup on a few other incremental benchmarks - my guess is that the speedups and slowdowns will usually cancel out in practice.
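
To illustrate the flattening and reconstruction described in the commit message, here is a minimal sketch. The `FlatToken` and `Tree` types and the `rebuild` function are hypothetical stand-ins invented for this example; they are not the rustc types, and the real implementation may differ in its details.

```rust
// Hypothetical, simplified types for illustration only (not rustc's).
#[derive(Debug, Clone, PartialEq)]
enum FlatToken {
    Open(char),   // opening delimiter, stored like any other token
    Close(char),  // closing delimiter, stored like any other token
    Leaf(String), // any non-delimiter token
}

#[derive(Debug, Clone, PartialEq)]
enum Tree {
    Leaf(String),
    Delimited(char, Vec<Tree>),
}

/// Rebuilds a nested tree structure from a flat token buffer.
/// Panics if the delimiters in `flat` are unbalanced.
fn rebuild(flat: &[FlatToken]) -> Vec<Tree> {
    // Stack of partially built groups; the bottom entry is the top-level stream.
    let mut stack: Vec<(Option<char>, Vec<Tree>)> = vec![(None, Vec::new())];
    for tok in flat {
        match tok {
            FlatToken::Open(d) => stack.push((Some(*d), Vec::new())),
            FlatToken::Close(_) => {
                // Finish the innermost group and attach it to its parent.
                let (delim, children) = stack.pop().expect("stack is never empty here");
                let delim = delim.expect("close delimiter without matching open");
                let parent = stack.last_mut().expect("parent group exists");
                parent.1.push(Tree::Delimited(delim, children));
            }
            FlatToken::Leaf(s) => {
                let current = stack.last_mut().expect("stack is never empty here");
                current.1.push(Tree::Leaf(s.clone()));
            }
        }
    }
    let (delim, top) = stack.pop().expect("stack is never empty here");
    assert!(delim.is_none(), "unclosed delimiter");
    top
}
```

Because the buffer is flat, replacing tokens inside a nested group is just a `Vec::splice` over an index range, with no need to walk into a tree first.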
author: Aaron Hill <aa1ronham@gmail.com> 2020-09-26 21:56:29 -0400
committer: Aaron Hill <aa1ronham@gmail.com> 2020-10-19 13:59:18 -0400
commit: 593fdd3d45d7565e34dc429788fa81ca2e25a2d4 (patch)
tree: 2d3ae4c7a1cb800273757906d1464db7333ee977 /compiler/rustc_parse/src/lib.rs
parent: cb2462c53f2cc3f140c0f1ea0976261cab968a34 (diff)
download: rust-593fdd3d45d7565e34dc429788fa81ca2e25a2d4.tar.gz
rust-593fdd3d45d7565e34dc429788fa81ca2e25a2d4.zip
1 files changed, 13 insertions, 10 deletions
diff --git a/compiler/rustc_parse/src/lib.rs b/compiler/rustc_parse/src/lib.rs
index 9a187c6285e..e073f571088 100644
--- a/compiler/rustc_parse/src/lib.rs
+++ b/compiler/rustc_parse/src/lib.rs
@@ -8,7 +8,7 @@
 
 use rustc_ast as ast;
 use rustc_ast::token::{self, DelimToken, Nonterminal, Token, TokenKind};
-use rustc_ast::tokenstream::{self, TokenStream, TokenTree};
+use rustc_ast::tokenstream::{self, LazyTokenStream, TokenStream, TokenTree};
 use rustc_ast_pretty::pprust;
 use rustc_data_structures::sync::Lrc;
 use rustc_errors::{Diagnostic, FatalError, Level, PResult};
@@ -248,29 +248,32 @@ pub fn nt_to_tokenstream(nt: &Nonterminal, sess: &ParseSess, span: Span) -> Toke
     // As a result, some AST nodes are annotated with the token stream they
     // came from. Here we attempt to extract these lossless token streams
     // before we fall back to the stringification.
+
+    let convert_tokens = |tokens: Option<LazyTokenStream>| tokens.map(|t| t.into_token_stream());
+
     let tokens = match *nt {
         Nonterminal::NtItem(ref item) => {
             prepend_attrs(sess, &item.attrs, item.tokens.as_ref(), span)
         }
-        Nonterminal::NtBlock(ref block) => block.tokens.clone(),
+        Nonterminal::NtBlock(ref block) => convert_tokens(block.tokens.clone()),
         Nonterminal::NtStmt(ref stmt) => {
             // FIXME: We currently only collect tokens for `:stmt`
             // matchers in `macro_rules!` macros. When we start collecting
             // tokens for attributes on statements, we will need to prepend
             // attributes here
-            stmt.tokens.clone()
+            convert_tokens(stmt.tokens.clone())
         }
-        Nonterminal::NtPat(ref pat) => pat.tokens.clone(),
-        Nonterminal::NtTy(ref ty) => ty.tokens.clone(),
+        Nonterminal::NtPat(ref pat) => convert_tokens(pat.tokens.clone()),
+        Nonterminal::NtTy(ref ty) => convert_tokens(ty.tokens.clone()),
         Nonterminal::NtIdent(ident, is_raw) => {
             Some(tokenstream::TokenTree::token(token::Ident(ident.name, is_raw), ident.span).into())
         }
         Nonterminal::NtLifetime(ident) => {
             Some(tokenstream::TokenTree::token(token::Lifetime(ident.name), ident.span).into())
         }
-        Nonterminal::NtMeta(ref attr) => attr.tokens.clone(),
-        Nonterminal::NtPath(ref path) => path.tokens.clone(),
-        Nonterminal::NtVis(ref vis) => vis.tokens.clone(),
+        Nonterminal::NtMeta(ref attr) => convert_tokens(attr.tokens.clone()),
+        Nonterminal::NtPath(ref path) => convert_tokens(path.tokens.clone()),
+        Nonterminal::NtVis(ref vis) => convert_tokens(vis.tokens.clone()),
         Nonterminal::NtTT(ref tt) => Some(tt.clone().into()),
         Nonterminal::NtExpr(ref expr) | Nonterminal::NtLiteral(ref expr) => {
             if expr.tokens.is_none() {
@@ -602,10 +605,10 @@ fn token_probably_equal_for_proc_macro(first: &Token, other: &Token) -> bool {
 fn prepend_attrs(
     sess: &ParseSess,
     attrs: &[ast::Attribute],
-    tokens: Option<&tokenstream::TokenStream>,
+    tokens: Option<&tokenstream::LazyTokenStream>,
     span: rustc_span::Span,
 ) -> Option<tokenstream::TokenStream> {
-    let tokens = tokens?;
+    let tokens = tokens?.clone().into_token_stream();
     if attrs.is_empty() {
         return Some(tokens.clone());
     }
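
The `into_token_stream()` calls in the diff above only materialize tokens when a caller asks for them. The sketch below shows one way such a lazy wrapper could work: it keeps a clone of the starting cursor plus a call count, and replays the cursor on demand. `Cursor`, `LazyStream`, and `into_stream` are hypothetical names invented for this example, and tokens are modeled as plain strings; this is not the actual `LazyTokenStream` definition.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a cheaply cloneable token cursor.
#[derive(Clone)]
struct Cursor {
    tokens: Rc<Vec<String>>,
    pos: usize,
}

impl Cursor {
    // Returns the next token, or `None` once the input is exhausted.
    fn next(&mut self) -> Option<String> {
        let tok = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        tok
    }
}

// Hypothetical lazy wrapper: holds a closure instead of the tokens themselves.
#[derive(Clone)]
struct LazyStream(Rc<dyn Fn() -> Vec<String>>);

impl LazyStream {
    // `start` is the cursor as it was when capturing began;
    // `num_calls` is how many times `next()` was called during capture.
    fn new(start: Cursor, num_calls: usize) -> Self {
        LazyStream(Rc::new(move || {
            // Replay the same number of `next()` calls on a fresh clone.
            let mut cursor = start.clone();
            (0..num_calls).filter_map(|_| cursor.next()).collect()
        }))
    }

    // Runs the closure; no token vector exists until this is called.
    fn into_stream(self) -> Vec<String> {
        (self.0)()
    }
}
```

If the resulting `LazyStream` is dropped without calling `into_stream`, the only cost paid is the cursor clone and the counter.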