| author | Nicholas Nethercote <n.nethercote@gmail.com> | 2024-08-01 06:44:39 +1000 |
|---|---|---|
| committer | Nicholas Nethercote <n.nethercote@gmail.com> | 2024-08-01 19:30:40 +1000 |
| commit | d1f05fd1848fc68ed89d17f7937e358dacd8aed4 | |
| tree | 261e4c4a06d275386cbe7a6621a3fd524eea5b4b /compiler/rustc_parse/src/parser/mod.rs | |
| parent | 9d77d17f7127def53c6a3555ae9a6a1f271a8e37 | |
Distinguish the two kinds of token range.

When collecting tokens there are two kinds of range:
- a range relative to the parser's full token stream (which we get when we are parsing);
- a range relative to a single AST node's token stream (which we use within `LazyAttrTokenStreamImpl` when replacing tokens).

These are currently both represented with `Range<u32>` and it's easy to mix them up -- until now I hadn't properly understood the difference.

This commit introduces `ParserRange` and `NodeRange` to distinguish them. This also requires splitting `ReplaceRange` in two, giving the new types `ParserReplacement` and `NodeReplacement`. (These latter two names reduce the overloading of the word "range".)

The commit also rewrites some comments to be clearer.

The end result is a little more verbose, but much clearer.

Diffstat (limited to 'compiler/rustc_parse/src/parser/mod.rs')
| -rw-r--r-- | compiler/rustc_parse/src/parser/mod.rs | 70 |
1 files changed, 50 insertions, 20 deletions
diff --git a/compiler/rustc_parse/src/parser/mod.rs b/compiler/rustc_parse/src/parser/mod.rs
index 26ee5bfdee4..722fb41cd81 100644
--- a/compiler/rustc_parse/src/parser/mod.rs
+++ b/compiler/rustc_parse/src/parser/mod.rs
@@ -192,24 +192,54 @@ struct ClosureSpans {
     body: Span,
 }
 
-/// Indicates a range of tokens that should be replaced by
-/// the tokens in the provided `AttrsTarget`. This is used in two
-/// places during token collection:
+/// A token range within a `Parser`'s full token stream.
+#[derive(Clone, Debug)]
+struct ParserRange(Range<u32>);
+
+/// A token range within an individual AST node's (lazy) token stream, i.e.
+/// relative to that node's first token. Distinct from `ParserRange` so the two
+/// kinds of range can't be mixed up.
+#[derive(Clone, Debug)]
+struct NodeRange(Range<u32>);
+
+/// Indicates a range of tokens that should be replaced by an `AttrsTarget`
+/// (replacement) or be replaced by nothing (deletion). This is used in two
+/// places during token collection.
+///
+/// 1. Replacement. During the parsing of an AST node that may have a
+///    `#[derive]` attribute, when we parse a nested AST node that has `#[cfg]`
+///    or `#[cfg_attr]`, we replace the entire inner AST node with
+///    `FlatToken::AttrsTarget`. This lets us perform eager cfg-expansion on an
+///    `AttrTokenStream`.
 ///
-/// 1. During the parsing of an AST node that may have a `#[derive]`
-/// attribute, we parse a nested AST node that has `#[cfg]` or `#[cfg_attr]`
-/// In this case, we use a `ReplaceRange` to replace the entire inner AST node
-/// with `FlatToken::AttrsTarget`, allowing us to perform eager cfg-expansion
-/// on an `AttrTokenStream`.
+/// 2. Deletion. We delete inner attributes from all collected token streams,
+///    and instead track them through the `attrs` field on the AST node. This
+///    lets us manipulate them similarly to outer attributes. When we create a
+///    `TokenStream`, the inner attributes are inserted into the proper place
+///    in the token stream.
 ///
-/// 2. When we parse an inner attribute while collecting tokens. We
-/// remove inner attributes from the token stream entirely, and
-/// instead track them through the `attrs` field on the AST node.
-/// This allows us to easily manipulate them (for example, removing
-/// the first macro inner attribute to invoke a proc-macro).
-/// When create a `TokenStream`, the inner attributes get inserted
-/// into the proper place in the token stream.
-type ReplaceRange = (Range<u32>, Option<AttrsTarget>);
+/// Each replacement starts off in `ParserReplacement` form but is converted to
+/// `NodeReplacement` form when it is attached to a single AST node, via
+/// `LazyAttrTokenStreamImpl`.
+type ParserReplacement = (ParserRange, Option<AttrsTarget>);
+
+/// See the comment on `ParserReplacement`.
+type NodeReplacement = (NodeRange, Option<AttrsTarget>);
+
+impl NodeRange {
+    // Converts a range within a parser's tokens to a range within a
+    // node's tokens beginning at `start_pos`.
+    //
+    // For example, imagine a parser with 50 tokens in its token stream, a
+    // function that spans `ParserRange(20..40)` and an inner attribute within
+    // that function that spans `ParserRange(30..35)`. We would find the inner
+    // attribute's range within the function's tokens by subtracting 20, which
+    // is the position of the function's start token. This gives
+    // `NodeRange(10..15)`.
+    fn new(ParserRange(parser_range): ParserRange, start_pos: u32) -> NodeRange {
+        NodeRange((parser_range.start - start_pos)..(parser_range.end - start_pos))
+    }
+}
 
 /// Controls how we capture tokens. Capturing can be expensive,
 /// so we try to avoid performing capturing in cases where
@@ -226,8 +256,8 @@ enum Capturing {
 #[derive(Clone, Debug)]
 struct CaptureState {
     capturing: Capturing,
-    replace_ranges: Vec<ReplaceRange>,
-    inner_attr_ranges: FxHashMap<AttrId, Range<u32>>,
+    parser_replacements: Vec<ParserReplacement>,
+    inner_attr_parser_ranges: FxHashMap<AttrId, ParserRange>,
 }
 
 /// Iterator over a `TokenStream` that produces `Token`s. It's a bit odd that
@@ -417,8 +447,8 @@ impl<'a> Parser<'a> {
             subparser_name,
             capture_state: CaptureState {
                 capturing: Capturing::No,
-                replace_ranges: Vec::new(),
-                inner_attr_ranges: Default::default(),
+                parser_replacements: Vec::new(),
+                inner_attr_parser_ranges: Default::default(),
             },
             current_closure: None,
             recovery: Recovery::Allowed,
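
The conversion described in the `NodeRange::new` comment can be tried outside the compiler. The following is a minimal standalone sketch, not the rustc code: `ParserRange`, `NodeRange`, `NodeRange::new` and the two type aliases mirror the diff, while the unit struct `AttrsTarget` and the helper `to_node_replacements` are hypothetical stand-ins added only for illustration. It reproduces the worked example from the comment (a function at `20..40`, an inner attribute at `30..35`).

```rust
use std::ops::Range;

/// A token range within the parser's full token stream.
#[derive(Clone, Debug, PartialEq)]
struct ParserRange(Range<u32>);

/// A token range relative to a single AST node's first token.
#[derive(Clone, Debug, PartialEq)]
struct NodeRange(Range<u32>);

impl NodeRange {
    /// Rebases a parser-relative range onto a node that starts at `start_pos`.
    /// Like the version in the diff, this assumes `start_pos` is not greater
    /// than the range's start; otherwise the subtraction underflows.
    fn new(ParserRange(parser_range): ParserRange, start_pos: u32) -> NodeRange {
        NodeRange((parser_range.start - start_pos)..(parser_range.end - start_pos))
    }
}

/// Hypothetical stand-in for rustc's `AttrsTarget`; the real type carries
/// attributes and tokens.
#[derive(Clone, Debug, PartialEq)]
struct AttrsTarget;

type ParserReplacement = (ParserRange, Option<AttrsTarget>);
type NodeReplacement = (NodeRange, Option<AttrsTarget>);

/// Hypothetical helper (not in the commit): rebases every replacement onto
/// the node that starts at `start_pos`.
fn to_node_replacements(
    parser_replacements: Vec<ParserReplacement>,
    start_pos: u32,
) -> Vec<NodeReplacement> {
    parser_replacements
        .into_iter()
        .map(|(range, target)| (NodeRange::new(range, start_pos), target))
        .collect()
}

fn main() {
    // The worked example from the comment on `NodeRange::new`.
    let function = ParserRange(20..40);
    let inner_attr = ParserRange(30..35);
    let start_pos = function.0.start;

    let node_range = NodeRange::new(inner_attr.clone(), start_pos);
    assert_eq!(node_range, NodeRange(10..15));

    // A deletion (`None`) and a replacement (`Some`) are rebased the same way.
    let node_replacements = to_node_replacements(
        vec![(inner_attr, None), (ParserRange(36..38), Some(AttrsTarget))],
        start_pos,
    );
    println!("{:?}", node_replacements);
}
```

Because the two ranges are separate tuple structs rather than bare `Range<u32>` values, passing a `NodeRange` where a `ParserRange` is expected (for example, calling `NodeRange::new` on an already converted range) fails to compile, which is the kind of mix-up the commit message describes.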
