path: root/compiler/rustc_parse/src/lexer
Age | Commit message | Author | Lines
2024-12-18 | Re-export more `rustc_span::symbol` things from `rustc_span`. | Nicholas Nethercote | -4/+2
`rustc_span::symbol` defines some things that are re-exported from `rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some closely related things such as `Ident` and `kw`. So you can do `use rustc_span::{Symbol, sym}` but you have to do `use rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good reason. This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`, and changes many `rustc_span::symbol::` qualifiers in `compiler/` to `rustc_span::`. This is a net reduction of 200+ lines of code, mostly because many files with two `use rustc_span` items can be reduced to one.
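The re-export pattern described above can be sketched with a small stand-in crate layout. This is only an illustration: the `span` module, the `FN` and `MAIN` constants, and the helper functions are hypothetical and do not mirror the real `rustc_span` definitions.

```rust
// Hypothetical stand-in for a crate like `rustc_span`; not the real crate.
pub mod span {
    pub mod symbol {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub struct Symbol(pub u32);

        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub struct Ident {
            pub name: Symbol,
        }

        // Keyword symbols (illustrative constant, not a real rustc name).
        pub mod kw {
            use super::Symbol;
            pub const FN: Symbol = Symbol(0);
        }

        // Other pre-interned symbols (illustrative constant).
        pub mod sym {
            use super::Symbol;
            pub const MAIN: Symbol = Symbol(1);
        }
    }

    // Re-export the closely related items together, so callers need
    // only one `use span::{...}` line instead of two.
    pub use self::symbol::{kw, sym, Ident, Symbol};
}

// A single import now covers everything.
use span::{kw, sym, Ident, Symbol};

pub fn keyword_ident() -> Ident {
    Ident { name: kw::FN }
}

pub fn is_main(s: Symbol) -> bool {
    s == sym::MAIN
}
```

The longer path (`span::symbol::Ident` here) keeps working after the re-export, so call sites can be migrated gradually.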
2024-12-13 | Remove `Lexer`'s dependency on `Parser`. | Nicholas Nethercote | -97/+37
Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`. But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.
This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form. The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).
In my opinion this cost is outweighed by the magnitude of the code cleanup.
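As a rough illustration of the layering described above, the following sketch lexes text into token trees and reports unclosed delimiters without involving any parser. It is a toy under stated assumptions, not rustc's implementation: `SketchLexer`, `TokenTree`, `lex`, and `count_groups` are hypothetical names, and every non-delimiter character is treated as its own token.

```rust
use std::str::Chars;

#[derive(Debug, PartialEq)]
pub enum TokenTree {
    Token(char),
    // Opening delimiter plus the trees nested inside it.
    Delimited(char, Vec<TokenTree>),
}

// Hypothetical toy lexer; it knows nothing about any parser type.
pub struct SketchLexer<'a> {
    chars: Chars<'a>,
}

fn closing_for(open: char) -> Option<char> {
    match open {
        '(' => Some(')'),
        '[' => Some(']'),
        '{' => Some('}'),
        _ => None,
    }
}

impl<'a> SketchLexer<'a> {
    pub fn new(src: &'a str) -> Self {
        SketchLexer { chars: src.chars() }
    }

    // Lex trees until `close` is seen, or until end of input when `close` is `None`.
    pub fn lex_token_trees(&mut self, close: Option<char>) -> Result<Vec<TokenTree>, String> {
        let mut trees = Vec::new();
        loop {
            let c = match self.chars.next() {
                Some(c) => c,
                None => {
                    return match close {
                        None => Ok(trees),
                        Some(end) => Err(format!("unclosed delimiter: expected `{end}`")),
                    };
                }
            };
            if Some(c) == close {
                return Ok(trees);
            }
            if c.is_whitespace() {
                continue;
            }
            if let Some(end) = closing_for(c) {
                let inner = self.lex_token_trees(Some(end))?;
                trees.push(TokenTree::Delimited(c, inner));
            } else if matches!(c, ')' | ']' | '}') {
                return Err(format!("unexpected closing delimiter `{c}`"));
            } else {
                trees.push(TokenTree::Token(c));
            }
        }
    }
}

pub fn lex(src: &str) -> Result<Vec<TokenTree>, String> {
    SketchLexer::new(src).lex_token_trees(None)
}

// A downstream consumer (standing in for a parser) only sees finished token trees.
pub fn count_groups(trees: &[TokenTree]) -> usize {
    trees
        .iter()
        .map(|t| match t {
            TokenTree::Token(_) => 0,
            TokenTree::Delimited(_, inner) => 1 + count_groups(inner),
        })
        .sum()
}
```

The dependency points one way only: the consumer takes the lexer's output, and the lexer's error path is a plain `Result` rather than a detour through the later stage.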
2024-12-12 | Remove `PErr`. | Nicholas Nethercote | -7/+7
It's just a synonym for `Diag` that adds no value and is only used in a few places.
2024-12-09 | Fix typo in RFC mention 3598 -> 3593 | Esteban Küber | -1/+1
https://github.com/rust-lang/rfcs/blob/master/text/3593-unprefixed-guarded-strings.md
2024-12-01 | Only error raw lifetime followed by \' in edition 2021+ | Michael Goulet | -2/+21
2024-11-28 | Rollup merge of #133487 - pitaj:reserve-guarded-strings, r=fee1-dead | Guillaume Gomez | -6/+10
fix confusing diagnostic for reserved `##`
Closes #131615
2024-11-25 | fix confusing diagnostic for reserved `##` | Peter Jaszkowiak | -6/+10
2024-11-25 | Streamline `lex_token_trees` error handling. | Nicholas Nethercote | -20/+14
- Use iterators instead of `for` loops.
- Use `if`/`else` instead of `match`.
2024-11-25 | Fix some formatting. | Nicholas Nethercote | -5/+15
Must be one of those cases where the function is too long and rustfmt bails out.
2024-11-25 | Split `Lexer::bump`. | Nicholas Nethercote | -7/+27
It has two different ways of being called.
2024-11-25 | Merge `TokenTreesReader` into `StringReader`. | Nicholas Nethercote | -49/+31
There is a not-very-useful layering in the lexer, where `TokenTreesReader` contains a `StringReader`. This commit combines them and names the result `Lexer`, which is a more obvious name for it. The methods of `Lexer` are now split across `mod.rs` and `tokentrees.rs` which isn't ideal, but it doesn't seem worth moving a bunch of code to avoid it.
2024-11-21 | Prepare for invisible delimiters. | Nicholas Nethercote | -4/+12
Current places where `Interpolated` is used are going to change to instead use invisible delimiters. This prepares for that.
- It adds invisible delimiter cases to the `can_begin_*`/`may_be_*` methods and the `failed_to_match_macro` that are equivalent to the existing `Interpolated` cases.
- It adds panics/asserts in some places where invisible delimiters should never occur.
- In `Parser::parse_struct_fields` it excludes an ident + invisible delimiter from special consideration in an error message, because that's quite different to an ident + paren/brace/bracket.
2024-11-19 | Remove `TokenKind::InvalidPrefix`. | Nicholas Nethercote | -3/+2
It was added in #123752 to handle some cases involving emoji, but it isn't necessary because it's always treated the same as `TokenKind::InvalidIdent`. This commit removes it, which makes things a little simpler.
2024-10-30 | Enforce that raw lifetime identifiers must be valid raw identifiers | Michael Goulet | -4/+10
2024-10-23 | "innermost", "outermost", "leftmost", and "rightmost" don't need hyphens | Josh Triplett | -1/+1
These are all standard dictionary words and don't require hyphenation.
2024-10-08 | Reserve guarded string literals (RFC 3593) | Peter Jaszkowiak | -1/+83
2024-09-22 | Reformat using the new identifier sorting from rustfmt | Michael Goulet | -5/+5
2024-09-17 | Store raw ident span for raw lifetime | Michael Goulet | -0/+3
2024-09-09 | Remove needless returns detected by clippy in the compiler | Eduardo Sánchez Muñoz | -1/+1
2024-09-06 | Add some more tests | Michael Goulet | -1/+1
2024-09-06 | Add initial support for raw lifetimes | Michael Goulet | -3/+36
2024-09-06 | Format lexer | Michael Goulet | -19/+22
2024-09-06 | Reserve prefix lifetimes too | Michael Goulet | -0/+10
2024-08-14 | Use `impl PartialEq<TokenKind> for Token` more. | Nicholas Nethercote | -1/+1
This lets us compare a `Token` with a `TokenKind`. It's used a lot, but can be used even more, avoiding the need for some `.kind` uses.
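A minimal sketch of the comparison this enables, using simplified stand-in types. The `Token`, `TokenKind`, and `Span` definitions below are hypothetical reductions for illustration, not the real rustc types, and `is_separator` is an invented helper.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenKind {
    Comma,
    Semi,
    Eof,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub lo: u32,
    pub hi: u32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
}

// Allows `token == TokenKind::Comma`, ignoring the span.
impl PartialEq<TokenKind> for Token {
    fn eq(&self, rhs: &TokenKind) -> bool {
        self.kind == *rhs
    }
}

pub fn is_separator(tok: &Token) -> bool {
    // Without the impl above this would read `tok.kind == TokenKind::Comma`.
    *tok == TokenKind::Comma || *tok == TokenKind::Semi
}
```

The impl only goes one way: `TokenKind::Comma == token` would need a separate `impl PartialEq<Token> for TokenKind`.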
2024-07-30 | Auto merge of #127955 - chenyukang:yukang-fix-mismatched-delimiter-issue-12786, r=nnethercote | bors | -3/+18
Add limit for unclosed delimiters in lexer diagnostic
Fixes #127868
The first commit shows the original diagnostic, and the second commit shows the changes.
2024-07-29 | Reformat `use` declarations. | Nicholas Nethercote | -19/+23
The previous commit updated `rustfmt.toml` appropriately. This commit is the outcome of running `x fmt --all` with the new formatting options.
2024-07-25 | add limit for unclosed delimiters in lexer diagnostic | yukang | -3/+18
2024-06-18 | Use a dedicated type instead of a reference for the diagnostic context | Oli Scherer | -4/+4
This paves the way for tracking more state (e.g. error tainting) in the diagnostic context handle
2024-06-18 | Prefer `dcx` methods over fields or fields' methods | Oli Scherer | -8/+7
2024-06-05 | Remove `stream_to_parser`. | Nicholas Nethercote | -1/+2
It's a zero-value wrapper of `Parser::new`.
2024-06-05 | Don't use the word "parse" for lexing operations. | Nicholas Nethercote | -27/+24
Lexing converts source text into a token stream. Parsing converts a token stream into AST fragments. This commit renames several lexing operations that have "parse" in the name. I think these names have been subtly confusing me for years. This is just a `s/parse/lex/` on function names, with one exception: `parse_stream_from_source_str` becomes `source_str_to_stream`, to make it consistent with the existing `source_file_to_stream`. The commit also moves that function's location in the file to be just above `source_file_to_stream`. The commit also cleans up a few comments along the way.
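The distinction between the two stages can be sketched on a toy language of integer sums. Everything here is an assumption made for illustration: `lex_source`, `parse_expr`, `Token`, and `Expr` are hypothetical names unrelated to the functions renamed in this commit, and very long digit runs are not guarded against overflow.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Num(i64),
    Plus,
}

#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Lexing: source text -> token stream. No grammar knowledge needed.
pub fn lex_source(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '+' {
            chars.next();
            tokens.push(Token::Plus);
        } else if c.is_ascii_digit() {
            let mut n = 0i64;
            // Consume the whole run of digits as one number token.
            while let Some(d) = chars.peek().and_then(|c| c.to_digit(10)) {
                n = n * 10 + i64::from(d);
                chars.next();
            }
            tokens.push(Token::Num(n));
        } else {
            return Err(format!("unknown character `{c}`"));
        }
    }
    Ok(tokens)
}

// Parsing: token stream -> AST fragment (a left-associative sum).
pub fn parse_expr(tokens: &[Token]) -> Result<Expr, String> {
    let mut iter = tokens.iter();
    let mut expr = match iter.next() {
        Some(Token::Num(n)) => Expr::Num(*n),
        other => return Err(format!("expected number, found {other:?}")),
    };
    while let Some(tok) = iter.next() {
        if *tok != Token::Plus {
            return Err(format!("expected `+`, found {tok:?}"));
        }
        match iter.next() {
            Some(Token::Num(n)) => {
                expr = Expr::Add(Box::new(expr), Box::new(Expr::Num(*n)));
            }
            other => return Err(format!("expected number, found {other:?}")),
        }
    }
    Ok(expr)
}
```

Naming each function after the stage it belongs to makes it clear from the call site whether text or tokens are being consumed.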
2024-06-05 | `UNICODE_ARRAY` and `ASCII_ARRAY` fixes. | Nicholas Nethercote | -37/+38
- Avoid unnecessary escaping of single quotes within string literals.
- Add a missing blank line between two `UNICODE_ARRAY` sections.
2024-05-23 | Remove `#[macro_use] extern crate tracing` from `rustc_parse`. | Nicholas Nethercote | -0/+2
2024-05-21 | Rename buffer_lint_with_diagnostic to buffer_lint | Xiretza | -2/+2
2024-05-21 | Generate lint diagnostic message from BuiltinLintDiag | Xiretza | -3/+1
Translation of the lint message happens when the actual diagnostic is created, not when the lint is buffered. Generating the message from BuiltinLintDiag ensures that all required data to construct the message is preserved in the LintBuffer, eventually allowing the messages to be moved to fluent. Remove the `msg` field from BufferedEarlyLint, it is either generated from the data in the BuiltinLintDiag or stored inside BuiltinLintDiag::Normal.
2024-05-17 | Clarify that the diff_marker is talking about version control system conflicts specifically and a few more improvements. | ardi | -1/+1
2024-05-07 | narrow down visibilities in `rustc_parse::lexer` | Lin Yihai | -6/+6
2024-04-18 | Rollup merge of #123752 - estebank:emoji-prefix, r=wesleywiser | Jubilee | -1/+4
Properly handle emojis as literal prefix in macros
Do not accept the following
```rust
macro_rules! lexes {($($_:tt)*) => {}}

lexes!(🐛"foo");
```
Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro pre-expansion of literal prefixes.
Fix #123696.
2024-04-18 | Simplify `static_assert_size`s. | Nicholas Nethercote | -1/+1
We want to run them on all 64-bit platforms.
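For readers unfamiliar with the idea, a compile-time size assertion gated on pointer width can be sketched as follows. This is an assumption-laden illustration: the macro body shown is one common way to write such a check and is not claimed to match the compiler's own definition, and `Node` is a hypothetical type.

```rust
// Fails to compile if `size_of::<$ty>()` differs from `$size`,
// because the two array types would then not match.
macro_rules! static_assert_size {
    ($ty:ty, $size:expr) => {
        const _: [(); $size] = [(); ::std::mem::size_of::<$ty>()];
    };
}

// Hypothetical type whose size depends on pointer width.
pub struct Node {
    pub next: Option<Box<Node>>, // pointer-sized thanks to the null niche
    pub id: u32,
}

// Gate on pointer width rather than on a single architecture, so the
// check runs on every 64-bit target (x86_64, aarch64, ...).
#[cfg(target_pointer_width = "64")]
static_assert_size!(Node, 16);
```

Gating on `target_pointer_width` instead of a list of architectures means a contributor on any 64-bit machine sees the failure as soon as a type grows.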
2024-04-12 | Rollup merge of #123223 - estebank:issue-123079, r=pnkfelix | Matthias Krüger | -13/+7
Fix invalid silencing of parsing error
Given
```rust
macro_rules! a {
    ( ) => {
        impl<'b> c for d {
            e::<f'g>
        }
    };
}
```
ensure an error is emitted.
Fix #123079.
2024-04-10 | Properly handle emojis as literal prefix in macros | Esteban Küber | -1/+4
Do not accept the following
```rust
macro_rules! lexes {($($_:tt)*) => {}}

lexes!(🐛"foo");
```
Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro expansion of literal prefixes.
Fix #123696.
2024-04-08 | parser: reduce visibility of unnecessary public `UnmatchedDelim` | Yutaro Ohno | -5/+2
The `lexer::UnmatchedDelim` struct in `rustc_parse` is unnecessarily public outside of the crate. This commit reduces its visibility to `pub(crate)`. It also removes the unnecessary field `expected_delim`, which causes warnings after the visibility change.
2024-04-07 | Fix invalid silencing of parsing error | Esteban Küber | -13/+7
Given
```rust
macro_rules! a {
    ( ) => {
        impl<'b> c for d {
            e::<f'g>
        }
    };
}
```
ensure an error is emitted.
Fix #123079.
2024-04-03 | Check `x86_64` size assertions on `aarch64`, too | Zalathar | -1/+1
This makes it easier for contributors on aarch64 workstations (e.g. Macs) to notice when these assertions have been violated.
2024-03-17 | fix rustdoc test | Esteban Küber | -1/+1
2024-03-17 | Silence redundant error on char literal that was meant to be a string in 2021 edition | Esteban Küber | -1/+10
2024-03-17 | review comment: `str` -> string in messages | Esteban Küber | -1/+1
2024-03-17 | Use shorter span for existing `'` -> `"` structured suggestion | Esteban Küber | -5/+15
2024-03-17 | Handle str literals written with `'` lexed as lifetime | Esteban Küber | -4/+42
Given `'hello world'` and `'1 str'`, provide a structured suggestion for a valid string literal:
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
```
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~
```
Fix #119685.
2024-03-05 | Rename `BuiltinLintDiagnostics` as `BuiltinLintDiag`. | Nicholas Nethercote | -3/+3
Note the dropping of the trailing `s`: this type describes a single diagnostic and its name should be singular.