about summary refs log tree commit diff
path: root/compiler/rustc_parse/src/lexer/mod.rs
AgeCommit message (Collapse)AuthorLines
2024-04-18Rollup merge of #123752 - estebank:emoji-prefix, r=wesleywiserJubilee-1/+4
Properly handle emojis as literal prefix in macros Do not accept the following ```rust macro_rules! lexes {($($_:tt)*) => {}} lexes!(🐛"foo"); ``` Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro pre-expansion of literal prefixes. Fix #123696.
2024-04-18Simplify `static_assert_size`s.Nicholas Nethercote-1/+1
We want to run them on all 64-bit platforms.
2024-04-12Rollup merge of #123223 - estebank:issue-123079, r=pnkfelixMatthias Krüger-13/+7
Fix invalid silencing of parsing error Given ```rust macro_rules! a { ( ) => { impl<'b> c for d { e::<f'g> } }; } ``` ensure an error is emitted. Fix #123079.
2024-04-10Properly handle emojis as literal prefix in macrosEsteban Küber-1/+4
Do not accept the following ```rust macro_rules! lexes {($($_:tt)*) => {}} lexes!(🐛"foo"); ``` Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro expansion of literal prefixes. Fix #123696.
2024-04-08parser: reduce visibility of unnecessary public `UnmatchedDelim`Yutaro Ohno-2/+1
`lexer::UnmatchedDelim` struct in `rustc_parse` is unnecessary public outside of the crate. This commit reduces the visibility to `pub(crate)`. Beside, this removes unnecessary field `expected_delim` that causes warnings after changing the visibility.
2024-04-07Fix invalid silencing of parsing errorEsteban Küber-13/+7
Given ```rust macro_rules! a { ( ) => { impl<'b> c for d { e::<f'g> } }; } ``` ensure an error is emitted. Fix #123079.
2024-04-03Check `x86_64` size assertions on `aarch64`, tooZalathar-1/+1
This makes it easier for contributors on aarch64 workstations (e.g. Macs) to notice when these assertions have been violated.
2024-03-17Silence redundant error on char literal that was meant to be a string in ↵Esteban Küber-1/+10
2021 edition
2024-03-17review comment: `str` -> string in messagesEsteban Küber-1/+1
2024-03-17Handle str literals written with `'` lexed as lifetimeEsteban Küber-4/+42
Given `'hello world'` and `'1 str', provide a structured suggestion for a valid string literal: ``` error[E0762]: unterminated character literal --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26 | LL | println!('hello world'); | ^^^^ | help: if you meant to write a `str` literal, use double quotes | LL | println!("hello world"); | ~ ~ ``` ``` error[E0762]: unterminated character literal --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20 | LL | println!('1 + 1'); | ^^^^ | help: if you meant to write a `str` literal, use double quotes | LL | println!("1 + 1"); | ~ ~ ``` Fix #119685.
2024-03-05Rename `BuiltinLintDiagnostics` as `BuiltinLintDiag`.Nicholas Nethercote-3/+3
Not the dropping of the trailing `s` -- this type describes a single diagnostic and its name should be singular.
2024-03-05Rename all `ParseSess` variables/fields/lifetimes as `psess`.Nicholas Nethercote-18/+18
Existing names for values of this type are `sess`, `parse_sess`, `parse_session`, and `ps`. `sess` is particularly annoying because that's also used for `Session` values, which are often co-located, and it can be difficult to know which type a value named `sess` refers to. (That annoyance is the main motivation for this change.) `psess` is nice and short, which is good for a name used this much. The commit also renames some `parse_sess_created` values as `psess_created`.
2024-02-29Rollup merge of #121724 - nnethercote:LitKind-Err-for-floats, r=fmeaseMatthias Krüger-3/+7
Use `LitKind::Err` for malformed floats #121120 changed `StringReader::cook_lexer_literal` to return `LitKind::Err` for malformed integer literals. This commit does the same for float literals, for consistency. r? ``@fmease``
2024-02-28Use `LitKind::Err` for floats with unsupported bases.Nicholas Nethercote-1/+3
This slightly changes error messages in `float-field.rs`, but nothing of real importance.
2024-02-28Use `LitKind::Err` for floats with empty exponents.Nicholas Nethercote-2/+4
This prevents a follow-up type error in a test, which seems fine.
2024-02-28Rename `DiagnosticBuilder` as `Diag`.Nicholas Nethercote-2/+2
Much better! Note that this involves renaming (and updating the value of) `DIAGNOSTIC_BUILDER` in clippy.
2024-02-20Add newtype for raw identsclubby789-4/+4
2024-02-15Add `ErrorGuaranteed` to `ast::LitKind::Err`, `token::LitKind::Err`.Nicholas Nethercote-10/+12
This mostly works well, and eliminates a couple of delayed bugs. One annoying thing is that we should really also add an `ErrorGuaranteed` to `proc_macro::bridge::LitKind::Err`. But that's difficult because `proc_macro` doesn't have access to `ErrorGuaranteed`, so we have to fake it.
2024-02-15Make `emit_unescape_error` return `Option<ErrorGuaranteed>`.Nicholas Nethercote-8/+8
And use the result in `cook_common` to decide whether to return an error token.
2024-02-15Remove `LitError::LexerError`.Nicholas Nethercote-15/+15
`cook_lexer_literal` can emit an error about an invalid int literal but then return a non-`Err` token. And then `integer_lit` has to account for this to avoid printing a redundant error message. This commit changes `cook_lexer_literal` to return `Err` in that case. Then `integer_lit` doesn't need the special case, and `LitError::LexerError` can be removed.
2024-01-29Stop using `String` for error codes.Nicholas Nethercote-8/+8
Error codes are integers, but `String` is used everywhere to represent them. Gross! This commit introduces `ErrCode`, an integral newtype for error codes, replacing `String`. It also introduces a constant for every error code, e.g. `E0123`, and removes the `error_code!` macro. The constants are imported wherever used with `use rustc_errors::codes::*`. With the old code, we have three different ways to specify an error code at a use point: ``` error_code!(E0123) // macro call struct_span_code_err!(dcx, span, E0123, "msg"); // bare ident arg to macro call \#[diag(name, code = "E0123")] // string struct Diag; ``` With the new code, they all use the `E0123` constant. ``` E0123 // constant struct_span_code_err!(dcx, span, E0123, "msg"); // constant \#[diag(name, code = E0123)] // constant struct Diag; ``` The commit also changes the structure of the error code definitions: - `rustc_error_codes` now just defines a higher-order macro listing the used error codes and nothing else. - Because that's now the only thing in the `rustc_error_codes` crate, I moved it into the `lib.rs` file and removed the `error_codes.rs` file. - `rustc_errors` uses that macro to define everything, e.g. the error code constants and the `DIAGNOSTIC_TABLES`. This is in its new `codes.rs` file.
2024-01-25Use `unescape_unicode` for raw C string literals.Nicholas Nethercote-1/+1
They can't contain `\x` escapes, which means they can't contain high bytes, which means we can used `unescape_unicode` instead of `unescape_mixed` to unescape them. This avoids unnecessary used of `MixedUnit`.
2024-01-25Rename the unescaping functions.Nicholas Nethercote-12/+12
`unescape_literal` becomes `unescape_unicode`, and `unescape_c_string` becomes `unescape_mixed`. Because rfc3349 will mean that C string literals will no longer be the only mixed utf8 literals.
2024-01-11Stop using `DiagnosticBuilder::buffer` in the parser.Nicholas Nethercote-4/+4
One consequence is that errors returned by `maybe_new_parser_from_source_str` now must be consumed, so a bunch of places that previously ignored those errors now cancel them. (Most of them explicitly dropped the errors before. I guess that was to indicate "we are explicitly ignoring these", though I'm not 100% sure.)
2024-01-11Fix lifetimes in `StringReader`.Nicholas Nethercote-12/+12
Two different lifetimes are conflated. This doesn't matter right now, but needs to be fixed for the next commit to work. And the more descriptive lifetime names make the code easier to read.
2024-01-10Rename consuming chaining methods on `DiagnosticBuilder`.Nicholas Nethercote-6/+6
In #119606 I added them and used a `_mv` suffix, but that wasn't great. A `with_` prefix has three different existing uses. - Constructors, e.g. `Vec::with_capacity`. - Wrappers that provide an environment to execute some code, e.g. `with_session_globals`. - Consuming chaining methods, e.g. `Span::with_{lo,hi,ctxt}`. The third case is exactly what we want, so this commit changes `DiagnosticBuilder::foo_mv` to `DiagnosticBuilder::with_foo`. Thanks to @compiler-errors for the suggestion.
2024-01-08Remove `DiagnosticBuilder::delay_as_bug_without_consuming`.Nicholas Nethercote-1/+1
The existing uses are replaced in one of three ways. - In a function that also has calls to `emit`, just rearrange the code so that exactly one of `delay_as_bug` or `emit` is called on every path. - In a function returning a `DiagnosticBuilder`, use `downgrade_to_delayed_bug`. That's good enough because it will get emitted later anyway. - In `unclosed_delim_err`, one set of errors is being replaced with another set, so just cancel the original errors.
2024-01-08Remove all eight `DiagnosticBuilder::*_with_code` methods.Nicholas Nethercote-36/+37
These all have relatively low use, and can be perfectly emulated with a simpler construction method combined with `code` or `code_mv`.
2024-01-08Use chaining in `DiagnosticBuilder` construction.Nicholas Nethercote-3/+3
To avoid the use of a mutable local variable, and because it reads more nicely.
2024-01-04Inline and remove `StringReader::struct_fatal_span_char`.Nicholas Nethercote-22/+11
It has a single call site.
2024-01-03Rename some `Diagnostic` setters.Nicholas Nethercote-1/+1
`Diagnostic` has 40 methods that return `&mut Self` and could be considered setters. Four of them have a `set_` prefix. This doesn't seem necessary for a type that implements the builder pattern. This commit removes the `set_` prefixes on those four methods.
2023-12-24Remove `Session` methods that duplicate `DiagCtxt` methods.Nicholas Nethercote-8/+8
Also add some `dcx` methods to types that wrap `TyCtxt`, for easier access.
2023-12-24Remove `ParseSess` methods that duplicate `DiagCtxt` methods.Nicholas Nethercote-11/+15
Also add missing `#[track_caller]` attributes to `DiagCtxt` methods as necessary to keep tests working.
2023-12-19Add `EmitResult` associated type to `EmissionGuarantee`.Nicholas Nethercote-2/+4
This lets different error levels share the same return type from `emit_*`. - A lot of inconsistencies in the `DiagCtxt` API are removed. - `Noted` is removed. - `FatalAbort` is introduced for fatal errors (abort via `raise`), replacing the `EmissionGuarantee` impl for `!`. - `Bug` is renamed `BugAbort` (to avoid clashing with `Level::Bug` and to mirror `FatalAbort`), and modified to work in the new way with bug errors (abort via panic). - Various diagnostic creators and emitters updated to the new, better signatures. Note that `DiagCtxt::bug` no longer needs to call `panic_any`, because `emit` handles that. Also shorten the obnoxiously long `diagnostic_builder_emit_producing_guarantee` name.
2023-12-18Rename `ParseSess::span_diagnostic` as `ParseSess::dcx`.Nicholas Nethercote-10/+10
2023-12-11Add spacing information to delimiters.Nicholas Nethercote-1/+1
This is an extension of the previous commit. It means the output of something like this: ``` stringify!(let a: Vec<u32> = vec![];) ``` goes from this: ``` let a: Vec<u32> = vec![] ; ``` With this PR, it now produces this string: ``` let a: Vec<u32> = vec![]; ```
2023-12-01Auto merge of #117472 - jmillikin:stable-c-str-literals, r=Nilstriebbors-3/+0
Stabilize C string literals RFC: https://rust-lang.github.io/rfcs/3348-c-str-literal.html Tracking issue: https://github.com/rust-lang/rust/issues/105723 Documentation PR (reference manual): https://github.com/rust-lang/reference/pull/1423 # Stabilization report Stabilizes C string and raw C string literals (`c"..."` and `cr#"..."#`), which are expressions of type [`&CStr`](https://doc.rust-lang.org/stable/core/ffi/struct.CStr.html). Both new literals require Rust edition 2021 or later. ```rust const HELLO: &core::ffi::CStr = c"Hello, world!"; ``` C strings may contain any byte other than `NUL` (`b'\x00'`), and their in-memory representation is guaranteed to end with `NUL`. ## Implementation Originally implemented by PR https://github.com/rust-lang/rust/pull/108801, which was reverted due to unintentional changes to lexer behavior in Rust editions < 2021. The current implementation landed in PR https://github.com/rust-lang/rust/pull/113476, which restricts C string literals to Rust edition >= 2021. ## Resolutions to open questions from the RFC * Adding C character literals (`c'.'`) of type `c_char` is not part of this feature. * Support for `c"..."` literals does not prevent `c'.'` literals from being added in the future. * C string literals should not be blocked on making `&CStr` a thin pointer. * It's possible to declare constant expressions of type `&'static CStr` in stable Rust (as of v1.59), so C string literals are not adding additional coupling on the internal representation of `CStr`. * The unstable `concat_bytes!` macro should not accept `c"..."` literals. * C strings have two equally valid `&[u8]` representations (with or without terminal `NUL`), so allowing them to be used in `concat_bytes!` would be ambiguous. * Adding a type to represent C strings containing valid UTF-8 is not part of this feature. * Support for a hypothetical `&Utf8CStr` may be explored in the future, should such a type be added to Rust.
2023-11-21Fix `clippy::needless_borrow` in the compilerNilstrieb-3/+3
`x clippy compiler -Aclippy::all -Wclippy::needless_borrow --fix`. Then I had to remove a few unnecessary parens and muts that were exposed now.
2023-11-01Stabilize C string literalsJohn Millikin-3/+0
2023-10-30When encountering unclosed delimiters during parsing, check for diff markersEsteban Küber-6/+8
Fix #116252.
2023-08-13Remove reached_eof from ParseSessbjorn3-1/+0
It was only ever set in a function which isn't called anywhere.
2023-07-25extract common codeDeadbeef-11/+10
2023-07-23reimplement C string literalsDeadbeef-1/+25
2023-05-16Avoid `&format("...")` calls in error message code.Nicholas Nethercote-1/+1
Error message all end up passing into a function as an `impl Into<{D,Subd}iagnosticMessage>`. If an error message is creatd as `&format("...")` that means we allocate a string (in the `format!` call), then take a reference, and then clone (allocating again) the reference to produce the `{D,Subd}iagnosticMessage`, which is silly. This commit removes the leading `&` from a lot of these cases. This means the original `String` is moved into the `{D,Subd}iagnosticMessage`, avoiding the double allocations. This requires changing some function argument types from `&str` to `String` (when all arguments are `String`) or `impl Into<{D,Subd}iagnosticMessage>` (when some arguments are `String` and some are `&str`).
2023-05-05Rollup merge of #108801 - fee1-dead-contrib:c-str, r=compiler-errorsDylan DPC-3/+60
Implement RFC 3348, `c"foo"` literals RFC: https://github.com/rust-lang/rfcs/pull/3348 Tracking issue: #105723
2023-05-03Restrict `From<S>` for `{D,Subd}iagnosticMessage`.Nicholas Nethercote-2/+2
Currently a `{D,Subd}iagnosticMessage` can be created from any type that impls `Into<String>`. That includes `&str`, `String`, and `Cow<'static, str>`, which are reasonable. It also includes `&String`, which is pretty weird, and results in many places making unnecessary allocations for patterns like this: ``` self.fatal(&format!(...)) ``` This creates a string with `format!`, takes a reference, passes the reference to `fatal`, which does an `into()`, which clones the reference, doing a second allocation. Two allocations for a single string, bleh. This commit changes the `From` impls so that you can only create a `{D,Subd}iagnosticMessage` from `&str`, `String`, or `Cow<'static, str>`. This requires changing all the places that currently create one from a `&String`. Most of these are of the `&format!(...)` form described above; each one removes an unnecessary static `&`, plus an allocation when executed. There are also a few places where the existing use of `&String` was more reasonable; these now just use `clone()` at the call site. As well as making the code nicer and more efficient, this is a step towards possibly using `Cow<'static, str>` in `{D,Subd}iagnosticMessage::{Str,Eager}`. That would require changing the `From<&'a str>` impls to `From<&'static str>`, which is doable, but I'm not yet sure if it's worthwhile.
2023-05-02make cook genericDeadbeef-37/+27
2023-05-02make it semantic errorDeadbeef-0/+3
2023-05-02initial step towards implementing C string literalsDeadbeef-0/+64
2023-04-25Fix static string lintsclubby789-5/+1