rust/compiler/rustc_builtin_macros/src/source_util.rs, branch 1.90.0

Improve path segment joining.

2025-07-16T22:37:19+00:00

There are many places that join path segments with `::` to produce a string. A lot of these use `join("::")`. Many in rustdoc use `join_with_double_colon`, and a few use `.joined("..")`. One in Clippy uses `itertools::join`. A couple of them look for `kw::PathRoot` in the first segment, which can be important. This commit introduces `rustc_ast::join_path_{syms,ident}` to do the joining for everyone. `rustc_ast` is as good a location for these as any, being the earliest-running of the several crates with a `Path` type. Two functions are needed because `Ident` printing is more complex than simple `Symbol` printing. The commit also removes `join_with_double_colon`, and `estimate_item_path_byte_length` with it. There are still a handful of places that join strings with "::" that are unchanged. They are not that important: some of them are in tests, and some of them first split a path around "::" and then rejoin with "::". This fixes one test case where `{{root}}` shows up in an error message.

Introduce `ByteSymbol`.

2025-06-30T10:42:27+00:00

It's like `Symbol` but for byte strings. The interner is now used for both `Symbol` and `ByteSymbol`. E.g. if you intern `"dog"` and `b"dog"` you'll get a `Symbol` and a `ByteSymbol` with the same index and the characters will only be stored once. The motivation for this is to eliminate the `Arc`s in `ast::LitKind`, to make `ast::LitKind` impl `Copy`, and to avoid the need to arena-allocate `ast::LitKind` in HIR. The latter change reduces peak memory by a non-trivial amount on literal-heavy benchmarks such as `deep-vector` and `tuple-stress`. `Encoder`, `Decoder`, `SpanEncoder`, and `SpanDecoder` all get some changes so that they can handle normal strings and byte strings. This change does slow down compilation of programs that use `include_bytes!` on large files, because the contents of those files are now interned (hashed). This makes `include_bytes!` more similar to `include_str!`, though `include_bytes!` contents still aren't escaped, and hashing is still much cheaper than escaping.

tree-wide: parallel: Fully removed all `Lrc`, replaced with `Arc`

2025-02-03T10:25:57+00:00

Point at invalid utf-8 span on user's source code

2025-01-22T00:52:27+00:00

``` error: couldn't read `$DIR/not-utf8-bin-file.rs`: stream did not contain valid UTF-8 --> $DIR/not-utf8-2.rs:6:5 | LL | include!("not-utf8-bin-file.rs"); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | note: `[193]` is not valid utf-8 --> $DIR/not-utf8-bin-file.rs:2:14 | LL | let _ = "�|�␂!5�cc␕␂��"; | ^ = note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info) ``` When we attempt to load a Rust source code file, if there is a OS file failure we try reading the file as bytes. If that succeeds we try to turn it into UTF-8. If *that* fails, we provide additional context about *where* the file has the first invalid UTF-8 character. Fix #76869.

Re-export more `rustc_span::symbol` things from `rustc_span`.

2024-12-18T02:38:53+00:00

`rustc_span::symbol` defines some things that are re-exported from `rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some closely related things such as `Ident` and `kw`. So you can do `use rustc_span::{Symbol, sym}` but you have to do `use rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good reason. This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`, and changes many `rustc_span::symbol::` qualifiers in `compiler/` to `rustc_span::`. This is a 200+ net line of code reduction, mostly because many files with two `use rustc_span` items can be reduced to one.

Reformat using the new identifier sorting from rustfmt

2024-09-22T23:11:29+00:00

Reformat `use` declarations.

2024-07-28T22:26:52+00:00

The previous commit updated `rustfmt.toml` appropriately. This commit is the outcome of running `x fmt --all` with the new formatting options.

Migrate some rustc_builtin_macros to SessionDiagnostic

2024-06-25T10:04:21+00:00

Signed-off-by: he1pa <18012015693@163.com>

Make top-level `rustc_parse` functions fallible.

2024-06-05T00:38:03+00:00

Currently we have an awkward mix of fallible and infallible functions: ``` new_parser_from_source_str maybe_new_parser_from_source_str new_parser_from_file (maybe_new_parser_from_file) // missing (new_parser_from_source_file) // missing maybe_new_parser_from_source_file source_str_to_stream maybe_source_file_to_stream ``` We could add the two missing functions, but instead this commit removes of all the infallible ones and renames the fallible ones leaving us with these which are all fallible: ``` new_parser_from_source_str new_parser_from_file new_parser_from_source_file source_str_to_stream source_file_to_stream ``` This requires making `unwrap_or_emit_fatal` public so callers of formerly infallible functions can still work. This does make some of the call sites slightly more verbose, but I think it's worth it for the simpler API. Also, there are two `catch_unwind` calls and one `catch_fatal_errors` call in this diff that become removable thanks this change. (I will do that in a follow-up PR.)

Rename buffer_lint_with_diagnostic to buffer_lint

2024-05-21T20:16:39+00:00