| Age | Commit message (Collapse) | Author | Lines |
|
Reject invalid literal suffixes in tuple indexing, tuple struct indexing, and struct field name position
Tracking issue: rust-lang/rust#60210
Closes rust-lang/rust#60210
## Summary
Bump the ["suffixes on a tuple index are invalid" non-lint pseudo future-incompatibility warning (#60210)][issue-60210][^non-lint] to a **hard error** across all editions, rejecting the remaining carve outs from accidentally accepted invalid suffixes since Rust **1.27**.
- We accidentally accepted invalid suffixes in tuple indexing positions in Rust **1.27**. Originally reported at https://github.com/rust-lang/rust/issues/59418.
- We tried to hard reject all invalid suffixes in https://github.com/rust-lang/rust/pull/59421, but unfortunately it turns out there were proc macros accidentally relying on it: https://github.com/rust-lang/rust/issues/60138.
- We temporarily accepted `{i,u}{32,size}` in https://github.com/rust-lang/rust/pull/60186 (the "*carve outs*") to mitigate *immediate* ecosystem impact, but it came with an FCW warning indicating that we wanted to reject it after a few Rust releases.
- Now (1.89.0) is a few Rust releases later (1.35.0), thus I'm proposing to **also reject the carve outs**.
- `std::mem::offset_of!` stabilized in Rust **1.77.0** happens to use the same "don't expect suffix" code path which has the carve outs, so it also accepted the carve out suffixes. I'm proposing to **reject this case as well**.
## What specifically breaks?
Code that still relied on invalid `{i,u}{32,size}` suffixes being temporarily accepted by rust-lang/rust#60186 as an ecosystem impact mitigation measure (cf. rust-lang/rust#60138). Specifically, the following cases (particularly the construction of these forms in proc macros like reported in rust-lang/rust#60138):
### Position 1: Invalid `{i,u}{32,size}` suffixes in tuple indexing
```rs
fn main() {
let _x = (42,).0invalid; // Already error, already rejected by #59421
let _x = (42,).0i8; // Already error, not one of the #60186 carve outs.
let _x = (42,).0usize; // warning: suffixes on a tuple index are invalid
}
```
### Position 2: Invalid `{i,u}{32,size}` suffixes in tuple struct indexing
```rs
fn main() {
struct X(i32);
let _x = X(42);
let _x = _x.0invalid; // Already error, already rejected by #59421
let _x = _x.0i8; // Already error, not one of the #60186 carve outs.
let _x = _x.0usize; // warning: suffixes on a tuple index are invalid
}
```
### Position 3: Invalid `{i,u}{32,size}` suffixes in numeric struct field names
```rs
fn main() {
struct X(i32, i32, i32);
let _x = X(1, 2, 3);
let _y = X { 0usize: 42, 1: 42, 2: 42 }; // warning: suffixes on a tuple index are invalid
match _x {
X { 0usize: 1, 1: 2, 2: 3 } => todo!(), // warning: suffixes on a tuple index are invalid
_ => {}
}
}
```
### Position 4: Invalid `{i,u}{32,size}` suffixes in `std::mem::offset_of!`
While investigating the warning, unfortunately I noticed `std::mem::offset_of!` also happens to use the "expect no suffix" code path which had the carve outs. So this was accepted since Rust **1.77.0** with the same FCW:
```rs
fn main() {
#[repr(C)]
pub struct Struct<T>(u8, T);
assert_eq!(std::mem::offset_of!(Struct<u32>, 0usize), 0);
//~^ WARN suffixes on a tuple index are invalid
}
```
### The above forms in proc macros
For instance, constructions like (see tracking issue rust-lang/rust#60210):
```rs
let i = 0;
quote! { foo.$i }
```
where the user needs to actually write
```rs
let i = syn::Index::from(0);
quote! { foo.$i }
```
### Crater results
Conducted a crater run (https://github.com/rust-lang/rust/pull/145463#issuecomment-3194920383).
- https://github.com/AmlingPalantir/r4/tree/256af3c72f094b298cd442097ef7c571d8001f29: genuine regression; "invalid suffix `usize`" in derive macro. Has a ton of other build warnings, last updated 6 years ago.
- Exactly the kind of intended breakage. Minimized down to https://github.com/AmlingPalantir/r4/blob/256af3c72f094b298cd442097ef7c571d8001f29/validates_derive/src/lib.rs#L71-L75, where when interpolation uses `quote`'s `ToTokens` on a `usize` index (i.e. on tuple struct `Tup(())`), the generated suffix becomes `.0usize` (cf. Position 2).
- Notified crate author of breakage in https://github.com/AmlingPalantir/r4/issues/1.
- Other failures are unrelated or spurious.
## Review remarks
- Commits 1-3 expands the test coverage to better reflect the current situation before doing any functional changes.
- Commit 4 is an intentional **breaking change**. We bump the non-lint "suffixes on a tuple index are invalid" warning into a hard error. Thus, this will need a crater run and a T-lang FCP.
## Tasks
- [x] Run crater to check if anyone is still relying on this being not a hard error. Determine degree of ecosystem breakage.
- [x] If degree of breakage seems acceptable, draft nomination report for T-lang for FCP.
- [x] Determine hard error on Edition 2024+, or on all editions.
## Accompanying Reference update
- https://github.com/rust-lang/reference/pull/1966
[^non-lint]: The FCW was implemented as a *non-lint* warning (meaning it has no associated lint name, and you can't `#![deny(..)]` it) because spans coming from proc macros could not be distinguished from regular field access. This warning was also intentionally impossible to silence. See https://github.com/rust-lang/rust/pull/60186#issuecomment-485581694.
[issue-60210]: https://github.com/rust-lang/rust/issues/60210
|
|
|
|
`TokenKind` now impls `Copy`, so we can store it directly rather than a
reference.
|
|
|
|
r=petrochenkov
Sometimes skip over tokens in `parse_token_tree`.
r? `@petrochenkov`
|
|
|
|
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
|
|
This sometimes avoids a lot of `bump` calls.
|
|
|
|
make `cfg_select` a builtin macro
tracking issue: https://github.com/rust-lang/rust/issues/115585
This parses mostly the same as the `macro cfg_select` version, except:
1. wrapping in double brackets is no longer supported (or needed): `cfg_select {{ /* ... */ }}` is now rejected.
2. in an expression context, the rhs is no longer wrapped in a block, so that this now works:
```rust
fn main() {
println!(cfg_select! {
unix => { "foo" }
_ => { "bar" }
});
}
```
3. a single wildcard rule is now supported: `cfg_select { _ => 1 }` now works
I've also added an error if none of the rules evaluate to true, and warnings for any arms that follow the `_` wildcard rule.
cc `@traviscross` if I'm missing any feature that should/should not be included
r? `@petrochenkov` for the macro logic details
|
|
|
|
|
|
|
|
|
|
Parser: Document restrictions
I had trouble easily understanding what these various flags do. This is my attempt to try to explain what these do.
|
|
It's no longer necessary after the removal of nonterminal tokens in #124141.
|
|
I had trouble easily understanding what these various flags do. This is
my attempt to try to explain what these do.
|
|
Specifically: `TokenCursor`, `TokenTreeCursor`,
`LazyAttrTokenStreamImpl`, `FlatToken`, `make_attr_token_stream`,
`ParserRange`, `NodeRange`. `ParserReplacement`, and `NodeReplacement`.
These are all related to token streams, rather than actual parsing.
This will facilitate the simplifications in the next commit.
|
|
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
|
|
By replacing them with `{Open,Close}{Param,Brace,Bracket,Invisible}`.
PR #137902 made `ast::TokenKind` more like `lexer::TokenKind` by
replacing the compound `BinOp{,Eq}(BinOpToken)` variants with fieldless
variants `Plus`, `Minus`, `Star`, etc. This commit does a similar thing
with delimiters. It also makes `ast::TokenKind` more similar to
`parser::TokenType`.
This requires a few new methods:
- `TokenKind::is_{,open_,close_}delim()` replace various kinds of
pattern matches.
- `Delimiter::as_{open,close}_token_kind` are used to convert
`Delimiter` values to `TokenKind`.
Despite these additions, it's a net reduction in lines of code. This is
because e.g. `token::OpenParen` is so much shorter than
`token::OpenDelim(Delimiter::Parenthesis)` that many multi-line forms
reduce to single line forms. And many places where the number of lines
doesn't change are still easier to read, just because the names are
shorter, e.g.:
```
- } else if self.token != token::CloseDelim(Delimiter::Brace) {
+ } else if self.token != token::CloseBrace {
```
|
|
nnethercote:rm-Nonterminal-and-TokenKind-Interpolated, r=petrochenkov
Remove `Nonterminal` and `TokenKind::Interpolated`
A third attempt at this; the first attempt was #96724 and the second was #114647.
r? `@ghost`
|
|
Fixes #139445.
The additional errors aren't great but the first one is still good and
it's the most important, and imperfect errors are better than ICEing.
|
|
This can happen when invalid syntax is passed to a declarative macro. We
shouldn't be too strict about the token stream position once the parser
has rejected the invalid syntax.
Fixes #139248.
|
|
Fixes #137874.
Removes `tests/crashes/137874.rs`; the new test is simpler (defines its
own macro) but tests the same thing.
The changes to the output of `tests/ui/associated-consts/issue-93835.rs`
partly undo the changes seen when `NtTy` was removed in #133436, which
is good.
|
|
|
|
`NtBlock` is the last remaining variant of `Nonterminal`, so once it is
gone then `Nonterminal` can be removed as well.
|
|
In favour of the similar method on `Parser`, which works on things
other than identifiers and lifetimes.
|
|
Notes about tests:
- tests/ui/rfcs/rfc-2294-if-let-guard/feature-gate.rs: some messages are
now duplicated due to repeated parsing.
- tests/ui/rfcs/rfc-2497-if-let-chains/disallowed-positions.rs: ditto.
- `tests/ui/proc-macro/macro-rules-derive-cfg.rs`: the diff looks large
but the only difference is the insertion of a single
invisible-delimited group around a metavar.
- `tests/ui/attributes/nonterminal-expansion.rs`: a slight span
degradation, somehow related to the recent massive attr parsing
rewrite (#135726). I couldn't work out exactly what is going wrong,
but I don't think it's worth holding things up for a single slightly
suboptimal error message.
|
|
|
|
moving it to the header.
|
|
Remove `NtItem` and `NtStmt`
Another piece of #124141.
r? `@petrochenkov`
|
|
This involves replacing `nt_pretty_printing_compatibility_hack` with
`stream_pretty_printing_compatibility_hack`.
The handling of statements in `transcribe` is slightly different to
other nonterminal kinds, due to the lack of `from_ast` implementation
for empty statements.
Notable test changes:
- `tests/ui/proc-macro/expand-to-derive.rs`: the diff looks large but
the only difference is the insertion of a single invisible-delimited
group around a metavar.
|
|
|
|
`BinOpToken` is badly named, because it only covers the assignable
binary ops and excludes comparisons and `&&`/`||`. Its use in
`ast::TokenKind` does allow a small amount of code sharing, but it's a
clumsy factoring.
This commit removes `ast::TokenKind::BinOp{,Eq}`, replacing each one
with 10 individual variants. This makes `ast::TokenKind` more similar to
`rustc_lexer::TokenKind`, which has individual variants for all
operators.
Although the number of lines of code increases, the number of chars
decreases due to the frequent use of shorter names like `token::Plus`
instead of `token::BinOp(BinOpToken::Plus)`.
|
|
|
|
Note: there was an existing code path involving `Interpolated` in
`MetaItem::from_tokens` that was dead. This commit transfers that to the
new form, but puts an `unreachable!` call inside it.
|
|
The one notable test change is `tests/ui/macros/trace_faulty_macros.rs`.
This commit removes the complicated `Interpolated` handling in
`expected_expression_found` that results in a longer error message. But
I think the new, shorter message is actually an improvement.
The original complaint was in #71039, when the error message started
with "error: expected expression, found `1 + 1`". That was confusing
because `1 + 1` is an expression. Other than that, the reporter said
"the whole error message is not too bad if you ignore the first line".
Subsequently, extra complexity and wording was added to the error
message. But I don't think the extra wording actually helps all that
much. In particular, it still says of the `1+1` that "this is expected
to be expression". This repeats the problem from the original complaint!
This commit removes the extra complexity, reverting to a simpler error
message. This is primarily because the traversal is a pain without
`Interpolated` tokens. Nonetheless, I think the error message is
*improved*. It now starts with "expected expression, found `pat`
metavariable", which is much clearer and the real problem. It also
doesn't say anything specific about `1+1`, which is good, because the
`1+1` isn't really relevant to the error -- it's the `$e:pat` that's
important.
|
|
|
|
|
|
Notes about tests:
- tests/ui/parser/macro/trait-object-macro-matcher.rs: the syntax error
is duplicated, because it occurs now when parsing the decl macro
input, and also when parsing the expanded decl macro. But this won't
show up for normal users due to error de-duplication.
- tests/ui/associated-consts/issue-93835.rs: similar, plus there are
some additional errors about this very broken code.
- The changes to metavariable descriptions in #132629 are now visible in
error message for several tests.
|
|
We now use invisible delimiters for expanded `vis` fragments, instead of
`Token::Interpolated`.
|
|
Because of an ambiguity with const closures, the parser needs to ensure
that for a const item, the `const` keyword isn't followed by a `move` or
`static` keyword, as that would indicate a const closure:
```rust
fn main() {
const move // ...
}
```
This check did not take raw identifiers into account, therefore being
unable to distinguish between `const move` and `const r#move`. The
latter is obviously not a const closure, so it should be allowed as a
const item.
This fixes the check in the parser to only treat `const ...` as a const
closure if it's followed by the *proper keyword*, and not a raw
identifier.
Additionally, this adds a large test that tests for all raw identifiers in
all kinds of positions, including `const`, to prevent issues like this
one from occurring again.
|
|
|
|
simplify `similar_tokens` from `Option<Vec<_>>` to `&[_]`
All uses immediately invoke contains, so maybe a further simplification is possible.
|
|
formatting traits
|
|
|
|
The size of this struct depends on the alignment of `u128`, for example
powerpc64le and s390x have align-8 and end up with only 280 bytes. Our
64-bit tier-1 arches are the same though, so let's just assert on those.
|
|
|
|
For some reason the memory layout is different on s390x.
|
|
The parser pushes a `TokenType` to `Parser::expected_token_types` on
every call to the various `check`/`eat` methods, and clears it on every
call to `bump`. Some of those `TokenType` values are full tokens that
require cloning and dropping. This is a *lot* of work for something
that is only used in error messages and it accounts for a significant
fraction of parsing execution time.
This commit overhauls `TokenType` so that `Parser::expected_token_types`
can be implemented as a bitset. This requires changing `TokenType` to a
C-style parameterless enum, and adding `TokenTypeSet` which uses a
`u128` for the bits. (The new `TokenType` has 105 variants.)
The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to
the `check`/`eat` methods. This is for maximum speed. The elements in
the pairs are always statically known; e.g. a
`token::BinOp(token::Star)` is always paired with a `TokenType::Star`.
So we now compute `TokenType`s in advance and pass them in to
`check`/`eat` rather than the current approach of constructing them on
insertion into `expected_token_types`.
Values of these pair types can be produced by the new `exp!` macro,
which is used at every `check`/`eat` call site. The macro is for
convenience, allowing any pair to be generated from a single identifier.
The ident/keyword filtering in `expected_one_of_not_found` is no longer
necessary. It was there to account for some sloppiness in
`TokenKind`/`TokenType` comparisons.
The existing `TokenType` is moved to a new file `token_type.rs`, and all
its new infrastructure is added to that file. There is more boilerplate
code than I would like, but I can't see how to make it shorter.
|