| Age | Commit message | Author | Lines |
|
lexer: Fix span override for the first token in a string
Previously due to peculiarities of `StringReader` construction something like `"a b c d".parse::<TokenStream>()` gave you one non-overridden span for `a` and then three correctly overridden spans for `b`, `c` and `d`.
Now all the spans are overridden.
|
|
rustc: introduce {ast,hir}::AnonConst to consolidate so-called "embedded constants".
Previously, constants in array lengths and enum variant discriminants were "merely an expression", and had no separate ID for, e.g. type-checking or const-eval, instead reusing the expression's.
That complicated code working with bodies, because such constants were the only special case where the "owner" of the body wasn't the HIR parent, but rather the same node as the body itself.
Also, if the body happened to be a closure, we had no way to allocate a `DefId` for both the constant *and* the closure, leading to *several* bugs (mostly ICEs where type errors were expected).
This PR rectifies the situation by adding another (`{ast,hir}::AnonConst`) node around every such constant. Also, const generics are expected to rely on the new `AnonConst` nodes, as well (cc @varkor).
* fixes #48838
* fixes #50600
* fixes #50688
* fixes #50689
* obsoletes #50623
r? @nikomatsakis
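
For orientation, the two source positions the message refers to (array lengths and enum variant discriminants) look like this in user code. This is only an illustration of where such embedded constants appear, not of the compiler-internal `AnonConst` node itself; the names `N`, `Flag` and `make_buf` are made up for the example.

```rust
const N: usize = 2;

// Each discriminant expression (`1 << 0`, `1 << 1`) is an embedded constant.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Flag {
    Read = 1 << 0,
    Write = 1 << 1,
}

// The array length `N * 2 + 1` is an embedded constant as well.
fn make_buf() -> [u8; N * 2 + 1] {
    [0u8; N * 2 + 1]
}

fn main() {
    assert_eq!(make_buf().len(), 5);
    assert_eq!(Flag::Read as u8, 1);
    assert_eq!(Flag::Write as u8, 2);
}
```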
|
|
Consider this a down payment on #50723. To recap, an `Applicability`
enum was recently (#50204) added, to convey to Rustfix and other tools
whether we think it's OK for them to blindly apply the suggestion, or
whether to prompt a human for guidance (because the suggestion might
contain placeholders that we can't infer, or because we think it has a
sufficiently high probability of being wrong even though it's—
presumably—right often enough to be worth emitting in the first place).
When a suggestion is marked as `MaybeIncorrect`, we try to use comments
to indicate precisely why (although there are a few places where we just
say `// speculative` because the present author's subjective judgement
balked at the idea that the suggestion has no false positives).
The `run-rustfix` directive is opportunistically set on some relevant UI
tests (and a couple tests that were in the `test/ui/suggestions`
directory, even if the suggestions didn't originate in librustc or
libsyntax). This is less trivial than it sounds, because a surprising
number of test files aren't equipped to be tested as fixed even when
they contain successfully fixable errors, because, e.g., there are more,
not-directly-related errors after fixing. Some test files need an
attribute or underscore to avoid unused warnings tripping up the "fixed
code is still producing diagnostics" check despite the fixes being
correct; this is an interesting contrast-to/inconsistency-with the
behavior of UI tests (which secretly pass `-A unused`), a behavior which
we probably ought to resolve one way or the other (filed issue #50926).
A few suggestion labels are reworded (e.g., to avoid phrasing it as a
question, which is discouraged by the style guidelines listed in
`.span_suggestion`'s doc-comment).
|
|
|
|
|
|
Speed up the macro parser
These three commits reduce the number of allocations done by the macro parser, in some cases dramatically. For example, for clean check builds of html5ever, the number of allocations is reduced by 40%.
Here are the rustc-benchmarks that are sped up by at least 1%.
```
html5ever-check
avg: -6.6% min: -10.3% max: -4.1%
html5ever
avg: -5.2% min: -9.5% max: -2.8%
html5ever-opt
avg: -4.3% min: -9.3% max: -1.6%
crates.io-check
avg: -1.8% min: -2.9% max: -0.6%
crates.io-opt
avg: -1.0% min: -2.2% max: -0.1%
crates.io
avg: -1.1% min: -2.2% max: -0.2%
```
|
|
|
|
This commit fixes `StringReader`'s parsing of tokens which have been stringified
through procedural macros. Whether or not a token tree is joint is defined by
span information, but when working with procedural macros these spans are often
dummy and/or overridden which means that they end up considering all operators
joint if they can!
The fix here is to track the raw source span as opposed to the overridden span.
With this information we can more accurately classify `Punct` structs as either
joint or not.
Closes #50700
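
A minimal sketch of the idea, assuming a simplified model rather than the real lexer types: `RawSpan`, `Spacing` and `spacing` below are hypothetical names. An operator is treated as joint only if the next token is also an operator and begins exactly where this one ends in the raw source text.

```rust
/// Hypothetical stand-in for a raw (non-overridden) source span in byte offsets.
#[derive(Clone, Copy, Debug, PartialEq)]
struct RawSpan {
    lo: u32,
    hi: u32,
}

#[derive(Debug, PartialEq)]
enum Spacing {
    Joint,
    Alone,
}

/// Classify an operator using raw source positions only.
/// `next` carries the following token's raw span and whether it is an operator.
fn spacing(cur: RawSpan, next: Option<(RawSpan, bool)>) -> Spacing {
    match next {
        Some((next_span, next_is_op)) if next_is_op && cur.hi == next_span.lo => Spacing::Joint,
        _ => Spacing::Alone,
    }
}

fn main() {
    // Source `+=`: `+` occupies 0..1 and `=` occupies 1..2, so `+` is joint.
    let plus = RawSpan { lo: 0, hi: 1 };
    let eq_adjacent = RawSpan { lo: 1, hi: 2 };
    assert_eq!(spacing(plus, Some((eq_adjacent, true))), Spacing::Joint);

    // Source `+ =`: a space separates the two, so `+` stands alone.
    let eq_spaced = RawSpan { lo: 2, hi: 3 };
    assert_eq!(spacing(plus, Some((eq_spaced, true))), Spacing::Alone);
}
```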
|
|
Because we create a lot of these in the macro parser, but only very
rarely modify them.
This speeds up some html5ever runs by 2--3%.
|
|
Implement edition hygiene for keywords
Determine "keywordness" of an identifier in its hygienic context.
cc https://github.com/rust-lang/rust/pull/49611
I've resurrected `proc` as an Edition-2015-only keyword for testing purposes, but it should probably be buried again. EDIT: `proc` is removed again.
|
|
Streamline `StringReader::bump`
These patches make `bump` smaller and nicer. They speed up most runs for coercions and tuple-stress by 1--3%.
|
|
|
|
|
|
Implement label break value (RFC 2046)
Implement label-break-value (#48594).
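
A short example of what the feature allows: breaking out of a labeled plain block with a value. At the time of this commit it sat behind a feature gate; the syntax was later stabilized (Rust 1.65), so the sketch below compiles on a recent stable toolchain. The function name `classify` is made up for the example.

```rust
fn classify(n: i32) -> &'static str {
    // A labeled block evaluates to the value passed to `break 'found ...`,
    // or to its tail expression if no `break` is taken.
    let label = 'found: {
        if n < 0 {
            break 'found "negative";
        }
        if n == 0 {
            break 'found "zero";
        }
        "positive"
    };
    label
}

fn main() {
    assert_eq!(classify(-5), "negative");
    assert_eq!(classify(0), "zero");
    assert_eq!(classify(7), "positive");
}
```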
|
|
|
|
|
|
|
|
It only has a single use, within code handling indented block comments.
We can replace that with the new `FileMap::col_pos()`, which computes
the col position (BytePos instead of CharPos) based on the record of the
last newline char (which we already record).
This is actually an improvement, because
`trim_whitespace_prefix_and_push_line()` was using `col`, which is a
`CharPos`, as a slice index, which is a byte/char confusion.
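
The byte/char confusion can be reproduced outside the compiler. In the sketch below (the helper `byte_index_of_char` is a made-up name), a line containing multi-byte characters has different character and byte columns for the same position, and slicing with the character column silently yields the wrong substring.

```rust
/// Convert a character column into the byte index needed for slicing.
fn byte_index_of_char(s: &str, char_idx: usize) -> Option<usize> {
    s.char_indices().nth(char_idx).map(|(byte_idx, _)| byte_idx)
}

fn main() {
    // Each `é` is two bytes in UTF-8, so the columns diverge after them.
    let line = "éé  /* comment */";

    let char_col = line.chars().position(|c| c == '/').unwrap();
    let byte_col = line.find('/').unwrap();
    assert_eq!(char_col, 4);
    assert_eq!(byte_col, 6);

    // Slicing with the byte column gives the comment text.
    assert_eq!(&line[byte_col..], "/* comment */");
    // Slicing with the char column is off by two bytes here; with other
    // inputs it could even land inside a character and panic.
    assert_eq!(&line[char_col..], "  /* comment */");

    assert_eq!(byte_index_of_char(line, char_col), Some(byte_col));
}
```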
|
|
|
|
Implements RFC 1576.
See: https://github.com/rust-lang/rfcs/blob/master/text/1576-macros-literal-matcher.md
Changes are mostly in libsyntax, docs, and tests. Feature gate is
enabled for 1.27.0.
Many thanks to Vadim Petrochenkov for following through code reviews
and suggestions.
Example:
```rust
macro_rules! test_literal {
($l:literal) => {
println!("literal: {}", $l);
};
($e:expr) => {
println!("expr: {}", $e);
};
}
fn main() {
let a = 1;
test_literal!(a);
test_literal!(2);
test_literal!(-3);
}
```
Output:
```
expr: 1
literal: 2
literal: -3
```
|
|
It's silly for a hot function like `bump()` to have such an expensive
bounds check. This patch replaces `terminator` with `end_src_index`.
Note that the `self.terminator` check in `is_eof()` wasn't necessary
because of the way `StringReader` is initialized.
|
|
- `source_text` becomes `src`, matching `FileMap::src`.
- `byte_offset()` becomes `src_index()`, which makes it clearer that
it's an index into `src`. (Likewise for variables containing
`byte_offset` in their name.) This function also now returns a `usize`
instead of a `BytePos`, because every callsite immediately converted
the `BytePos` to a `usize`.
|
|
This patch removes the "old"/"new" names in favour of "foo"/"next_foo",
which matches the field names.
It also moves the setting of `self.{ch,pos,next_pos}` in the common case
to the end, so that the meaning of "foo"/"next_foo" is consistent until
the end.
|
|
|
|
In the common case, the string value in a string literal Token is the
same as the string value in a string literal LitKind. (The exception is
when escapes or \r are involved.) This patch takes advantage of that to
avoid calling str_lit() and re-interning the string in that case. This
speeds up incremental builds for a few of the rustc-benchmarks, the best
by 3%.
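
The shape of this fast path can be sketched with `Cow`. This is an illustration under simplifying assumptions, not the compiler's `str_lit()`: the function `unescape` is hypothetical and handles only a few escapes. When the literal contains neither a backslash nor `\r`, the input is returned as-is and no new string is built.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified unescaper: borrows the input when there is
/// nothing to process, and allocates only when escapes or `\r` are present.
fn unescape(lit: &str) -> Cow<'_, str> {
    if !lit.contains('\\') && !lit.contains('\r') {
        // Common case: the token text already is the literal's value.
        return Cow::Borrowed(lit);
    }
    let mut out = String::with_capacity(lit.len());
    let mut chars = lit.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                Some('\\') => out.push('\\'),
                // Unknown escapes are kept verbatim in this sketch.
                Some(other) => {
                    out.push('\\');
                    out.push(other);
                }
                None => out.push('\\'),
            },
            // Normalize "\r\n" to "\n" by dropping the '\r'.
            '\r' if chars.peek() == Some(&'\n') => {}
            other => out.push(other),
        }
    }
    Cow::Owned(out)
}

fn main() {
    assert!(matches!(unescape("plain text"), Cow::Borrowed(_)));
    assert_eq!(unescape("a\\nb"), "a\nb");
    assert_eq!(unescape("line\r\nnext"), "line\nnext");
}
```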
|
|
Rollup of 12 pull requests
Successful merges:
- #50302 (Add query search order check)
- #50320 (Fix invalid path generation in rustdoc search)
- #50349 (Rename "show type declaration" to "show declaration")
- #50360 (Clarify wordings of the `unstable_name_collision` lint.)
- #50365 (Use two vectors in nearest_common_ancestor.)
- #50393 (Allow unaligned reads in constants)
- #50401 (Revert "Implement FromStr for PathBuf")
- #50406 (Forbid constructing empty identifiers from concat_idents)
- #50407 (Always inline simple BytePos and CharPos methods.)
- #50416 (check if the token is a lifetime before parsing)
- #50417 (Update Cargo)
- #50421 (Fix ICE when using a..=b in a closure.)
Failed merges:
|
|
Implement tool_attributes feature (RFC 2103)
cc #44690
This is currently just a rebased and compiling (hopefully) version of #47773.
Let's see if travis likes this. I will add the implementation for `tool_lints` this week.
|
|
|
|
str::escape_default() can be used instead.
|
|
|
|
|
|
|
|
'label can start expressions
```Rust
let foo = 'label: loop { break 'label 42; };
```
is valid Rust code.
|
|
|
|
let foo = 'label: loop { break 'label 42; };
is valid Rust code.
|
|
Doc comments present after a particular syntax error cause an unhelpful error message to be output.
fixed: #48636
r? @estebank
|
|
2) Changed position of help message, in case comma is missing
3) Added a few missing spaces and handled span_suggestion for vscode
4) Updated stderr file
|
|
|
|
Avoid allocating when parsing \u{...} literals.
`char_lit` uses an allocation in order to ignore '_' chars in \u{...}
literals. This patch changes it to not do that by processing the chars
more directly.
This improves various rustc-perf benchmark measurements by up to 6%,
particularly regex, futures, clap, coercions, hyper, and encoding.
rustc-perf results, on a stage 2 build with jemalloc disabled:
<details>
```
regex-check
avg: -5.4% min: -6.5% max: -2.7%
futures-check
avg: -3.5% min: -5.3% max: -1.7%
regex-opt
avg: -2.0% min: -5.1% max: -0.2%
regex
avg: -2.3% min: -5.0% max: -0.6%
futures-opt
avg: -3.0% min: -4.8% max: -1.1%
futures
avg: -3.1% min: -4.8% max: -1.3%
clap-rs-check
avg: -1.8% min: -3.5% max: -0.9%
coercions-check
avg: -2.0% min: -3.3% max: -1.0%
hyper-check
avg: -2.2% min: -3.1% max: -1.3%
hyper
avg: -1.3% min: -2.4% max: -0.3%
hyper-opt
avg: -0.9% min: -2.3% max: -0.1%
coercions
avg: -1.1% min: -2.2% max: -0.4%
encoding-check
avg: -1.7% min: -2.2% max: -0.9%
clap-rs-opt
avg: -0.7% min: -2.2% max: 0.0%
coercions-opt
avg: -1.2% min: -2.1% max: -0.3%
clap-rs
avg: -0.8% min: -1.9% max: -0.4%
encoding-opt
avg: -1.0% min: -1.9% max: -0.3%
encoding
avg: -1.1% min: -1.9% max: -0.4%
piston-image-check
avg: -0.7% min: -1.3% max: -0.3%
inflate-opt
avg: -0.3% min: -0.9% max: -0.0%
piston-image
avg: -0.3% min: -0.8% max: -0.1%
piston-image-opt
avg: -0.3% min: -0.7% max: -0.1%
syn-check
avg: -0.3% min: -0.6% max: -0.1%
deep-vector
avg: 0.1% min: -0.1% max: 0.5%
syn-opt
avg: -0.1% min: -0.4% max: 0.0%
html5ever
avg: -0.2% min: -0.4% max: -0.0%
deep-vector-check
avg: 0.0% min: -0.3% max: 0.3%
syn
avg: -0.2% min: -0.3% max: -0.1%
html5ever-check
avg: -0.3% min: -0.3% max: -0.2%
issue-46449-check
avg: -0.1% min: -0.2% max: 0.2%
html5ever-opt
avg: -0.0% min: -0.2% max: 0.1%
deep-vector-opt
avg: -0.0% min: -0.2% max: 0.1%
issue-46449-opt
avg: -0.0% min: -0.2% max: 0.1%
unify-linearly-check
avg: -0.0% min: -0.2% max: 0.1%
helloworld-check
avg: 0.0% min: -0.0% max: 0.2%
parser-check
avg: -0.0% min: -0.2% max: 0.0%
inflate
avg: 0.0% min: -0.0% max: 0.1%
tokio-webpush-simple-check
avg: -0.1% min: -0.1% max: -0.0%
regression-31157-check
avg: 0.0% min: -0.1% max: 0.1%
issue-46449
avg: 0.0% min: -0.1% max: 0.1%
tuple-stress-opt
avg: 0.0% min: -0.0% max: 0.1%
tuple-stress-check
avg: -0.0% min: -0.1% max: 0.1%
tuple-stress
avg: 0.0% min: -0.0% max: 0.1%
deeply-nested-check
avg: 0.0% min: -0.0% max: 0.1%
regression-31157
avg: -0.0% min: -0.1% max: 0.1%
deeply-nested-opt
avg: -0.0% min: -0.1% max: 0.1%
parser-opt
avg: -0.0% min: -0.1% max: 0.0%
parser
avg: 0.1% min: 0.0% max: 0.1%
tokio-webpush-simple
avg: -0.0% min: -0.1% max: 0.1%
regression-31157-opt
avg: -0.0% min: -0.1% max: 0.1%
helloworld-opt
avg: 0.0% min: -0.0% max: 0.1%
unify-linearly-opt
avg: 0.0% min: -0.0% max: 0.1%
unused-warnings-check
avg: 0.0% min: 0.0% max: 0.1%
tokio-webpush-simple-opt
avg: -0.0% min: -0.1% max: 0.0%
helloworld
avg: -0.0% min: -0.0% max: 0.1%
unused-warnings
avg: 0.0% min: -0.0% max: 0.0%
deeply-nested
avg: -0.0% min: -0.0% max: -0.0%
unused-warnings-opt
avg: 0.0% min: -0.0% max: 0.0%
unify-linearly
avg: 0.0% min: -0.0% max: 0.0%
inflate-check
avg: 0.0% min: -0.0% max: 0.0%
```
</details>
|
|
As discovered in #50061, we're falling off the "happy path" of using a stringified
token stream more often than we should. This was due to the fact that a
user-written token like `0xf` is equality-different from the stringified token
of `15` (despite being semantically equivalent).
This patch updates the call to `eq_unspanned` with an even more awful solution,
`probably_equal_for_proc_macro`, which ignores the value of each token and
basically only compares the structure of the token stream, assuming that the AST
doesn't change just one token at a time.
While this is a step towards fixing #50061 there is still one regression
from #49154 which needs to be fixed.
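
A rough sketch of a structure-only comparison, assuming a toy token type rather than the real `TokenStream`: `Tok` and `probably_equal` below are hypothetical. Token kinds and nesting must match, but the values carried by identifiers and literals are ignored, so `0xf` and `15` compare as equal.

```rust
/// Toy token type for illustration only.
// The payload strings are deliberately never compared, hence the allow.
#[allow(dead_code)]
#[derive(Debug)]
enum Tok {
    Ident(String),
    Literal(String),
    Punct(char),
    Group(char, Vec<Tok>),
}

/// Compare only the shape of two token sequences, not the token values.
fn probably_equal(a: &[Tok], b: &[Tok]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|(x, y)| match (x, y) {
            (Tok::Ident(_), Tok::Ident(_)) => true,
            (Tok::Literal(_), Tok::Literal(_)) => true,
            (Tok::Punct(p), Tok::Punct(q)) => p == q,
            (Tok::Group(d1, inner1), Tok::Group(d2, inner2)) => {
                d1 == d2 && probably_equal(inner1, inner2)
            }
            _ => false,
        })
}

fn main() {
    // `x = 0xf` as written by the user vs. `x = 15` after stringification.
    let written = vec![
        Tok::Ident("x".into()),
        Tok::Punct('='),
        Tok::Literal("0xf".into()),
    ];
    let stringified = vec![
        Tok::Ident("x".into()),
        Tok::Punct('='),
        Tok::Literal("15".into()),
    ];
    assert!(probably_equal(&written, &stringified));

    // A structural change (an extra token) is detected.
    let changed = vec![Tok::Ident("x".into()), Tok::Punct('=')];
    assert!(!probably_equal(&written, &changed));
}
```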
|
|
`char_lit` uses an allocation in order to ignore '_' chars in \u{...}
literals. This patch changes it to not do that by processing the chars
more directly.
This improves various rustc-perf benchmark measurements by up to 6%,
particularly regex, futures, clap, coercions, hyper, and encoding.
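
One way to process the characters directly, shown as a standalone sketch rather than the actual `char_lit` code (the function `parse_unicode_escape` is hypothetical): accumulate the hex digits into a `u32` while skipping `_`, instead of first building a filtered `String`.

```rust
/// Parse the body of a `\u{...}` escape (the text between the braces),
/// skipping '_' separators without allocating an intermediate string.
fn parse_unicode_escape(body: &str) -> Option<char> {
    let mut value: u32 = 0;
    let mut digits = 0;
    for c in body.chars() {
        if c == '_' {
            continue;
        }
        let d = c.to_digit(16)?;
        digits += 1;
        // At most six hex digits are allowed, so `value` cannot overflow.
        if digits > 6 {
            return None;
        }
        value = value * 16 + d;
    }
    if digits == 0 {
        return None;
    }
    // Rejects surrogates and values above U+10FFFF.
    char::from_u32(value)
}

fn main() {
    assert_eq!(parse_unicode_escape("41"), Some('A'));
    assert_eq!(parse_unicode_escape("1F_600"), Some('\u{1F600}'));
    assert_eq!(parse_unicode_escape("D800"), None);
    assert_eq!(parse_unicode_escape(""), None);
}
```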
|
|
Change the hashcounts in raw `Lit` variants from usize to u16.
This reduces the size of `Token` from 32 bytes to 24 bytes on 64-bit
platforms.
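
The effect of narrowing a payload field can be observed with `std::mem::size_of`. The two enums below are made-up stand-ins, not the real `Lit` or `Token` types, so the exact byte counts differ from the 32 and 24 quoted above; the sketch only shows that a `usize` field forces a larger, more strictly aligned enum than a `u16` field does.

```rust
use std::mem::size_of;

// Hypothetical literal enum with a `usize` hash count.
#[allow(dead_code)]
enum LitWide {
    Str(u32),
    StrRaw(u32, usize),
}

// Same shape, with the hash count narrowed to `u16`.
#[allow(dead_code)]
enum LitNarrow {
    Str(u32),
    StrRaw(u32, u16),
}

fn main() {
    println!(
        "wide: {} bytes, narrow: {} bytes",
        size_of::<LitWide>(),
        size_of::<LitNarrow>()
    );
    assert!(size_of::<LitNarrow>() < size_of::<LitWide>());
}
```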
|
|
proc_macro: Avoid cached TokenStream more often
This commit adds even more pessimization to use the cached `TokenStream` inside
of an AST node. As a reminder the `proc_macro` API requires taking an arbitrary
AST node and transforming it back into a `TokenStream` to hand off to a
procedural macro. Such functionality isn't actually implemented in rustc today,
so the way `proc_macro` works today is that it stringifies an AST node and then
reparses for a list of tokens.
This strategy unfortunately loses all span information, so we try to avoid it
whenever possible. As implemented in #43230, some AST nodes have a `TokenStream`
cache representing the tokens they were originally parsed from. This
`TokenStream` cache, however, has turned out to not always reflect the current
state of the item when it's being tokenized. For example `#[cfg]` processing or
macro expansion could modify the state of an item. Consequently we've seen a
number of bugs (#48644 and #49846) related to using this stale cache.
This commit tweaks the usage of the cached `TokenStream` to compare it to our
lossy stringification of the token stream. If the tokens that make up the cache
and the stringified token stream are the same then we return the cached version
(which has correct span information). If they differ, however, then we will
return the stringified version as the cache has been invalidated and we just
haven't figured that out.
Closes #48644
Closes #49846
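
The decision described above can be sketched as follows, assuming a toy token type: `Tok`, `same_tokens` and `tokens_for_macro` are hypothetical names. The cached tokens (which carry spans) are returned only if they match the freshly reparsed ones; otherwise the cache is treated as stale and the span-less reparsed tokens are used.

```rust
/// Toy token: text plus an optional span (lost when reparsing from a string).
#[derive(Clone, Debug, PartialEq)]
struct Tok {
    text: String,
    span: Option<(u32, u32)>,
}

/// Compare two token lists while ignoring spans.
fn same_tokens(a: &[Tok], b: &[Tok]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.text == y.text)
}

/// Prefer the cache when it still reflects the item; fall back otherwise.
fn tokens_for_macro(cached: Option<Vec<Tok>>, reparsed: Vec<Tok>) -> Vec<Tok> {
    match cached {
        Some(c) if same_tokens(&c, &reparsed) => c,
        _ => reparsed,
    }
}

fn tok(text: &str, span: Option<(u32, u32)>) -> Tok {
    Tok { text: text.to_string(), span }
}

fn main() {
    let cached = vec![tok("fn", Some((0, 2))), tok("f", Some((3, 4)))];

    // Cache still valid: the spanned tokens are returned.
    let fresh = vec![tok("fn", None), tok("f", None)];
    let out = tokens_for_macro(Some(cached.clone()), fresh);
    assert_eq!(out[0].span, Some((0, 2)));

    // Item was modified (e.g. by `#[cfg]` processing): the cache is ignored.
    let modified = vec![tok("fn", None), tok("g", None)];
    let out = tokens_for_macro(Some(cached), modified);
    assert_eq!(out[1].text, "g");
    assert_eq!(out[1].span, None);
}
```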
|
|
|
|
Resolve them into field indices once and then use those resolutions
+ Fix rebase
|
|
|
|
This reduces the size of `Token` from 32 bytes to 24 bytes on 64-bit
platforms.
|
|
Merge the std_unicode crate into the core crate
[The standard library facade](https://github.com/rust-lang/rust/issues/27783) has historically contained a number of crates with different roles, but that number has decreased over time. `rand` and `libc` have moved to crates.io, and [`collections` was merged into `alloc`](https://github.com/rust-lang/rust/pull/42648). Today we have `core` that applies everywhere, `std` that expects a full operating system, and `alloc` in-between that only requires a memory allocator (which can be provided by users)… and `std_unicode`, which doesn’t really have a reason to be separate anymore. It contains functionality based on Unicode data tables that can be large, but as long as relevant functions are not called the tables should be removed from binaries by linkers.
This deprecates the unstable `std_unicode` crate and moves all of its contents into `core`, replacing them with `pub use` reexports. The crate can be removed later. This also removes the `CharExt` trait (replaced with inherent methods in libcore) and `UnicodeStr` trait (merged into `StrExt`). These traits were both unstable and not intended to be used or named directly.
A number of items are newly available in libcore and instantly stable there, but only if they were already stable in libstd.
Fixes #49319.
|
|
|
|
Use sort_by_cached_key where appropriate
A follow-up to https://github.com/rust-lang/rust/pull/48639, converting various slice sorting calls to `sort_by_cached_key` when the key functions are more expensive.
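
For reference, a small usage example of the standard library's `sort_by_cached_key`: it computes each key once up front instead of on every comparison, which pays off when the key function allocates or is otherwise costly. The helper name `sort_case_insensitive` is made up for the example.

```rust
/// Sort strings case-insensitively; `to_lowercase` allocates, so caching
/// the key avoids recomputing it for every comparison.
fn sort_case_insensitive(names: &mut [String]) {
    names.sort_by_cached_key(|s| s.to_lowercase());
}

fn main() {
    let mut names = vec![
        "gamma".to_string(),
        "Alpha".to_string(),
        "beta".to_string(),
    ];
    sort_case_insensitive(&mut names);
    assert_eq!(names, ["Alpha", "beta", "gamma"]);
}
```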
|