use ? to simplify `TransitiveRelation.maybe_map`
I think this looks much clearer than the original.
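The `?`-based simplification can be sketched as follows; the function below is an illustrative stand-in, not the actual `maybe_map`:

```rust
// Illustrative stand-in for the pattern (not the real `maybe_map`):
// `?` on an Option returns None early, replacing an explicit match.
fn maybe_double(x: Option<u32>) -> Option<u32> {
    // Before: let v = match x { Some(v) => v, None => return None };
    let v = x?;
    Some(v * 2)
}
```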
|
|
Avoid many allocations for CStrings during codegen.
Giving in to my irrational fear of dynamic allocations. Let's see what perf says to this.
|
|
[nll] enable feature(nll) on various crates for bootstrap: part 4
#53172
r? @nikomatsakis
|
|
A few cleanups for rustc_data_structures
- remove a redundant `clone()`
- make some calls to `.iter()` implicit
- collapse/simplify a few operations
- remove some explicit `return`s
- make `SnapshotMap::{commit, rollback_to}` take references
- remove unnecessary struct field names
- change `transmute()`s in `IdxSet::{from_slice, from_slice_mut}` to casts
- remove some unnecessary lifetime annotations
- split 2 long literals
|
|
Don't collect() when size_hint is useless
This adjusts PRs #52738 and #52697 by falling back to calculating capacity and extending or pushing in a loop where `collect()` can't be trusted to calculate the right capacity.
It is a performance win.
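The fallback pattern described above can be sketched like this (names are illustrative, not the actual rustc code): when an iterator's `size_hint` lower bound under-reports, pre-compute the capacity and extend, rather than trusting `collect()`:

```rust
// When an iterator's `size_hint` lower bound is unreliable (e.g. after
// a `filter`, it reports 0), `collect()` may start with too small an
// allocation. If the final length is known, reserve it up front.
fn collect_with_known_len(len: usize, iter: impl Iterator<Item = u32>) -> Vec<u32> {
    let mut v = Vec::with_capacity(len);
    v.extend(iter);
    v
}
```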
|
|
Add errors for unknown, stable and duplicate feature attributes
- Adds an error for unknown (lang and lib) features.
- Extends the lint for unnecessary feature attributes for stable features to libs features (this already exists for lang features).
- Adds an error for duplicate (lang and lib) features.
```rust
#![feature(fake_feature)] //~ ERROR unknown feature `fake_feature`
#![feature(i128_type)] //~ WARNING the feature `i128_type` has been stable since 1.26.0
#![feature(non_exhaustive)]
#![feature(non_exhaustive)] //~ ERROR duplicate `non_exhaustive` feature attribute
```
Fixes #52053, fixes #53032, and addresses some of the problems noted in #44232 (though not unused features).
There are a few outstanding problems, that I haven't narrowed down yet:
- [x] Stability attributes on macros do not seem to be taken into account.
- [x] Stability attributes behind `cfg` attributes are not taken into account.
- [x] There are failing incremental tests.
|
|
also add benchmarks
Before:
```
test tiny_list::test::bench_insert_empty ... bench: 1 ns/iter (+/- 0)
test tiny_list::test::bench_insert_one ... bench: 16 ns/iter (+/- 0)
test tiny_list::test::bench_remove_empty ... bench: 2 ns/iter (+/- 0)
test tiny_list::test::bench_remove_one ... bench: 6 ns/iter (+/- 0)
test tiny_list::test::bench_remove_unknown ... bench: 4 ns/iter (+/- 0)
```
After:
```
test tiny_list::test::bench_insert_empty ... bench: 1 ns/iter (+/- 0)
test tiny_list::test::bench_insert_one ... bench: 16 ns/iter (+/- 0)
test tiny_list::test::bench_remove_empty ... bench: 0 ns/iter (+/- 0)
test tiny_list::test::bench_remove_one ... bench: 3 ns/iter (+/- 0)
test tiny_list::test::bench_remove_unknown ... bench: 2 ns/iter (+/- 0)
```
|
|
Another SmallVec.extend optimization
This improves SmallVec.extend even more over #52859 while making the code easier to read.
Before
```
test small_vec::tests::fill_small_vec_1_10_with_cap ... bench: 31 ns/iter (+/- 5)
test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 70 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 36 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 256 ns/iter (+/- 17)
test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 31 ns/iter (+/- 5)
test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 26 ns/iter (+/- 1)
test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 49 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 219 ns/iter (+/- 11)
test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 32 ns/iter (+/- 2)
test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 61 ns/iter (+/- 12)
test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 37 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 210 ns/iter (+/- 10)
```
After:
```
test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 31 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 39 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 35 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 37 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 32 ns/iter (+/- 2)
test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 52 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 46 ns/iter (+/- 0)
test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 35 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 31 ns/iter (+/- 0)
test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 40 ns/iter (+/- 15)
test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 36 ns/iter (+/- 2)
```
|
|
This improves SmallVec.extend even more over #52859
Before (as of #52859):
```
test small_vec::tests::fill_small_vec_1_10_with_cap ... bench: 31 ns/iter (+/- 5)
test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 70 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 36 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 256 ns/iter (+/- 17)
test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 31 ns/iter (+/- 5)
test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 26 ns/iter (+/- 1)
test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 49 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 219 ns/iter (+/- 11)
test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 32 ns/iter (+/- 2)
test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 61 ns/iter (+/- 12)
test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 37 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 210 ns/iter (+/- 10)
```
After:
```
test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 31 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 39 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 35 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 37 ns/iter (+/- 3)
test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 32 ns/iter (+/- 2)
test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 52 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 46 ns/iter (+/- 0)
test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 35 ns/iter (+/- 4)
test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 31 ns/iter (+/- 0)
test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 40 ns/iter (+/- 15)
test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 36 ns/iter (+/- 2)
```
|
|
Use Vec::extend in SmallVec::extend when applicable
As measured in #52738, `Vec::extend` is much faster than `push`ing in a loop. We can take advantage of this method in `SmallVec` too - at least in cases when its underlying object is an `AccumulateVec::Heap`.
~~This approach also accidentally improves the `push` loop of the `AccumulateVec::Array` variant, because it doesn't utilize `SmallVec::push` which performs `self.reserve(1)` with every iteration; this is unnecessary, because we're already reserving the whole space we will be needing by performing `self.reserve(iter.size_hint().0)` at the beginning.~~
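A minimal sketch of the reservation idea, using a hypothetical `TinyVec` rather than the actual `SmallVec` type:

```rust
// Hypothetical small-vector wrapper (not the rustc SmallVec): reserve
// the iterator's lower size_hint once, then delegate to Vec::extend,
// instead of a per-element `push` that calls `reserve(1)` each time.
struct TinyVec {
    buf: Vec<u32>,
}

impl TinyVec {
    fn extend_fast(&mut self, iter: impl Iterator<Item = u32>) {
        // One up-front reservation based on the lower bound...
        self.buf.reserve(iter.size_hint().0);
        // ...then a single bulk extend.
        self.buf.extend(iter);
    }
}
```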
|
|
|
Simplify a few functions in rustc_data_structures
- drop `try!()` where it's superfluous
- change `try!()` to `?`
- squash a `push` with `push_str`
- refactor a push loop into an iterator
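The "push loop into an iterator" refactor looks roughly like this (illustrative data, not the actual rustc function):

```rust
// Before: build the Vec imperatively with push.
fn squares_loop(n: u32) -> Vec<u32> {
    let mut v = Vec::new();
    for i in 0..n {
        v.push(i * i);
    }
    v
}

// After: the iterator form is shorter and lets collect() size the Vec
// from the range's exact size_hint.
fn squares_iter(n: u32) -> Vec<u32> {
    (0..n).map(|i| i * i).collect()
}
```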
|
|
|
Rollup of bare_trait_objects PRs
All deny attributes were moved into bootstrap so they can be disabled with a line of config.
Warnings for external tools are allowed, and it's up to each tool's maintainer to keep it warning-free.
r? @Mark-Simulacrum
cc @ljedrz @kennytm
|
|
This way, we can iterate over a `Range<T>` where `T: Idx`
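A sketch of the idea, with a hypothetical `Idx`-like trait and index newtype (the real rustc trait differs): since `Range<T>` only iterates when `T: Step`, index ranges can be iterated by converting through `usize`:

```rust
// Hypothetical minimal Idx trait: a newtype index convertible to/from
// usize.
trait Idx: Copy {
    fn new(v: usize) -> Self;
    fn index(self) -> usize;
}

#[derive(Copy, Clone, PartialEq, Debug)]
struct RegionVid(usize);

impl Idx for RegionVid {
    fn new(v: usize) -> Self { RegionVid(v) }
    fn index(self) -> usize { self.0 }
}

// Iterate a range of index newtypes by mapping a usize range back into
// the index type.
fn iter_range<T: Idx>(start: T, end: T) -> impl Iterator<Item = T> {
    (start.index()..end.index()).map(T::new)
}
```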
|
|
Speed up `SparseBitMatrix` use in `RegionValues`.
In practice, these matrices range from 10% to 90%+ full once they are
filled in, so the dense representation is better.
This reduces the runtime of Check Nll builds of `inflate` by 32%, and
several other benchmarks by 1--5%.
It also increases max-rss of `clap-rs` by 30% and a couple of others by
up to 5%, while decreasing max-rss of `coercions` by 14%. I think the
speed-ups justify the max-rss increases.
r? @nikomatsakis
|
|
Use `ptr::eq` for comparing pointers
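The distinction in a nutshell: `==` on references compares the pointed-to values, while `std::ptr::eq` compares addresses:

```rust
// Returns (values equal, same allocation) for two references.
fn value_and_identity(a: &String, b: &String) -> (bool, bool) {
    (a == b, std::ptr::eq(a, b))
}
```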
|
|
Using a `BTreeMap` to represent rows in the bit matrix is really slow.
This patch changes things so that each row is represented by a
`BitVector`. This is a less sparse representation, but a much faster
one.
As a result, `SparseBitSet` and `SparseChunk` can be removed.
Other minor changes in this patch:
- It renames `BitVector::insert()` as `merge()`, which matches the
terminology in the other classes in bitvec.rs.
- It removes `SparseBitMatrix::is_subset()`, which is unused.
- It reinstates `RegionValueElements::num_elements()`, which #52190 had
removed.
- It removes a low-value `debug!` call in `SparseBitMatrix::add()`.
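The dense-row idea can be sketched as follows (simplified; the real rustc `BitVector` differs): each row is a word-packed bit vector, and `merge` ORs another row in, reporting whether any bit changed:

```rust
// Word-packed bit row: bit i lives in words[i / 64] at position i % 64.
struct BitRow {
    words: Vec<u64>,
}

impl BitRow {
    fn new(bits: usize) -> Self {
        BitRow { words: vec![0; (bits + 63) / 64] }
    }

    // Returns true if the bit was newly set.
    fn insert(&mut self, bit: usize) -> bool {
        let (w, mask) = (bit / 64, 1u64 << (bit % 64));
        let changed = self.words[w] & mask == 0;
        self.words[w] |= mask;
        changed
    }

    // OR another row into this one; true if any bit changed. This is a
    // straight word-by-word loop, with no tree traversal.
    fn merge(&mut self, other: &BitRow) -> bool {
        let mut changed = false;
        for (a, b) in self.words.iter_mut().zip(&other.words) {
            let new = *a | b;
            changed |= new != *a;
            *a = new;
        }
        changed
    }
}
```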
|
|
Avoid most allocations in `Canonicalizer`.
Extra allocations are a significant cost of NLL, and the most common
ones come from within `Canonicalizer`. In particular, `canonical_var()`
contains this code:
```rust
indices
    .entry(kind)
    .or_insert_with(|| {
        let cvar1 = variables.push(info);
        let cvar2 = var_values.push(kind);
        assert_eq!(cvar1, cvar2);
        cvar1
    })
    .clone()
```
`variables` and `var_values` are `Vec`s. `indices` is a `HashMap` used
to track what elements have been inserted into `var_values`. If `kind`
hasn't been seen before, `indices`, `variables` and `var_values` all get
a new element. (The number of elements in each container is always the
same.) This results in lots of allocations.
In practice, most of the time these containers only end up holding a few
elements. This PR changes them to avoid heap allocations in the common
case, by changing the `Vec`s to `SmallVec`s and only using `indices`
once enough elements are present. (When the number of elements is small,
a direct linear search of `var_values` is as good or better than a
hashmap lookup.)
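The small-size strategy can be sketched like this (illustrative code, not the `Canonicalizer` itself; the cutoff value is a made-up placeholder):

```rust
use std::collections::HashMap;

// Hypothetical threshold: below it, a linear scan of `values` beats a
// hash lookup; above it, we build the map lazily.
const CUTOFF: usize = 8;

// Returns the index of `kind` in `values`, inserting it if absent.
fn find_or_insert(
    values: &mut Vec<u32>,
    indices: &mut Option<HashMap<u32, usize>>,
    kind: u32,
) -> usize {
    // Once the map exists, use it.
    if let Some(map) = indices {
        return *map.entry(kind).or_insert_with(|| {
            values.push(kind);
            values.len() - 1
        });
    }
    // Small case: linear search, no heap-allocated map.
    if let Some(i) = values.iter().position(|&v| v == kind) {
        return i;
    }
    values.push(kind);
    if values.len() > CUTOFF {
        // Build the map only once the linear scan stops paying off.
        *indices = Some(values.iter().enumerate().map(|(i, &v)| (v, i)).collect());
    }
    values.len() - 1
}
```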
The changes to `variables` are straightforward and contained within
`Canonicalizer`. The changes to `indices` are more complex but also
contained within `Canonicalizer`. The changes to `var_values` are more
intrusive because they require defining a new type
`SmallCanonicalVarValues` -- which is to `CanonicalVarValues` as
`SmallVec` is to `Vec` -- and passing stack-allocated values of that type
in from outside.
All this speeds up a number of NLL "check" builds, the best by 2%.
r? @nikomatsakis
|
|
Rollup of 9 pull requests
Successful merges:
- #52286 (Deny bare trait objects in src/librustc_errors)
- #52306 (Reduce the number of clone()s needed in obligation_forest)
- #52338 (update miri)
- #52385 (Pass edition flags to compiler from rustdoc as expected)
- #52392 (AsRef doc wording tweaks)
- #52430 (update nomicon)
- #52434 (Enable incremental independent of stage)
- #52435 (Calculate the exact capacity for 2 HashMaps)
- #52446 (Block beta if clippy breaks.)
r? @ghost
|
|
html5ever in the rustc-perf repository is memory-intensive
Part of #52028. Rebased atop of #51987.
r? @nikomatsakis
|
|
Reduce the number of clone()s needed in obligation_forest
Some can be avoided by using `remove_entry` instead of `remove`.
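The `remove_entry` trick: it returns the stored key along with the value, so the key can be reused instead of cloned before removal:

```rust
use std::collections::HashMap;

// Instead of `let k = key.clone(); map.remove(&k);`, get the owned key
// back from the map itself.
fn take_entry(map: &mut HashMap<String, u32>, key: &str) -> Option<(String, u32)> {
    map.remove_entry(key)
}
```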
|
|
`BitSlice` fixes
`propagate_bits_into_entry_set_for` and `BitSlice::bitwise` are hot for some benchmarks under NLL. I tried and failed to speed them up. (Increasing the size of `bit_slice::Word` from `usize` to `u128` caused a slowdown, even though decreasing the size of `bitvec::Word` from `u128` to `u64` also caused a slowdown. Weird.)
Anyway, along the way I fixed up several problems in and around the `BitSlice` code.
r? @nikomatsakis
|
|
Also modify `SparseBitMatrix` so that it does not require knowing the
dimensions in advance, but instead grows on demand.
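Grow-on-demand rows can be sketched like this (illustrative types, not the actual `SparseBitMatrix`): a row is allocated only when it is first written, so the matrix never needs its dimensions up front:

```rust
// Rows are Option so untouched rows cost one pointer-sized slot, not a
// full allocation.
struct Matrix {
    rows: Vec<Option<Vec<u64>>>,
    row_words: usize,
}

impl Matrix {
    // Grow the row list and materialize the row on first write.
    fn ensure_row(&mut self, r: usize) -> &mut Vec<u64> {
        if r >= self.rows.len() {
            self.rows.resize(r + 1, None);
        }
        self.rows[r].get_or_insert_with(|| vec![0; self.row_words])
    }
}
```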
|
|
Extra allocations are a significant cost of NLL, and the most common
ones come from within `Canonicalizer`. In particular, `canonical_var()`
contains this code:
```rust
indices
    .entry(kind)
    .or_insert_with(|| {
        let cvar1 = variables.push(info);
        let cvar2 = var_values.push(kind);
        assert_eq!(cvar1, cvar2);
        cvar1
    })
    .clone()
```
`variables` and `var_values` are `Vec`s. `indices` is a `HashMap` used
to track what elements have been inserted into `var_values`. If `kind`
hasn't been seen before, `indices`, `variables` and `var_values` all get
a new element. (The number of elements in each container is always the
same.) This results in lots of allocations.
In practice, most of the time these containers only end up holding a few
elements. This PR changes them to avoid heap allocations in the common
case, by changing the `Vec`s to `SmallVec`s and only using `indices`
once enough elements are present. (When the number of elements is small,
a direct linear search of `var_values` is as good or better than a
hashmap lookup.)
The changes to `variables` are straightforward and contained within
`Canonicalizer`. The changes to `indices` are more complex but also
contained within `Canonicalizer`. The changes to `var_values` are more
intrusive because they require defining a new type
`SmallCanonicalVarValues` -- which is to `CanonicalVarValues` as
`SmallVec` is to `Vec` -- and passing stack-allocated values of that type
in from outside.
All this speeds up a number of NLL "check" builds, the best by 2%.
|