about summary refs log tree commit diff
path: root/src/librustc_data_structures
AgeCommit message (Collapse)AuthorLines
2018-08-01Rollup merge of #52942 - llogiq:smallvec-opt, r=Mark-SimulacrumPietro Albini-12/+5
Another SmallVec.extend optimization This improves SmallVec.extend even more over #52859 while making the code easier to read. Before ``` test small_vec::tests::fill_small_vec_1_10_with_cap ... bench: 31 ns/iter (+/- 5) test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 70 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 36 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 256 ns/iter (+/- 17) test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 31 ns/iter (+/- 5) test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 26 ns/iter (+/- 1) test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 49 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 219 ns/iter (+/- 11) test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 32 ns/iter (+/- 2) test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 61 ns/iter (+/- 12) test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 37 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 210 ns/iter (+/- 10) ``` After: ``` test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 31 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 39 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 35 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 37 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 32 ns/iter (+/- 2) test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 52 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 46 ns/iter (+/- 0) test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 35 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 31 ns/iter (+/- 0) test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 40 ns/iter (+/- 15) test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 36 ns/iter (+/- 2) ```
2018-08-01Split out growth functionality into BitVector typeMark Rousskov-59/+77
2018-08-01Another SmallVec.extend optimizationAndre Bogus-12/+5
This improves SmallVec.extend even more over #52859 Before (as of #52859): ``` test small_vec::tests::fill_small_vec_1_10_with_cap ... bench: 31 ns/iter (+/- 5) test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 70 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 36 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 256 ns/iter (+/- 17) test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 31 ns/iter (+/- 5) test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 26 ns/iter (+/- 1) test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 49 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 219 ns/iter (+/- 11) test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 32 ns/iter (+/- 2) test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 61 ns/iter (+/- 12) test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 37 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 210 ns/iter (+/- 10) ``` After: ``` test small_vec::tests::fill_small_vec_1_10_wo_cap ... bench: 31 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_1_50_with_cap ... bench: 39 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_1_50_wo_cap ... bench: 35 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_32_10_with_cap ... bench: 37 ns/iter (+/- 3) test small_vec::tests::fill_small_vec_32_10_wo_cap ... bench: 32 ns/iter (+/- 2) test small_vec::tests::fill_small_vec_32_50_with_cap ... bench: 52 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_32_50_wo_cap ... bench: 46 ns/iter (+/- 0) test small_vec::tests::fill_small_vec_8_10_with_cap ... bench: 35 ns/iter (+/- 4) test small_vec::tests::fill_small_vec_8_10_wo_cap ... bench: 31 ns/iter (+/- 0) test small_vec::tests::fill_small_vec_8_50_with_cap ... bench: 40 ns/iter (+/- 15) test small_vec::tests::fill_small_vec_8_50_wo_cap ... bench: 36 ns/iter (+/- 2) ```
2018-08-01Rollup merge of #52859 - ljedrz:smallvec_true_extend, r=Mark-SimulacrumPietro Albini-4/+128
Use Vec::extend in SmallVec::extend when applicable As calculated in #52738, `Vec::extend` is much faster than `push`ing to it in a loop. We can take advantage of this method in `SmallVec` too - at least in cases when its underlying object is an `AccumulateVec::Heap`. ~~This approach also accidentally improves the `push` loop of the `AccumulateVec::Array` variant, because it doesn't utilize `SmallVec::push` which performs `self.reserve(1)` with every iteration; this is unnecessary, because we're already reserving the whole space we will be needing by performing `self.reserve(iter.size_hint().0)` at the beginning.~~
2018-07-31Benchmarks for SmallVecljedrz-0/+116
2018-07-30Use Vec::extend in SmallVec::extend when applicableljedrz-4/+12
2018-07-30Auto merge of #52697 - ljedrz:misc_data_structures, r=Mark-Simulacrumbors-11/+5
Simplify a few functions in rustc_data_structures - drop `try!()` where it's superfluous - change `try!()` to `?` - squash a `push` with `push_str` - refactor a push loop into an iterator
2018-07-29Remove unused `mut`sMatthew Jasper-1/+1
2018-07-27Auto merge of #52336 - ishitatsuyuki:dyn-rollup, r=Mark-Simulacrumbors-2/+0
Rollup of bare_trait_objects PRs All deny attributes were moved into bootstrap so they can be disabled with a line of config. Warnings for external tools are allowed and it's up to the tool's maintainer to keep it warnings free. r? @Mark-Simulacrum cc @ljedrz @kennytm
2018-07-26fix `sparse_matrix_iter` unit testNiko Matsakis-1/+1
2018-07-26add type parameters to `BitMatrix` and `SparseBitMatrix` unit testsNiko Matsakis-3/+3
2018-07-26convert tests of `BitVector` to use `BitVector<usize>`Niko Matsakis-5/+5
2018-07-25SparseBitMatrix: add `insert_all` and `add_all` methodsNiko Matsakis-0/+13
2018-07-25SparseBitMatrix: add `ensure_row` helper fnNiko Matsakis-9/+9
2018-07-25split into two matricesNiko Matsakis-0/+61
2018-07-25parameterize `BitVector` and `BitMatrix` by their index typesNiko Matsakis-46/+63
2018-07-25Deny bare_trait_objects globallyTatsuyuki Ishi-2/+0
2018-07-25implement `Step` for `Idx` typesNiko Matsakis-0/+29
This way, we can iterate over a `Range<T>` where `T: Idx`
2018-07-24Simplify a few functions in rustc_data_structuresljedrz-11/+5
2018-07-22Auto merge of #52250 - nnethercote:no-SparseBitMatrix, r=nikomatsakisbors-208/+28
Speed up `SparseBitMatrix` use in `RegionValues`. In practice, these matrices range from 10% to 90%+ full once they are filled in, so the dense representation is better. This reduces the runtime of Check Nll builds of `inflate` by 32%, and several other benchmarks by 1--5%. It also increases max-rss of `clap-rs` by 30% and a couple of others by up to 5%, while decreasing max-rss of `coercions` by 14%. I think the speed-ups justify the max-rss increases. r? @nikomatsakis
2018-07-20data_structures: Add a reference wrapper for pointer-indexed maps/setsVadim Petrochenkov-10/+56
Use `ptr::eq` for comparing pointers
2018-07-20Speed up `SparseBitMatrix`.Nicholas Nethercote-208/+28
Using a `BTreeMap` to represent rows in the bit matrix is really slow. This patch changes things so that each row is represented by a `BitVector`. This is a less sparse representation, but a much faster one. As a result, `SparseBitSet` and `SparseChunk` can be removed. Other minor changes in this patch. - It renames `BitVector::insert()` as `merge()`, which matches the terminology in the other classes in bitvec.rs. - It removes `SparseBitMatrix::is_subset()`, which is unused. - It reinstates `RegionValueElements::num_elements()`, which #52190 had removed. - It removes a low-value `debug!` call in `SparseBitMatrix::add()`.
2018-07-18Auto merge of #52342 - nnethercote:CanonicalVar, r=nikomatsakisbors-0/+11
Avoid most allocations in `Canonicalizer`. Extra allocations are a significant cost of NLL, and the most common ones come from within `Canonicalizer`. In particular, `canonical_var()` contains this code: indices .entry(kind) .or_insert_with(|| { let cvar1 = variables.push(info); let cvar2 = var_values.push(kind); assert_eq!(cvar1, cvar2); cvar1 }) .clone() `variables` and `var_values` are `Vec`s. `indices` is a `HashMap` used to track what elements have been inserted into `var_values`. If `kind` hasn't been seen before, `indices`, `variables` and `var_values` all get a new element. (The number of elements in each container is always the same.) This results in lots of allocations. In practice, most of the time these containers only end up holding a few elements. This PR changes them to avoid heap allocations in the common case, by changing the `Vec`s to `SmallVec`s and only using `indices` once enough elements are present. (When the number of elements is small, a direct linear search of `var_values` is as good or better than a hashmap lookup.) The changes to `variables` are straightforward and contained within `Canonicalizer`. The changes to `indices` are more complex but also contained within `Canonicalizer`. The changes to `var_values` are more intrusive because they require defining a new type `SmallCanonicalVarValues` -- which is to `CanonicalVarValues` as `SmallVec` is to `Vec -- and passing stack-allocated values of that type in from outside. All this speeds up a number of NLL "check" builds, the best by 2%. r? @nikomatsakis
2018-07-17Auto merge of #52433 - kennytm:rollup, r=kennytmbors-3/+8
Rollup of 9 pull requests Successful merges: - #52286 (Deny bare trait objects in src/librustc_errors) - #52306 (Reduce the number of clone()s needed in obligation_forest) - #52338 (update miri) - #52385 (Pass edition flags to compiler from rustdoc as expected) - #52392 (AsRef doc wording tweaks) - #52430 (update nomicon) - #52434 (Enable incremental independent of stage) - #52435 (Calculate the exact capacity for 2 HashMaps) - #52446 (Block beta if clippy breaks.) r? @ghost
2018-07-17Auto merge of #52190 - davidtwco:issue-52028, r=nikomatsakisbors-9/+66
html5ever in the rustc-perf repository is memory-intensive Part of #52028. Rebased atop of #51987. r? @nikomatsakis
2018-07-17Rollup merge of #52306 - ljedrz:obligation_forest_clone, r=varkorkennytm-3/+8
Reduce the number of clone()s needed in obligation_forest Some can be avoided by using `remove_entry` instead of `remove`.
2018-07-17Auto merge of #52335 - nnethercote:BitSlice-fixes, r=nikomatsakisbors-17/+11
`BitSlice` fixes `propagate_bits_into_entry_set_for` and `BitSlice::bitwise` are hot for some benchmarks under NLL. I tried and failed to speed them up. (Increasing the size of `bit_slice::Word` from `usize` to `u128` caused a slowdown, even though decreasing the size of `bitvec::Word` from `u128` to `u64` also caused a slowdown. Weird.) Anyway, along the way I fixed up several problems in and around the `BitSlice` code. r? @nikomatsakis
2018-07-16Generate region values directly to reduce memory usage.David Wood-9/+66
Also modify `SparseBitMatrix` so that it does not require knowing the dimensions in advance, but instead grows on demand.
2018-07-17Avoid most allocations in `Canonicalizer`.Nicholas Nethercote-0/+11
Extra allocations are a significant cost of NLL, and the most common ones come from within `Canonicalizer`. In particular, `canonical_var()` contains this code: indices .entry(kind) .or_insert_with(|| { let cvar1 = variables.push(info); let cvar2 = var_values.push(kind); assert_eq!(cvar1, cvar2); cvar1 }) .clone() `variables` and `var_values` are `Vec`s. `indices` is a `HashMap` used to track what elements have been inserted into `var_values`. If `kind` hasn't been seen before, `indices`, `variables` and `var_values` all get a new element. (The number of elements in each container is always the same.) This results in lots of allocations. In practice, most of the time these containers only end up holding a few elements. This PR changes them to avoid heap allocations in the common case, by changing the `Vec`s to `SmallVec`s and only using `indices` once enough elements are present. (When the number of elements is small, a direct linear search of `var_values` is as good or better than a hashmap lookup.) The changes to `variables` are straightforward and contained within `Canonicalizer`. The changes to `indices` are more complex but also contained within `Canonicalizer`. The changes to `var_values` are more intrusive because they require defining a new type `SmallCanonicalVarValues` -- which is to `CanonicalVarValues` as `SmallVec` is to `Vec -- and passing stack-allocated values of that type in from outside. All this speeds up a number of NLL "check" builds, the best by 2%.
2018-07-14Reduce the number of clone()s needed in obligation_forestljedrz-3/+8
Some can be avoided by using remove_entry instead of remove.
2018-07-13Auto merge of #51987 - nikomatsakis:nll-region-infer-scc, r=pnkfelixbors-481/+1089
nll experiment: compute SCCs instead of iterative region solving This is an attempt to speed up region solving by replacing the current iterative dataflow with a SCC computation. The idea is to detect cycles (SCCs) amongst region constraints and then compute just one value per cycle. The graph with all cycles removed is of course a DAG, so we can then solve constraints "bottom up" once the liveness values are known. I kinda ran out of time this morning so the last commit is a bit sloppy but I wanted to get this posted, let travis run on it, and maybe do a perf run, before I clean it up.
2018-07-13nit: fix `all_sccs` commentNiko Matsakis-1/+1
2018-07-13nit: tweak comment orderNiko Matsakis-21/+23
2018-07-13nit: improve SCC commentsNiko Matsakis-4/+19
2018-07-13nit: clarify "keep it around" commentNiko Matsakis-2/+2
2018-07-13nit: s/successor/successors/Niko Matsakis-2/+2
2018-07-13compute region values using SCCs not iterative flowNiko Matsakis-0/+5
The strategy is this: - we compute SCCs once all outlives constraints are known - we allocate a set of values **per region** for storing liveness - we allocate a set of values **per SCC** for storing the final values - when we add a liveness constraint to the region R, we also add it to the final value of the SCC to which R belongs - then we can apply the constraints by just walking the DAG for the SCCs and union'ing the children (which have their liveness constraints within) There are a few intermediate refactorings that I really ought to have broken out into their own commits: - reverse the constraint graph so that `R1: R2` means `R1 -> R2` and not `R2 -> R1`. This fits better with the SCC computation and new style of inference (`->` now means "take value from" and not "push value into") - this does affect some of the UI tests, since they traverse the graph, but mostly the artificial ones and they don't necessarily seem worse - put some things (constraint set, etc) into `Rc`. This lets us root them to permit mutation and iteration. It also guarantees they don't change, which is critical to the correctness of the algorithm. - Generalize various helpers that previously operated only on points to work on any sort of region element.
2018-07-13Fix bitslice printing.Nicholas Nethercote-11/+5
In multiple ways: - Two calls to `bits_to_string()` passed in byte lengths rather than bit lengths, which meant only 1/8th of the `BitSlice` was printed. - `bit_str`'s purpose is entirely mysterious. I removed it and changed its callers to print the indices in the obvious way. - `bits_to_string`'s inner loop was totally wrong, such that it printed entirely bogus results. - `bits_to_string` now also adds a '|' between words, which makes the output easier to read, e.g.: `[ff-ff-ff-ff-ff-ff-ff-ff|ff-ff-ff-ff-ff-ff-ff-07]`.
2018-07-13Make BitSlice's `Word` properly generic.Nicholas Nethercote-7/+7
Currently `Word` is `usize`, and there are various places in the code that assume this. This patch mostly just changes `usize` occurrences to `Word`. Most of the changes were found as compile errors when I changed `Word` to a type other than `usize`, but there was one non-obvious case in librustc_mir/dataflow/mod.rs that caused bounds check failures before I fixed it.
2018-07-12introduce a generic SCC computationNiko Matsakis-3/+531
2018-07-12strengthen `Idx` to require `Ord + Hash`Niko Matsakis-1/+2
You should always be able to know that any `T` where `T: Idx` can be used in a `BTreeMap` and a `FxHashMap`.
2018-07-12rename `control_flow_graph` to `graph`Niko Matsakis-1/+1
2018-07-12rename `graph` to `control_flow_graph::implementation`Niko Matsakis-1/+1
2018-07-12deconstruct the `ControlFlowGraph` trait into more granular traitsNiko Matsakis-60/+117
2018-07-11add a missing `dyn`ljedrz-1/+1
2018-07-11Enforce #![deny(bare_trait_objects)] in src/librustc_data_structures testsljedrz-14/+14
2018-07-11Deny bare trait objects in in src/librustc_data_structuresljedrz-13/+15
2018-07-02improve commentsNiko Matsakis-0/+6
2018-07-01create a new `WorkQueue` data structureNiko Matsakis-0/+73
2018-06-29Rename `IdxSet::clone_from`.Nicholas Nethercote-1/+3
The current situation is something of a mess. - `IdxSetBuf` derefs to `IdxSet`. - `IdxSetBuf` implements `Clone`, and therefore has a provided `clone_from` method, which does allocation and so is expensive. - `IdxSet` has a `clone_from` method that is non-allocating and therefore cheap, but this method is not from the `Clone` trait. As a result, if you have an `IdxSetBuf` called `b`, if you call `b.clone_from(b2)` you'll get the expensive `IdxSetBuf` method, but if you call `(*b).clone_from(b2)` you'll get the cheap `IdxSetBuf` method. `liveness_of_locals()` does the former, presumably unintentionally, and therefore does lots of unnecessary allocations. Having a `clone_from` method that isn't from the `Clone` trait is a bad idea in general, so this patch renames it as `overwrite`. This avoids the unnecessary allocations in `liveness_of_locals()`, speeding up most NLL benchmarks, the best by 1.5%. It also means that calls of the form `(*b).clone_from(b2)` can be rewritten as `b.overwrite(b2)`.