| Age | Commit message (Collapse) | Author | Lines |
|
Some can be avoided by using remove_entry instead of remove.
|
|
nll experiment: compute SCCs instead of iterative region solving
This is an attempt to speed up region solving by replacing the current iterative dataflow with a SCC computation. The idea is to detect cycles (SCCs) amongst region constraints and then compute just one value per cycle. The graph with all cycles removed is of course a DAG, so we can then solve constraints "bottom up" once the liveness values are known.
I kinda ran out of time this morning so the last commit is a bit sloppy but I wanted to get this posted, let travis run on it, and maybe do a perf run, before I clean it up.
|
|
|
|
|
|
|
|
|
|
|
|
The strategy is this:
- we compute SCCs once all outlives constraints are known
- we allocate a set of values **per region** for storing liveness
- we allocate a set of values **per SCC** for storing the final values
- when we add a liveness constraint to the region R, we also add it
to the final value of the SCC to which R belongs
- then we can apply the constraints by just walking the DAG for the
SCCs and union'ing the children (which have their liveness
constraints within)
There are a few intermediate refactorings that I really ought to have
broken out into their own commits:
- reverse the constraint graph so that `R1: R2` means `R1 -> R2` and
not `R2 -> R1`. This fits better with the SCC computation and new
style of inference (`->` now means "take value from" and not "push
value into")
- this does affect some of the UI tests, since they traverse the
graph, but mostly the artificial ones and they don't necessarily
seem worse
- put some things (constraint set, etc) into `Rc`. This lets us root
them to permit mutation and iteration. It also guarantees they don't
change, which is critical to the correctness of the algorithm.
- Generalize various helpers that previously operated only on points
to work on any sort of region element.
|
|
In multiple ways:
- Two calls to `bits_to_string()` passed in byte lengths rather than bit
lengths, which meant only 1/8th of the `BitSlice` was printed.
- `bit_str`'s purpose is entirely mysterious. I removed it and changed
its callers to print the indices in the obvious way.
- `bits_to_string`'s inner loop was totally wrong, such that it printed
entirely bogus results.
- `bits_to_string` now also adds a '|' between words, which makes the
output easier to read, e.g.:
`[ff-ff-ff-ff-ff-ff-ff-ff|ff-ff-ff-ff-ff-ff-ff-07]`.
|
|
Currently `Word` is `usize`, and there are various places in the code
that assume this.
This patch mostly just changes `usize` occurrences to `Word`. Most of
the changes were found as compile errors when I changed `Word` to a type
other than `usize`, but there was one non-obvious case in
librustc_mir/dataflow/mod.rs that caused bounds check failures before I
fixed it.
|
|
|
|
You should always be able to know that any `T` where `T: Idx`
can be used in a `BTreeMap` and a `FxHashMap`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The current situation is something of a mess.
- `IdxSetBuf` derefs to `IdxSet`.
- `IdxSetBuf` implements `Clone`, and therefore has a provided `clone_from`
method, which does allocation and so is expensive.
- `IdxSet` has a `clone_from` method that is non-allocating and therefore
cheap, but this method is not from the `Clone` trait.
As a result, if you have an `IdxSetBuf` called `b`, if you call
`b.clone_from(b2)` you'll get the expensive `IdxSetBuf` method, but if you call
`(*b).clone_from(b2)` you'll get the cheap `IdxSetBuf` method.
`liveness_of_locals()` does the former, presumably unintentionally, and
therefore does lots of unnecessary allocations.
Having a `clone_from` method that isn't from the `Clone` trait is a bad idea in
general, so this patch renames it as `overwrite`. This avoids the unnecessary
allocations in `liveness_of_locals()`, speeding up most NLL benchmarks, the
best by 1.5%. It also means that calls of the form `(*b).clone_from(b2)` can be
rewritten as `b.overwrite(b2)`.
|
|
Obligation forest cleanup
While looking at this code I was scratching my head about whether a node could appear in both `parent` and `dependents`. Turns out it can, but it's not useful to do so, so this PR cleans things up so it's no longer possible.
|
|
|
|
Improve memoization and refactor NLL type check
I have a big branch that is refactoring NLL type check with the goal of introducing canonicalization-based memoization for all of the operations it does. This PR contains an initial prefix of that branch which, I believe, stands alone. It does introduce a few smaller optimizations of its own:
- Skip operations that are trivially a no-op
- Cache the results of the dropck-outlives computations done by liveness
- Skip resetting unifications if nothing changed
r? @pnkfelix
|
|
This patch:
- Reorders things a bit so that `parent` is always handled before
`dependents`.
- Uses iterator chaining to avoid some code duplication.
|
|
This patch makes it impossible for a node to end up in both
`node.parent` and `node.dependents`.
|
|
Speed up obligation forest code
Here are the rustc-perf benchmarks that get at least a 1% speedup on one or more of their runs with these patches applied:
```
inflate-check
avg: -8.7% min: -12.1% max: 0.0%
inflate
avg: -5.9% min: -8.6% max: 1.1%
inflate-opt
avg: -1.5% min: -2.0% max: -0.3%
clap-rs-check
avg: -0.6% min: -1.9% max: 0.5%
coercions
avg: -0.2%? min: -1.3%? max: 0.6%?
serde-opt
avg: -0.6% min: -1.0% max: 0.1%
coercions-check
avg: -0.4%? min: -1.0%? max: -0.0%?
```
|
|
We used to accumulate a vector of type-check constraints, but now we
generate them as we go, and just store the NLL-style
`OutlivesConstraint` and `TypeTest` information.
|
|
Avoid useless Vec clones in pending_obligations().
The only instance of `ObligationForest` in use has an obligation type of
`PendingPredicateObligation`, which contains a `PredicateObligation` and a
`Vec<Ty>`.
`FulfillmentContext::pending_obligations()` calls
`ObligationForest::pending_obligations()`, which clones all the
`PendingPredicateObligation`s. But the `Vec<Ty>` field of those cloned
obligations is never touched.
This patch changes `ObligationForest::pending_obligations()` to
`map_pending_obligations` -- which gives callers control about which part
of the obligation to clone -- and takes advantage of the change to avoid
cloning the `Vec<Ty>`. The change speeds up runs of a few rustc-perf
benchmarks, the best by 1%.
|
|
The only instance of `ObligationForest` in use has an obligation type of
`PendingPredicateObligation`, which contains a `PredicateObligation` and a
`Vec<Ty>`.
`FulfillmentContext::pending_obligations()` calls
`ObligationForest::pending_obligations()`, which clones all the
`PendingPredicateObligation`s. But the `Vec<Ty>` field of those cloned
obligations is never touched.
This patch changes `ObligationForest::pending_obligations()` to
`map_pending_obligations` -- which gives callers control about which part
of the obligation to clone -- and takes advantage of the change to avoid
cloning the `Vec<Ty>`. The change speeds up runs of a few rustc-perf
benchmarks, the best by 1%.
|
|
A tri-valued enum is nicer than Result<Option<T>>, and it's slightly
faster.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Because they are small and hot.
|
|
It's never used in a meaningful way.
|
|
|
|
stabilize RangeBounds collections_range #30877
The FCP for #30877 closed last month, with the decision to:
1. move from `collections::range::RangeArgument` to `ops::RangeBounds`, and
2. rename `start()` and `end()` to `start_bounds()` and `end_bounds()`.
Simon Sapin already moved it to `ops::RangeBounds` in #49163.
I renamed the functions, and removed the old `collections::range::RangeArgument` alias.
This is my first Rust PR, please let me know if I can improve anything. This passes all tests for me, except the `clippy` tool (which uses `RangeArgument::start()`).
I considered deprecating `start()` and `end()` instead of removing them, but the contribution guidelines indicate we can break `clippy` temporarily. I thought it was best to remove the functions, since we're worried about name collisions with `Range::start` and `end`.
Closes #30877.
|
|
implement the chalk-engine traits
Preliminary implementation for the Chalk traits in rustc. Lots of `panic!()` placeholders to be filled in later.
This is currently blocked on us landing https://github.com/rust-lang-nursery/chalk/pull/131 in chalk and issuing a new release, which should occur later today.
r? @scalexm
cc @leodasvacas
|
|
|
|
rename RangeBounds::start() -> start_bound()
rename RangeBounds::end() -> end_bound()
|
|
|
|
|
|
|
|
order which lookup relies on.
|
|
|