| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
|
|
rustc_mir: Hide initial block state when defining transfer functions
This PR addresses [this FIXME](https://github.com/rust-lang/rust/blob/2887008e0ce0824be4e0e9562c22ea397b165c97/src/librustc_mir/dataflow/mod.rs#L594-L596).
This makes `sets.on_entry` inaccessible in `{before_,}{statement,terminator}_effect`. This field was meant to allow implementors of `BitDenotation` to access the initial state for each block (optionally with the effect of all previous statements applied via `accumulates_intrablock_state`) while defining transfer functions. However, the ability to set the initial value for the entry set of each basic block (except for START_BLOCK) no longer exists. As a result, this functionality is mostly useless, and when it *was* used it was used erroneously (see #62007).
Since `on_entry` is now useless, we can also remove `BlockSets`, which held the `gen`, `kill`, and `on_entry` bitvectors and replace it with a `GenKill` struct. Variables of this type are called `trans` since they represent a transfer function. `GenKill`s are stored contiguously in `AllSets`, which reduces the number of bounds checks and may improve cache performance: one is almost never accessed without the other.
Replacing `BlockSets` with `GenKill` allows us to define some new helper functions which streamline dataflow iteration and the dataflow-at-location APIs. Notably, `state_for_location` used a subtle side-effect of the `kill`/`kill_all` setters to apply the transfer function, and could be incorrect if a transfer function depended on effects of previous statements in the block on `gen_set`.
Additionally, this PR merges `BitSetOperator` and `InitialFlow` into one trait. Since the value of `InitialFlow` defines the semantics of the `join` operation, there's no reason to have seperate traits for each. We can add a default impl of `join` which branches based on `BOTTOM_VALUE`. This should get optimized away.
|
|
Since the value of `InitialFlow` defines the semantics of the `join`
operation, there's no reason to have seperate traits for each. We can
add a default impl of `join` which branches based on `BOTTOM_VALUE`.
This should get optimized away.
|
|
librustc_data_structures: Speedup union of sparse and dense hybrid set
This optimization speeds up the union of a hybrid bitset when that
switches it from a sparse representation to a dense bitset. It now
clones the dense bitset and integrate only the spare elements instead of
densifying the sparse bitset, initializing all elements, and then a
union on two dense bitset, touching all words a second time.
It's not completely certain if the added complexity is worth it but I would
like to hear some feedback in any case. Benchmark results from my machine:
```
Now: bit_set::union_hybrid_sparse_to_dense ... bench: 72 ns/iter (+/- 5)
Previous: bit_set::union_hybrid_sparse_to_dense ... bench: 90 ns/iter (+/- 6)
```
This being the second iteration of trying to improve the speed, since I missed the return value in the first, and forgot to run the relevant tests. Oops.
|
|
|
|
Explains the thought process behind adding the union algorithm and
discusses the alternative and heuristic behind.
|
|
This optimization speeds up the union of a hybrid bitset when that
switches it from a sparse representation to a dense bitset. It now
clones the dense bitset and integrate only the spare elements instead of
densifying the sparse bitset, initializing all elements, and then a
union on two dense bitset, touching all words a second time.
|
|
|
|
Closes #60271.
|
|
|
|
|
|
|
|
|
|
Currently, `BitSet` doesn't actually know its own domain size; it just
knows how many words it contains. To improve things, this commit makes
the following changes.
- It changes `BitSet` and `SparseBitSet` to store their own domain size,
and do more precise bounds and same-size checks with it. It also
changes the signature of `BitSet::to_string()` (and puts it within
`impl ToString`) now that the domain size need not be passed in from
outside.
- It uses `derive(RustcDecodable, RustcEncodable)` for `BitSet`. This
required adding code to handle `PhantomData` in `libserialize`.
- As a result, it removes the domain size from `HybridBitSet`, making a
lot of that code nicer.
- Both set_up_to() and clear_above() were overly general, working with
arbitrary sizes when they are only needed for the domain size. The
commit removes the former, degeneralizes the latter, and removes the
(overly general) tests.
- Changes `GrowableBitSet::grow()` to `ensure()`, fixing a bug where a
(1-based) domain size was confused with a (0-based) element index.
- Changes `BitMatrix` to store its row count, and do more precise bounds
checks with it.
- Changes `ty_params` in `select.rs` from a `BitSet` to a
`GrowableBitSet` because it repeatedly failed the new, more precise
bounds checks. (Changing the type was simpler than computing an
accurate domain size.)
- Various other minor improvements.
|
|
This requires adding a few extra methods to `HybridBitSet`. (These are
tested in a new unit test.)
This commit reduces the `max-rss` for `nll-check` builds of `html5ever`
by 46%, `ucd` by 45%, `clap-rs` by 23%, `inflate` by 14%. And the
results for the `unic-ucd-name` crate are even more impressive: a 21%
reduction in instructions, a 60% reduction in wall-time, a 96%
reduction in `max-rss`, and a 97% reduction in faults!
Fixes #52028.
|
|
`SparseBitSet` is the only remaining user of `ArrayVec`. This commit
switches it to using `SmallVec`, and removes `array_vec.rs`.
Why the switch? Although `SparseBitSet` is size-limited and doesn't need
the ability to spill to the heap, `SmallVec` has many more features than
`ArrayVec`. In particular, it's now possible to keep `SparseBitSet`'s
elements in sorted order, which gives in-order iteration, which is a
requirement for the next commit.
|
|
|
|
- Rename `BitSet::data` and `BitMatrix::vector` as `words`, because that's
what they are.
- Remove `BitSet::words_mut()`, which is no longer necessary.
- Better distinguish multiple meanins of "word", i.e. "word index" vs
"word ref" vs "word" (i.e. the value itself).
|
|
`BitwiseOperator` is an unnecessarily low-level thing. This commit
replaces it with `BitSetOperator`, which works on `BitSet`s instead of
words. Within `bit_set.rs`, the commit eliminates `Intersect`, `Union`,
and `Subtract` by instead passing a function to `bitwise()`.
|
|
Currently we have two files implementing bitsets (and 2D bit matrices).
This commit combines them into one, taking the best features from each.
This involves renaming a lot of things. The high level changes are as
follows.
- bitvec.rs --> bit_set.rs
- indexed_set.rs --> (removed)
- BitArray + IdxSet --> BitSet (merged, see below)
- BitVector --> GrowableBitSet
- {,Sparse,Hybrid}IdxSet --> {,Sparse,Hybrid}BitSet
- BitMatrix --> BitMatrix
- SparseBitMatrix --> SparseBitMatrix
The changes within the bitset types themselves are as follows.
```
OLD OLD NEW
BitArray<C> IdxSet<T> BitSet<T>
-------- ------ ------
grow - grow
new - (remove)
new_empty new_empty new_empty
new_filled new_filled new_filled
- to_hybrid to_hybrid
clear clear clear
set_up_to set_up_to set_up_to
clear_above - clear_above
count - count
contains(T) contains(&T) contains(T)
contains_all - superset
is_empty - is_empty
insert(T) add(&T) insert(T)
insert_all - insert_all()
remove(T) remove(&T) remove(T)
words words words
words_mut words_mut words_mut
- overwrite overwrite
merge union union
- subtract subtract
- intersect intersect
iter iter iter
```
In general, when choosing names I went with:
- names that are more obvious (e.g. `BitSet` over `IdxSet`).
- names that are more like the Rust libraries (e.g. `T` over `C`,
`insert` over `add`);
- names that are more set-like (e.g. `union` over `merge`, `superset`
over `contains_all`, `domain_size` over `num_bits`).
Also, using `T` for index arguments seems more sensible than `&T` --
even though the latter is standard in Rust collection types -- because
indices are always copyable. It also results in fewer `&` and `*`
sigils in practice.
|