| Age | Commit message (Collapse) | Author | Lines |
|
|
|
Use a `ParamEnvAnd<Predicate>` for caching in `ObligationForest`
Previously, we used a plain `Predicate` to cache results (e.g. successes
and failures) in ObligationForest. However, fulfillment depends on the
precise `ParamEnv` used, so this is unsound in general.
This commit changes the impl of `ForestObligation` for
`PendingPredicateObligation` to use `ParamEnvAnd<Predicate>` instead of
`Predicate` for the associated type. The associated type and method are
renamed from 'predicate' to 'cache_key' to reflect the fact that type is
no longer just a predicate.
|
|
Construct query job latches on-demand
r? @michaelwoerister
|
|
|
|
Speed up `SipHasher128`.
The current code in `SipHasher128::short_write` is inefficient. It uses
`u8to64_le` (which is complex and slow) to extract just the right number of
bytes of the input into a u64 and pad the result with zeroes. It then
left-shifts that value in order to bitwise-OR it with `self.tail`.
For example, imagine we have a u32 input `0xIIHH_GGFF` and only need three bytes
to fill up `self.tail`. The current code uses `u8to64_le` to construct
`0x0000_0000_00HH_GGFF`, which is just `0xIIHH_GGFF` with the `0xII` removed and
zero-extended to a u64. The code then left-shifts that value by five bytes --
discarding the `0x00` byte that replaced the `0xII` byte! -- to give
`0xHHGG_FF00_0000_0000`. It then then ORs that value with `self.tail`.
There's a much simpler way to do it: zero-extend to u64 first, then left shift.
E.g. `0xIIHH_GGFF` is zero-extended to `0x0000_0000_IIHH_GGFF`, and then
left-shifted to `0xHHGG_FF00_0000_0000`. We don't have to take time to exclude
the unneeded `0xII` byte, because it just gets shifted out anyway! It also avoids
multiple occurrences of `unsafe`.
There's a similar story with the setting of `self.tail` at the method's end.
The current code uses `u8to64_le` to extract the remaining part of the input,
but the same effect can be achieved more quickly with a right shift on the
zero-extended input.
This commit changes `SipHasher128` to use the simpler shift-based approach. The
code is also smaller, which means that `short_write` is now inlined where
previously it wasn't, which makes things faster again. This gives big
speed-ups for all incremental builds, especially "baseline" incremental
builds.
r? @michaelwoerister
|
|
|
|
This makes it faster and also changes it to a safe function. (Thanks to
Michael Woerister for the suggestion.) `load_int_le!` is also no longer
necessary.
|
|
|
|
Remove unused feature gates
I think many of the remaining unstable things can be easily be replaced with stable things. I have kept the `#![feature(nll)]` even though it is only necessary in `libstd`, to make regressions of it harder.
|
|
|
|
The current code in `SipHasher128::short_write` is inefficient. It uses
`u8to64_le` (which is complex and slow) to extract just the right number of
bytes of the input into a u64 and pad the result with zeroes. It then
left-shifts that value in order to bitwise-OR it with `self.tail`.
For example, imagine we have a u32 input 0xIIHH_GGFF and only need three bytes
to fill up `self.tail`. The current code uses `u8to64_le` to construct
0x0000_0000_00HH_GGFF, which is just 0xIIHH_GGFF with the 0xII removed and
zero-extended to a u64. The code then left-shifts that value by five bytes --
discarding the 0x00 byte that replaced the 0xII byte! -- to give
0xHHGG_FF00_0000_0000. It then then ORs that value with self.tail.
There's a much simpler way to do it: zero-extend to u64 first, then left shift.
E.g. 0xIIHH_GGFF is zero-extended to 0x0000_0000_IIHH_GGFF, and then
left-shifted to 0xHHGG_FF00_0000_0000. We don't have to take time to exclude
the unneeded 0xII byte, because it just gets shifted out anyway! It also avoids
multiple occurrences of `unsafe`.
There's a similar story with the setting of `self.tail` at the method's end.
The current code uses `u8to64_le` to extract the remaining part of the input,
but the same effect can be achieved more quickly with a right shift on the
zero-extended input.
All that works on little-endian. It doesn't work for big-endian, but we
can just do a `to_le` before calling `short_write` and then it works.
This commit changes `SipHasher128` to use the simpler shift-based approach. The
code is also smaller, which means that `short_write` is now inlined where
previously it wasn't, which makes things faster again. This gives big
speed-ups for all incremental builds, especially "baseline" incremental
builds.
|
|
Generator Resume Arguments
cc https://github.com/rust-lang/rust/issues/43122 and https://github.com/rust-lang/rust/issues/56974
Blockers:
* [x] Fix miscompilation when resume argument is live across a yield point (https://github.com/rust-lang/rust/pull/68524#issuecomment-578459069)
* [x] Fix 10% compile time regression in `await-call-tree` benchmarks (https://github.com/rust-lang/rust/pull/68524#issuecomment-578487162)
* [x] Fix remaining 1-3% regression (https://github.com/rust-lang/rust/pull/68524#issuecomment-579566255) - resolved (https://github.com/rust-lang/rust/pull/68524#issuecomment-581144901)
* [x] Make dropck rules account for resume arguments (https://github.com/rust-lang/rust/pull/68524#issuecomment-578541137)
Follow-up work:
* Change async/await desugaring to make use of this feature
* Rewrite [`box_region.rs`](https://github.com/rust-lang/rust/blob/3d8778d767f0dde6fe2bc9459f21ead8e124d8cb/src/librustc_data_structures/box_region.rs) to use resume arguments (this shows up in profiles too)
|
|
It's not needed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Previously, we used a plain `Predicate` to cache results (e.g. successes
and failures) in ObligationForest. However, fulfillment depends on the
precise `ParamEnv` used, so this is unsound in general.
This commit changes the impl of `ForestObligation` for
`PendingPredicateObligation` to use `ParamEnvAnd<Predicate>` instead of
`Predicate` for the associated type. The associated type and method are
renamed from 'predicate' to 'cache_key' to reflect the fact that type is
no longer just a predicate.
|
|
[self-profiler] Add example to `-Z help` to turn on query key recording
Also add the `default` option so that it's easy to add query key
recording to the default.
r? @michaelwoerister
|
|
Implement Lift using interners instead of in_arena
r? @eddyb
cc @cjgillot
|
|
Also add the `default` option so that it's easy to add query key
recording to the default.
|
|
use winapi for non-stdlib Windows bindings
|
|
Galloping search for binary_search_util
This is unlikely to improve perf much unless for synthetic benchmarks, but I figure it likely won't hurt either.
|
|
|
|
|
|
|
|
Add some missing timers
Based on https://github.com/rust-lang/rust/pull/67988
r? @wesleywiser
|
|
|
|
platforms.
|
|
|
|
|
|
|
|
query-invocation-specific event_ids.
|
|
rustc_ast_lowering: misc cleanup & rustc dep reductions
- The first two commits do some code simplification.
- The next three do some file splitting (getting `lib.rs` below the 3kloc tidy lint).
- The remaining commits reduce the number of `rustc::` imports. This works towards making lowering independent of the `rustc` crate.
r? @oli-obk cc @Zoxc
|
|
|
|
|
|
|
|
|
|
remove bespoke flock bindings
Replaces some `struct flock` definitions with the definition from `libc`.
|
|
|
|
|
|
|
|
Because it caused major performance regressions in some cases.
That PR had five commits, two of which affected performance, and three
of which were refactorings. This change undoes the performance-affecting
changes, while keeping the refactorings in place.
Fixes #67454.
|
|
Set callbacks globally
This sets the callbacks from syntax and rustc_errors just once, utilizing static (rather than thread-local) storage.
|
|
|
|
The callbacks have precisely two states: the default, and the one
present throughout almost all of the rustc run (the filled in value
which has access to TyCtxt).
We used to store this as a thread local, and reset it on each thread to
the non-default value. But this is somewhat wasteful, since there is no
reason to set it globally -- while the callbacks themselves access TLS,
they do not do so in a manner that fails in when we do not have TLS to
work with.
|
|
|
|
This reverts commit 3ed3b8bb7b100afecf7d5f52eafbb70fec27f537, reversing
changes made to 99b89533d4cdf7682ea4054ad0ee36c351d05df1.
We will reland a similar patch at a future date but for now we should get a nightly
released in a few hours with the parallel patch, so this should be
reverted to make sure that the next nightly is not parallel-enabled.
|