| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
Add new self-profiling event to cheaply aggregate query cache hit counts
Self-profile can record various types of things, some of them are not enabled, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which records the thread ID, current timestamp, and constructs an individual event for each such cache hit. This is incredibly expensive, in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`).
We'd like to add query cache hits to rustc-perf (https://github.com/rust-lang/rustc-perf/pull/2168), but there we only need the actualy cache hit counts, not the timestamp/thread ID metadata associated with it.
This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion as how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was ~10% increase in this case.
With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self profiling is enabled.
We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up, it's not required to be done in sync with this PR, and it doesn't require changes in rustc.
CC `@cjgillot`
r? `@oli-obk`
|
|
|
|
|
|
|
|
|
|
Signed-off-by: onur-ozkan <work@onurozkan.dev>
|
|
At [1] it was pointed out that `cfg_match!` syntax does not actually
align well with match syntax, which is a possible source of confusion.
The comment points out that usage is instead more similar to ecosystem
`select!` macros. Rename `cfg_match!` to `cfg_select!` to match this.
Tracking issue: https://github.com/rust-lang/rust/issues/115585
[1]: https://github.com/rust-lang/rust/issues/115585#issuecomment-2346307605
|
|
Use `std::mem::{size_of, size_of_val, align_of, align_of_val}` from the
prelude instead of importing or qualifying them.
These functions were added to all preludes in Rust 1.80.
|
|
|
|
|
|
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
|
|
|
|
`use` is a nicer way of doing things.
|
|
|
|
This involves lots of breaking changes. There are two big changes that
force changes. The first is that the bitflag types now don't
automatically implement normal derive traits, so we need to derive them
manually.
Additionally, bitflags now have a hidden inner type by default, which
breaks our custom derives. The bitflags docs recommend using the impl
form in these cases, which I did.
|
|
|
|
|
|
|
|
|
|
Spelling compiler
This is per https://github.com/rust-lang/rust/pull/110392#issuecomment-1510193656
I'm going to delay performing a squash because I really don't expect people to be perfectly happy w/ my changes, I really am a human and I really do make mistakes.
r? Nilstrieb
I'm going to be flying this evening, but I should be able to squash / respond to reviews w/in a day or two.
I tried to be careful about dropping changes to `tests`, afaict only two files had changes that were likely related to the changes for a given commit (this is where not having eagerly squashed should have given me an advantage), but, that said, picking things apart can be error prone.
|
|
* account
* achieved
* advising
* always
* ambiguous
* analysis
* annotations
* appropriate
* build
* candidates
* cascading
* category
* character
* clarification
* compound
* conceptually
* constituent
* consts
* convenience
* corresponds
* debruijn
* debug
* debugable
* debuggable
* deterministic
* discriminant
* display
* documentation
* doesn't
* ellipsis
* erroneous
* evaluability
* evaluate
* evaluation
* explicitly
* fallible
* fulfill
* getting
* has
* highlighting
* illustrative
* imported
* incompatible
* infringing
* initialized
* into
* intrinsic
* introduced
* javascript
* liveness
* metadata
* monomorphization
* nonexistent
* nontrivial
* obligation
* obligations
* offset
* opaque
* opportunities
* opt-in
* outlive
* overlapping
* paragraph
* parentheses
* poisson
* precisely
* predecessors
* predicates
* preexisting
* propagated
* really
* reentrant
* referent
* responsibility
* rustonomicon
* shortcircuit
* simplifiable
* simplifications
* specify
* stabilized
* structurally
* suggestibility
* translatable
* transmuting
* two
* unclosed
* uninhabited
* visibility
* volatile
* workaround
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
|
|
This avoids `rustc_data_structures` depending on `serde_json` which
allows it to be compiled much earlier, unlocking most of rustc.
|
|
Co-authored-by: Michael Goulet <michael@errs.io>
|
|
|
|
|
|
time-passes`
|
|
|
|
|
|
|
|
|
|
Convert all the crates that have had their diagnostic migration
completed (except save_analysis because that will be deleted soon and
apfloat because of the licensing problem).
|
|
|
|
|
|
|
|
`print_time_passes_entry` unconditionally prints data about a pass. The
most commonly used call site, in `VerboseTimingGuard::drop`, guards it
with a `should_print_passes` test. But there are a couple of other call
sites that don't do that test.
This commit moves the `should_print_passes` test within
`print_time_passes_entry` so that all passes are treated equally.
|
|
The compiler currently has `-Ztime` and `-Ztime-passes`. I've used
`-Ztime-passes` for years but only recently learned about `-Ztime`.
What's the difference? Let's look at the `-Zhelp` output:
```
-Z time=val -- measure time of rustc processes (default: no)
-Z time-passes=val -- measure time of each rustc pass (default: no)
```
The `-Ztime-passes` description is clear, but the `-Ztime` one is less so.
Sounds like it measures the time for the entire process?
No. The real difference is that `-Ztime-passes` prints out info about passes,
and `-Ztime` does the same, but only for a subset of those passes. More
specifically, there is a distinction in the profiling code between a "verbose
generic activity" and an "extra verbose generic activity". `-Ztime-passes`
prints both kinds, while `-Ztime` only prints the first one. (It took me
a close reading of the source code to determine this difference.)
In practice this distinction has low value. Perhaps in the past the "extra
verbose" output was more voluminous, but now that we only print stats for a
pass if it exceeds 5ms or alters the RSS, `-Ztime-passes` is less spammy. Also,
a lot of the "extra verbose" cases are for individual lint passes, and you need
to also use `-Zno-interleave-lints` to see those anyway.
Therefore, this commit removes `-Ztime` and the associated machinery. One thing
to note is that the existing "extra verbose" activities all have an extra
string argument, so the commit adds the ability to accept an extra argument to
the "verbose" activities.
|
|
- It's `--print`, not `--prints`.
- `-Ztime` and `-Ztime-passes` print to stderr, not stdout.
|
|
|
|
|
|
|
|
Fixing #95444 by only displaying passes that take more than 5 millise…
As discussed in #95444, I have added the code to test and only display prints that are greater than 5 milliseconds.
r? `@jyn514`
|
|
95444: Adding passes that include memory increase
Fix95444: Change the substraction with the abs_diff() method
Fix95444: Change the substraction with abs_diff() method
|
|
This allows profiling costly arguments to be recorded only when `-Zself-profile-events=args` is on: using a closure that takes an `EventArgRecorder` and call its `record_arg` or `record_args` methods.
|
|
|
|
|
|
|
|
|
|
|