path: root/compiler/rustc_monomorphize/src/partitioning.rs
Age | Commit message | Author | Lines
2024-04-17 | Use non-exhaustive matches for TyKind | Daria Sukhonina | -1/+2
Also no longer export noop async_drop_in_place_raw
2024-04-16 | Add simple async drop glue generation | zetanumbers | -2/+5
Explainer: https://zetanumbers.github.io/book/async-drop-design.html https://github.com/rust-lang/rust/pull/121801
2024-03-20 | collector: recursively traverse 'mentioned' items to evaluate their constants | Ralf Jung | -7/+7
2024-03-14 | Rollup merge of #122287 - RalfJung:simd-static-assert, r=pnkfelix | Matthias Krüger | -0/+3
add test ensuring simd codegen checks don't run when a static assertion failed
stdarch relies on this to ensure that SIMD indices are in bounds. I would love to know why this works, but I can't figure out where codegen decides to not codegen a function if a required-const does not evaluate. `@oli-obk` `@bjorn3` do you have any idea?
2024-03-13 | coverage: Remove all unstable values of `-Cinstrument-coverage` | Zalathar | -3/+1
2024-03-10 | add comments explaining where post-mono const eval errors abort compilation | Ralf Jung | -0/+3
2024-02-14 | clean up potential_query_instability with FxIndexMap and UnordMap | yukang | -2/+2
2024-02-06 | Rollup merge of #120602 - klensy:mono-comment, r=nnethercote | Matthias Krüger | -1/+1
rustc_monomorphize: fix outdated comment in partition
`max_cgu_count` was removed in https://github.com/rust-lang/rust/commit/51821515b3ccd7dd8f42ffd6a2eee536dcf7ddb0, but the comment was not (its usage in `merge_codegen_units` was removed earlier). r? `@nnethercote`
2024-02-06 | Fix drop shim for AsyncFnOnce closure, AsyncFnMut shim for AsyncFn closure | Michael Goulet | -2/+2
2024-02-06 | Construct body for by-move coroutine closure output | Michael Goulet | -1/+3
2024-02-06 | Build a shim to call async closures with different AsyncFn trait kinds | Michael Goulet | -0/+2
2024-02-03 | rustc_monomorphize: fix outdated comment in partition | klensy | -1/+1
2024-01-21 | Rollup merge of #118811 - EbbDrop:is-sorted-by-bool, r=Mark-Simulacrum | Nadrieril | -2/+2
Use `bool` instead of `PartialOrd` as return value of the comparison closure in `{slice,Iterator}::is_sorted_by`
Changes the function signature of the closure given to `{slice,Iterator}::is_sorted_by` to return a `bool` instead of a `PartialOrd` as suggested by the libs-api team here: https://github.com/rust-lang/rust/issues/53485#issuecomment-1766411980. This means these functions now return true if the closure returns true for every pair of adjacent values.
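The new closure contract can be illustrated without relying on any particular toolchain version. The sketch below defines a hypothetical helper, `is_sorted_by_bool` (not a standard library function), that follows the rule described above: the result is true exactly when the closure returns true for every adjacent pair.

```rust
/// Hypothetical stand-in for the closure contract described above:
/// the closure answers "is this adjacent pair in order?" with a `bool`.
fn is_sorted_by_bool<T>(items: &[T], mut in_order: impl FnMut(&T, &T) -> bool) -> bool {
    // `windows(2)` yields every pair of adjacent elements.
    items.windows(2).all(|pair| in_order(&pair[0], &pair[1]))
}

fn main() {
    let sizes = [11791, 11634, 11173];
    // Largest-first check, the kind of invariant a size-sorted list relies on.
    assert!(is_sorted_by_bool(&sizes, |a, b| a >= b));
    assert!(!is_sorted_by_bool(&sizes, |a, b| a <= b));

    // Slices with fewer than two elements have no pairs, so they count as sorted.
    let empty: [u32; 0] = [];
    assert!(is_sorted_by_bool(&empty, |a, b| a <= b));
}
```

With a `bool`-returning closure, a descending check is just `|a, b| a >= b`; no `Ordering` value has to be constructed.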
2024-01-20 | Use bool instead of PartialOrd in is_sorted_by | EbbDrop | -2/+2
2024-01-10 | Rename `{create,emit}_warning` as `{create,emit}_warn`. | Nicholas Nethercote | -1/+1
For consistency with `warn`/`struct_warn`, and also `{create,emit}_err`, all of which use an abbreviated form.
2023-12-24 | Remove more `Session` methods that duplicate `DiagCtxt` methods. | Nicholas Nethercote | -1/+1
2023-12-24 | Remove `Session` methods that duplicate `DiagCtxt` methods. | Nicholas Nethercote | -3/+3
Also add some `dcx` methods to types that wrap `TyCtxt`, for easier access.
2023-12-15 | Auto merge of #118770 - saethlin:fix-inline-never-uses, r=nnethercote | bors | -3/+9
Fix cases where std accidentally relied on inline(never)
This PR increases the power of `-Zcross-crate-inline-threshold=always` so that it applies through `#[inline(never)]`. Note that though this is called "cross-crate-inlining", in this case especially it is _just_ lazy per-CGU codegen. The MIR inliner and LLVM still respect the attribute as much as they ever have.
Trying to bootstrap with the new `-Zcross-crate-inline-threshold=always` change revealed two bugs:
- We have special intrinsics `assert_inhabited`, `assert_zero_valid`, and `assert_mem_uninitialized_valid` which codegen backends will lower to nothing or a call to `panic_nounwind`. Since we may not have any call to `panic_nounwind` in MIR but emit one anyway, we need to specially tell `MirUsedCollector` about this situation.
- `#[lang = "start"]` is special-cased already so that `MirUsedCollector` will collect it, but then when we make it cross-crate-inlinable it is only assigned to a CGU based on whether `MirUsedCollector` saw a call to it, which of course we didn't.
I started looking into this because https://github.com/rust-lang/rust/pull/118683 revealed a case where we were accidentally relying on a function being `#[inline(never)]`, and cranking up cross-crate-inlinability seems like a way to find other situations like that. r? `@nnethercote` because I don't like what I'm doing to the CGU partitioning code here but I can't come up with something much better
2023-12-14 | Fix cases where std accidentally relied on inline(never) | Ben Kimock | -3/+9
2023-12-13 | Add unstable `-Zdefault-hidden-visibility` cmdline flag for `rustc`. | Lukasz Anforowicz | -1/+1
The new flag has been described in the Major Change Proposal at https://github.com/rust-lang/compiler-team/issues/656
2023-11-21 | Fix `clippy::needless_borrow` in the compiler | Nilstrieb | -3/+3
`x clippy compiler -Aclippy::all -Wclippy::needless_borrow --fix`. Then I had to remove a few unnecessary parens and muts that were exposed now.
2023-10-21 | coverage: Change query `codegened_and_inlined_items` to a plain function | Zalathar | -31/+0
This query has a name that sounds general-purpose, but in fact it has coverage-specific semantics, and (fortunately) is only used by coverage code. Because it is only ever called once (from one designated CGU), it doesn't need to be a query, and we can change it to a regular function instead.
2023-09-26 | subst -> instantiate | lcnr | -1/+1
2023-09-14 | treat host effect params as erased generics in codegen | Deadbeef | -2/+2
This fixes the changes brought to codegen tests when effect params are added to libcore, by not attempting to monomorphize functions that get the host param by being `const fn`.
2023-07-27 | Rollup merge of #113872 - nnethercote:tweak-cgu-sorting, r=pnkfelix | Matthias Krüger | -1/+1
Tweak CGU sorting in a couple of places.
In `base.rs`, tweak how the CGU size interleaving works. Since #113777, it's much more common to have multiple CGUs with identical sizes. With the existing code these same-sized items ended up in the opposite-to-desired order due to the stable sorting. The code now starts with a reverse sort (like is done in `partitioning.rs`) which gives the behaviour we want. This doesn't matter much for perf, but makes profiles in `samply` look more like what we expect.
In `partitioning.rs`, we can use `sort_by_key` instead of `sort_by_cached_key` because `CGU::size_estimate()` is cheap. (There is an identical CGU sort earlier in that function that already uses `sort_by_key`.) r? `@pnkfelix`
2023-07-23 | more clippy::style fixes: | Matthias Krüger | -4/+1
get_first, single_char_add_str, unnecessary_mut_passed, manual_map, manual_is_ascii_check
2023-07-23 | fix some clippy::style findings | Matthias Krüger | -5/+6
comparison_to_empty, iter_nth_zero, for_kv_map, manual_next_back, redundant_pattern
2023-07-20 | Tweak CGU sorting in a couple of places. | Nicholas Nethercote | -1/+1
In `base.rs`, tweak how the CGU size interleaving works. Since #113777, it's much more common to have multiple CGUs with identical sizes. With the existing code these same-sized items ended up in the opposite-to-desired order due to the stable sorting. The code now starts with a reverse sort (like is done in `partitioning.rs`) which gives the behaviour we want. This doesn't matter much for perf, but makes profiles in `samply` look more like what we expect.
In `partitioning.rs`, we can use `sort_by_key` instead of `sort_by_cached_key` because `CGU::size_estimate()` is cheap. (There is an identical CGU sort earlier in that function that already uses `sort_by_key`.)
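As a rough illustration of the sorting points above, the sketch below sorts a list largest-first with `sort_by_key` and shows how a stable sort treats equal sizes. The `Cgu` struct and its fields are hypothetical stand-ins, not the compiler's `CodegenUnit` type.

```rust
use std::cmp::Reverse;

/// Hypothetical, simplified stand-in for a codegen unit.
#[derive(Debug)]
struct Cgu {
    name: &'static str,
    size_estimate: usize,
}

/// Sorts largest-first. The key is a plain field read, so `sort_by_key`
/// is enough; `sort_by_cached_key` only pays off when the key is expensive.
fn sort_largest_first(cgus: &mut [Cgu]) {
    cgus.sort_by_key(|cgu| Reverse(cgu.size_estimate));
}

fn main() {
    let mut cgus = vec![
        Cgu { name: "a", size_estimate: 8603 },
        Cgu { name: "b", size_estimate: 11634 },
        Cgu { name: "c", size_estimate: 8603 },
    ];
    sort_largest_first(&mut cgus);

    let names: Vec<&str> = cgus.iter().map(|cgu| cgu.name).collect();
    // `sort_by_key` is stable: "a" and "c" have equal sizes, so they keep
    // their original relative order.
    assert_eq!(names, ["b", "a", "c"]);
}
```

Because the sort is stable, the order in which same-sized entries arrive decides the order in which they leave, which is why the direction of an earlier sort matters once many CGUs share a size.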
2023-07-19 | Change the primary CGU merging algorithm. | Nicholas Nethercote | -14/+66
Instead of repeatedly merging the two smallest CGUs, we now use a merging algorithm that aims to minimize the duplication of inlined functions. `exa-0.10.1` was one benchmark that saw particularly good results.
The old CGU stats:
```
INTERNALIZE
- unique items: 2774 (1216 root + 1558 inlined), unique size: 122065 (77219 root + 44846 inlined)
- placed items: 3834 (1216 root + 2618 inlined), placed size: 154552 (77219 root + 77333 inlined)
- placed/unique items ratio: 1.38, placed/unique size ratio: 1.27
- CGUs: 16, mean size: 9659.5, sizes: [11791, 11634, 11173, 10987, 10939, 10507, 9992, 9813, 9593, 9580, 9030, 8447, 7975, 7961, 7876, 7254]
```
The new CGU stats:
```
INTERNALIZE
- unique items: 2774 (1216 root + 1558 inlined), unique size: 122065 (77219 root + 44846 inlined)
- placed items: 3626 (1216 root + 2410 inlined), placed size: 147201 (77219 root + 69982 inlined)
- placed/unique items ratio: 1.31, placed/unique size ratio: 1.21
- CGUs: 16, mean size: 9200.1, sizes: [11634, 10939, 10227, 9555, 9178, 9167, 8879, 8804, 8604, 8603 (x3), 8602 (x2), 8601, 8600]
```
The difference is in the number of inlined items. There are 1558 unique inlined items. With the old algorithm these were placed 2618 times, resulting in 1060 duplicates. With the new algorithm these were placed 2410 times, resulting in 852 duplicates. Also, the mean CGU size dropped from 9659.5 to 9200.1, and the CGU size distribution tightened, with the biggest one a little smaller and the smallest ones a little bigger.
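The derived figures in the two dumps above (duplicates, ratios, mean size) follow directly from the raw counts. The sketch below recomputes them; the `PlacementStats` struct and its field names are hypothetical and only mirror the labels in the dumps.

```rust
/// Hypothetical container for the raw counts shown in the dumps above.
struct PlacementStats {
    unique_items: u32,
    placed_items: u32,
    unique_inlined: u32,
    placed_inlined: u32,
    unique_size: u32,
    placed_size: u32,
    cgus: u32,
}

impl PlacementStats {
    /// Extra copies of inlined items created by placing them in several CGUs.
    fn duplicates(&self) -> u32 {
        self.placed_inlined - self.unique_inlined
    }
    fn items_ratio(&self) -> f64 {
        f64::from(self.placed_items) / f64::from(self.unique_items)
    }
    fn size_ratio(&self) -> f64 {
        f64::from(self.placed_size) / f64::from(self.unique_size)
    }
    fn mean_cgu_size(&self) -> f64 {
        f64::from(self.placed_size) / f64::from(self.cgus)
    }
}

fn old_stats() -> PlacementStats {
    PlacementStats {
        unique_items: 2774, placed_items: 3834,
        unique_inlined: 1558, placed_inlined: 2618,
        unique_size: 122065, placed_size: 154552, cgus: 16,
    }
}

fn new_stats() -> PlacementStats {
    PlacementStats {
        unique_items: 2774, placed_items: 3626,
        unique_inlined: 1558, placed_inlined: 2410,
        unique_size: 122065, placed_size: 147201, cgus: 16,
    }
}

fn main() {
    for (label, s) in [("old", old_stats()), ("new", new_stats())] {
        println!(
            "{label}: duplicates {}, items ratio {:.2}, size ratio {:.2}, mean size {:.1}",
            s.duplicates(), s.items_ratio(), s.size_ratio(), s.mean_cgu_size()
        );
    }
    // 2618 - 1558 = 1060 duplicates before, 2410 - 1558 = 852 after.
    assert_eq!(old_stats().duplicates(), 1060);
    assert_eq!(new_stats().duplicates(), 852);
}
```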
2023-07-19 | Split the CGU merging loop. | Nicholas Nethercote | -20/+30
It has two conditions. This commit splits it in two, one per condition. The next commit will change the first loop.
2023-07-19 | Add `MonoItemData::inlined`. | Nicholas Nethercote | -17/+13
2023-07-17 | Ignore unreachable inlined items in `debug_dump`. | Nicholas Nethercote | -30/+18
They're quite rare, and ignoring them simplifies things quite a bit, and further reduces the number of calls to `MonoItem::size_estimate` to the number of placed items (one per root item, and one or more per reachable inlined item).
2023-07-17 | Store item size estimate in `MonoItemData`. | Nicholas Nethercote | -13/+16
This means we call `MonoItem::size_estimate` (which involves a query) less often: just once per mono item, and then once more per inline item placement. After that we can reuse the stored value as necessary. This means `CodegenUnit::compute_size_estimate` is cheaper.
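The caching idea can be sketched as follows. Everything here is a simplified, hypothetical model (`SizeOracle`, `ItemData`, `Cgu`), not the compiler's actual types: the expensive estimate is computed once at placement time and stored, and summing a unit's size afterwards only reads the stored values.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for an expensive, query-backed size estimate.
/// It counts how often it is asked.
struct SizeOracle {
    calls: Cell<usize>,
}

impl SizeOracle {
    fn size_estimate(&self, item: &str) -> usize {
        self.calls.set(self.calls.get() + 1);
        item.len() * 10 // arbitrary placeholder formula
    }
}

/// Per-item data that stores the estimate, as the commit describes.
struct ItemData {
    size_estimate: usize,
}

struct Cgu {
    items: Vec<(String, ItemData)>,
}

impl Cgu {
    /// The oracle is consulted exactly once, when the item is placed.
    fn place(&mut self, name: &str, oracle: &SizeOracle) {
        let size_estimate = oracle.size_estimate(name);
        self.items.push((name.to_string(), ItemData { size_estimate }));
    }

    /// Summing stored values needs no further oracle calls.
    fn compute_size_estimate(&self) -> usize {
        self.items.iter().map(|(_, data)| data.size_estimate).sum()
    }
}

fn main() {
    let oracle = SizeOracle { calls: Cell::new(0) };
    let mut cgu = Cgu { items: Vec::new() };
    cgu.place("foo", &oracle);
    cgu.place("quux", &oracle);

    // Recomputing the CGU size repeatedly does not touch the oracle again.
    for _ in 0..3 {
        assert_eq!(cgu.compute_size_estimate(), 70);
    }
    assert_eq!(oracle.calls.get(), 2);
}
```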
2023-07-17 | Introduce `MonoItemData`. | Nicholas Nethercote | -10/+15
It replaces `(Linkage, Visibility)`, making the code nicer. Plus the next commit will add another field.
2023-07-14 | refactor(rustc_middle): Substs -> GenericArg | Mahdi Dibaiee | -3/+3
2023-07-08 | Rollup merge of #113390 - nnethercote:cgu-tweaks, r=wesleywiser | Matthias Krüger | -22/+22
CGU formation tweaks
Minor improvements I found while trying out something bigger that didn't work out. r? `@wesleywiser`
2023-07-06 | Diagnose unsorted CGUs. | Nicholas Nethercote | -1/+7
An assertion failure was reported in #112946. This extra information will help diagnose the problem.
2023-07-06 | Minor comment fix. | Nicholas Nethercote | -3/+3
2023-07-06 | Remove the field name from `MonoItemPlacement::SingleCgu`. | Nicholas Nethercote | -4/+4
It's needless verbosity.
2023-07-06 | Use `iter()` instead of `iter_mut()` in one place. | Nicholas Nethercote | -1/+1
2023-07-06 | Make `UsageMap::get_user_items` infallible. | Nicholas Nethercote | -14/+14
It's nicer this way.
2023-06-26 | Tweak thread names for CGU processing. | Nicholas Nethercote | -0/+3
For non-incremental builds on Unix, currently all the thread names look like `opt regex.f10ba03eb5ec7975-cgu.0`. But they are truncated by `pthread_setname` to `opt regex.f10ba`, hiding the numeric suffix that distinguishes them. This is really annoying when using a profiler like Samply. This commit changes these thread names to a form like `opt cgu.0`, which is much better.
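The effect of the truncation can be reproduced with a small sketch. The 15-byte visible length is an assumption read off the example above (`opt regex.f10ba` is 15 characters long), and `visible_thread_name` is a hypothetical helper, not a real API.

```rust
/// Assumed visible length of a thread name after truncation, taken from
/// the example in the commit message.
const VISIBLE_LEN: usize = 15;

/// Hypothetical helper: returns what a profiler would display.
/// Assumes ASCII names, so byte slicing cannot split a character.
fn visible_thread_name(name: &str) -> &str {
    &name[..name.len().min(VISIBLE_LEN)]
}

fn main() {
    // Old scheme: the distinguishing suffix is cut off, so both look identical.
    let old_a = visible_thread_name("opt regex.f10ba03eb5ec7975-cgu.0");
    let old_b = visible_thread_name("opt regex.f10ba03eb5ec7975-cgu.1");
    assert_eq!(old_a, "opt regex.f10ba");
    assert_eq!(old_a, old_b);

    // New scheme: short names survive intact and stay distinguishable.
    assert_eq!(visible_thread_name("opt cgu.0"), "opt cgu.0");
    assert_ne!(visible_thread_name("opt cgu.0"), visible_thread_name("opt cgu.1"));
}
```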
2023-06-26 | Improve ordering and naming of CGUs for non-incremental builds. | Nicholas Nethercote | -6/+27
Currently there are two problems.
First, the CGUs don't end up in size order. The merging loop does sort by size on each iteration, but we don't sort after the final merge, so typically there is one CGU out of place. (And sometimes we don't enter the merging loop at all, in which case they end up in random order.)
Second, we then assign names that differ only by a numeric suffix, and then we sort them lexicographically by name, giving us an order like this:
    regex.f10ba03eb5ec7975-cgu.1
    regex.f10ba03eb5ec7975-cgu.10
    regex.f10ba03eb5ec7975-cgu.11
    regex.f10ba03eb5ec7975-cgu.12
    regex.f10ba03eb5ec7975-cgu.13
    regex.f10ba03eb5ec7975-cgu.14
    regex.f10ba03eb5ec7975-cgu.15
    regex.f10ba03eb5ec7975-cgu.2
    regex.f10ba03eb5ec7975-cgu.3
    regex.f10ba03eb5ec7975-cgu.4
    regex.f10ba03eb5ec7975-cgu.5
    regex.f10ba03eb5ec7975-cgu.6
    regex.f10ba03eb5ec7975-cgu.7
    regex.f10ba03eb5ec7975-cgu.8
    regex.f10ba03eb5ec7975-cgu.9
These two problems are really annoying when debugging and profiling the CGUs.
This commit ensures CGUs are sorted by name *and* reverse sorted by size. This involves (a) one extra sort by size operation, and (b) padding the numeric indices with zeroes, e.g. `regex.f10ba03eb5ec7975-cgu.01`.
(Note that none of this applies for incremental builds, where a different hash-based CGU naming scheme is used.)
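A small sketch of why zero-padding fixes the ordering: with padded indices, lexicographic order and numeric order coincide. The helper `padded_cgu_names`, the 0-based indexing, and the way the pad width is chosen are illustrative assumptions, not the compiler's code.

```rust
/// Hypothetical helper: builds `count` names whose numeric suffix is
/// zero-padded to the width of the largest index.
fn padded_cgu_names(prefix: &str, count: usize) -> Vec<String> {
    let width = count.saturating_sub(1).to_string().len();
    (0..count)
        .map(|i| format!("{}-cgu.{:0width$}", prefix, i, width = width))
        .collect()
}

fn main() {
    // Unpadded suffixes: sorting by name puts "10" right after "1".
    let mut unpadded: Vec<String> = (0..16).map(|i| format!("cgu.{}", i)).collect();
    unpadded.sort();
    assert_eq!(unpadded[2], "cgu.10");

    // Padded suffixes: sorting by name leaves the numeric order untouched.
    let names = padded_cgu_names("regex.f10ba03eb5ec7975", 16);
    let mut sorted = names.clone();
    sorted.sort();
    assert_eq!(names, sorted);
    assert_eq!(names[1], "regex.f10ba03eb5ec7975-cgu.01");
}
```

If the names are assigned after the units are sorted largest-first, a later sort by name then preserves the size order as well.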
2023-06-22 | Tweak CGU size estimate code. | Nicholas Nethercote | -7/+8
- Rename `create_size_estimate` as `compute_size_estimate`, because that makes more sense for the second and subsequent calls for each CGU.
- Change `CodegenUnit::size_estimate` from `Option<usize>` to `usize`. We can still assert that `compute_size_estimate` is called first.
- Move the size estimation for `place_mono_items` inside the function, for consistency with `merge_codegen_units`.
2023-06-22 | Merge root and inlined item placement. | Nicholas Nethercote | -71/+44
There's no longer any need for them to be separate, and putting them together reduces the amount of code.
2023-06-22 | Inline before merging CGUs. | Nicholas Nethercote | -12/+14
Because CGU merging relies on CGU sizes, but the CGU sizes before inlining aren't accurate. This requires tweaking how the sizes are updated during merging: if CGU A and B both have an inlined function F, then `size(A + B)` will be a little less than `size(A) + size(B)`, because `A + B` will only have one copy of F. Also, the minimum CGU size is increased because it now has to account for inlined functions. This change doesn't have much effect on compile perf, but it makes follow-on changes that involve more sophisticated reasoning about CGU sizes much easier.
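The size arithmetic for a merge can be sketched with plain maps. The `Cgu` alias and the helper names are hypothetical; the point is only that a shared inlined function is counted once in the merged unit.

```rust
use std::collections::HashMap;

/// Hypothetical model of a CGU: item name -> size estimate.
type Cgu = HashMap<&'static str, usize>;

fn size(cgu: &Cgu) -> usize {
    cgu.values().sum()
}

/// Merging is a map union, so an item present in both inputs appears once.
fn merge(a: &Cgu, b: &Cgu) -> Cgu {
    let mut merged = a.clone();
    merged.extend(b.iter().map(|(name, size)| (*name, *size)));
    merged
}

fn main() {
    // Both units contain the inlined function F (size 30).
    let a: Cgu = HashMap::from([("root_a", 100), ("F", 30)]);
    let b: Cgu = HashMap::from([("root_b", 80), ("F", 30)]);

    let merged = merge(&a, &b);
    assert_eq!(size(&a) + size(&b), 240);
    // One copy of F disappears: 240 - 30 = 210.
    assert_eq!(size(&merged), 210);
}
```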
2023-06-22 | Streamline some comments. | Nicholas Nethercote | -6/+5
2023-06-15 | Merge CGUs in a nicer way. | Nicholas Nethercote | -3/+1
2023-06-15 | Make `partition` more consistent. | Nicholas Nethercote | -14/+17
Always put the `create_size_estimate` calls and `debug_dump` calls within timed scopes. This makes the four main steps look more similar to each other.
2023-06-15 | Fix bug in `mark_code_coverage_dead_code_cgus`. | Nicholas Nethercote | -13/+8
The comment says "Find the smallest CGU that has exported symbols and put the dead function stubs in that CGU". But the code sorts the CGUs by size (smallest first) and then searches them in reverse order, which means it will find the *largest* CGU that has exported symbols. The erroneous code was introduced in #92142. This commit changes it to use a simpler search, avoiding the sort, and fixes the bug in the process.
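The bug pattern described above is easy to reproduce in miniature. The sketch below is not the compiler's code; `Cgu`, `buggy_pick`, and `fixed_pick` are hypothetical names that contrast "sort ascending, then search from the end" with a direct minimum search.

```rust
/// Hypothetical, simplified stand-in for a codegen unit.
struct Cgu {
    name: &'static str,
    size: usize,
    has_exported_symbols: bool,
}

/// Buggy pattern: sorting smallest-first and then searching in reverse
/// finds the *largest* qualifying unit.
fn buggy_pick(cgus: &[Cgu]) -> Option<&'static str> {
    let mut sorted: Vec<&Cgu> = cgus.iter().collect();
    sorted.sort_by_key(|cgu| cgu.size);
    sorted
        .iter()
        .rev()
        .find(|cgu| cgu.has_exported_symbols)
        .map(|cgu| cgu.name)
}

/// Simpler search without a sort: the smallest unit with exported symbols.
fn fixed_pick(cgus: &[Cgu]) -> Option<&'static str> {
    cgus.iter()
        .filter(|cgu| cgu.has_exported_symbols)
        .min_by_key(|cgu| cgu.size)
        .map(|cgu| cgu.name)
}

fn main() {
    let cgus = [
        Cgu { name: "small", size: 10, has_exported_symbols: false },
        Cgu { name: "mid", size: 20, has_exported_symbols: true },
        Cgu { name: "big", size: 30, has_exported_symbols: true },
    ];
    assert_eq!(buggy_pick(&cgus), Some("big"));
    assert_eq!(fixed_pick(&cgus), Some("mid"));
}
```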