| Age | Commit message (Collapse) | Author | Lines |
|
Avoid repetition on “use of unstable library feature 'rustc_private'”
This PR fixes the error by only emitting it when the span contains a real file (is not inside a macro) - and making sure it's emitted only once per span.
The first check was needed because spans-within-macros seem to differ a lot and "fixing" them to the real location is not trivial (and the method that does this is private to another module). It also feels like there always will be an error on import, with the real file name, so not sure there's a point to re-emit the same error at macro use.
Fix #44953.
|
|
|
|
|
|
|
|
Thanks-to: @kennytm
|
|
|
|
|
|
is very repetitive
|
|
Add short error message-format
Fixes #42653.
|
|
|
|
|
|
|
|
address more FIXME whose associated issues were marked as closed
part of #44366
|
|
cleanup: rustc doesn't use an external archiver
cc https://github.com/rust-lang/rust/pull/45090
r? @alexcrichton
|
|
rustc: Allow target-specific default cgus
Some targets, like msp430 and nvptx, don't work with multiple codegen units
right now for bugs or fundamental reasons. To expose this allow targets to
express a default.
Closes #45000
|
|
|
|
Some targets, like msp430 and nvptx, don't work with multiple codegen units
right now for bugs or fundamental reasons. To expose this allow targets to
express a default.
Closes #45000
|
|
|
|
rustc: Don't inline in CGUs at -O0
This commit tweaks the behavior of inlining functions into multiple codegen
units when rustc is compiling in debug mode. Today rustc will unconditionally
treat `#[inline]` functions by translating them into all codegen units that
they're needed within, marking the linkage as `internal`. This commit changes
the behavior so that in debug mode (compiling at `-O0`) rustc will instead only
translate `#[inline]` functions into *one* codegen unit, forcing all other
codegen units to reference this one copy.
The goal here is to improve debug compile times by reducing the amount of
translation that happens on behalf of multiple codegen units. It was discovered
in #44941 that increasing the number of codegen units had the adverse side
effect of increasing the overal work done by the compiler, and the suspicion
here was that the compiler was inlining, translating, and codegen'ing more
functions with more codegen units (for example `String` would be basically
inlined into all codegen units if used). The strategy in this commit should
reduce the cost of `#[inline]` functions to being equivalent to one codegen
unit, which is only translating and codegen'ing inline functions once.
Collected [data] shows that this does indeed improve the situation from [before]
as the overall cpu-clock time increases at a much slower rate and when pinned to
one core rustc does not consume significantly more wall clock time than with one
codegen unit.
One caveat of this commit is that the symbol names for inlined functions that
are only translated once needed some slight tweaking. These inline functions
could be translated into multiple crates and we need to make sure the symbols
don't collideA so the crate name/disambiguator is mixed in to the symbol name
hash in these situations.
[data]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334880911
[before]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334583384
|
|
rustc: Reduce default CGUs to 16
Rationale explained in the included comment as well as #44941
|
|
Add -Zmutable-noalias flag
We disabled noalias on mutable references a long time ago when it was clear that llvm was incorrectly handling this in relation to unwinding edges.
Since then, a few things have happened:
* llvm has cleaned up a bunch of the issues (I'm told)
* we've added a nounwind codegen option
As such, I would like to add this -Z flag so that we can evaluate if the codegen bugs still exist, and if this significantly affects the codegen of different projects, with an eye towards permanently re-enabling it (or at least making it a stable option).
|
|
Document that `-C ar=PATH` doesn't do anything
Are there any plans to use an external archiver in the future?
IIRC, it was used before, but its use was replaced with LLVM's built-in archive management machinery. I can't found a relevant PR though. EDIT: Found it - https://github.com/rust-lang/rust/pull/26926!
The `-C` option is stable so it still can't be removed right away even if there are no plans to use it (but maybe it can be deprecated?).
Target specifications have a field for archiver as well, which is unused too (these ones are unstable, so I guess it can be removed).
r? @alexcrichton
|
|
This commit tweaks the behavior of inlining functions into multiple codegen
units when rustc is compiling in debug mode. Today rustc will unconditionally
treat `#[inline]` functions by translating them into all codegen units that
they're needed within, marking the linkage as `internal`. This commit changes
the behavior so that in debug mode (compiling at `-O0`) rustc will instead only
translate `#[inline]` functions into *one* codegen unit, forcing all other
codegen units to reference this one copy.
The goal here is to improve debug compile times by reducing the amount of
translation that happens on behalf of multiple codegen units. It was discovered
in #44941 that increasing the number of codegen units had the adverse side
effect of increasing the overal work done by the compiler, and the suspicion
here was that the compiler was inlining, translating, and codegen'ing more
functions with more codegen units (for example `String` would be basically
inlined into all codegen units if used). The strategy in this commit should
reduce the cost of `#[inline]` functions to being equivalent to one codegen
unit, which is only translating and codegen'ing inline functions once.
Collected [data] shows that this does indeed improve the situation from [before]
as the overall cpu-clock time increases at a much slower rate and when pinned to
one core rustc does not consume significantly more wall clock time than with one
codegen unit.
One caveat of this commit is that the symbol names for inlined functions that
are only translated once needed some slight tweaking. These inline functions
could be translated into multiple crates and we need to make sure the symbols
don't collideA so the crate name/disambiguator is mixed in to the symbol name
hash in these situations.
[data]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334880911
[before]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334583384
|
|
update FIXME(#6298) to point to open issue 15020
update FIXME(#6268) to point to RFC 811
update FIXME(#10520) to point to RFC 1751
remove FIXME for emscripten issue 4563 and include target in `test_estimate_scaling_factor`
remove FIXME(#18207) since node_id isn't used for `ref` pattern analysis
remove FIXME(#6308) since DST was implemented in #12938
remove FIXME(#2658) since it was decided to not reorganize module
remove FIXME(#20590) since it was decided to stay conservative with projection types
remove FIXME(#20297) since it was decided that solving the issue is unnecessary
remove FIXME(#27086) since closures do correspond to structs now
remove FIXME(#13846) and enable `function_sections` for windows
remove mention of #22079 in FIXME(#22079) since this is a general FIXME
remove FIXME(#5074) since the restriction on borrow were lifted
|
|
|
|
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
|
|
Rationale explained in the included comment as well as #44941
|
|
Don't use remapped path when loading modules and include files
Fixes bug reported in https://github.com/rust-lang/rust/issues/41555#issuecomment-327866056.
cc @michaelwoerister
|
|
|
|
This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little
|
|
|
|
rustc: Default 32 codegen units at O0
This commit changes the default of rustc to use 32 codegen units when compiling
in debug mode, typically an opt-level=0 compilation. Since their inception
codegen units have matured quite a bit, gaining features such as:
* Parallel translation and codegen enabling codegen units to get worked on even
more quickly.
* Deterministic and reliable partitioning through the same infrastructure as
incremental compilation.
* Global rate limiting through the `jobserver` crate to avoid overloading the
system.
The largest benefit of codegen units has forever been faster compilation through
parallel processing of modules on the LLVM side of things, using all the cores
available on build machines that typically have many available. Some downsides
have been fixed through the features above, but the major downside remaining is
that using codegen units reduces opportunities for inlining and optimization.
This, however, doesn't matter much during debug builds!
In this commit the default number of codegen units for debug builds has been
raised from 1 to 32. This should enable most `cargo build` compiles that are
bottlenecked on translation and/or code generation to immediately see speedups
through parallelization on available cores.
Work is being done to *always* enable multiple codegen units (and therefore
parallel codegen) but it requires #44841 at least to be landed and stabilized,
but stay tuned if you're interested in that aspect!
|
|
This is the name the `gcc` crate has moved to
|
|
This commit changes the default of rustc to use 32 codegen units when compiling
in debug mode, typically an opt-level=0 compilation. Since their inception
codegen units have matured quite a bit, gaining features such as:
* Parallel translation and codegen enabling codegen units to get worked on even
more quickly.
* Deterministic and reliable partitioning through the same infrastructure as
incremental compilation.
* Global rate limiting through the `jobserver` crate to avoid overloading the
system.
The largest benefit of codegen units has forever been faster compilation through
parallel processing of modules on the LLVM side of things, using all the cores
available on build machines that typically have many available. Some downsides
have been fixed through the features above, but the major downside remaining is
that using codegen units reduces opportunities for inlining and optimization.
This, however, doesn't matter much during debug builds!
In this commit the default number of codegen units for debug builds has been
raised from 1 to 32. This should enable most `cargo build` compiles that are
bottlenecked on translation and/or code generation to immediately see speedups
through parallelization on available cores.
Work is being done to *always* enable multiple codegen units (and therefore
parallel codegen) but it requires #44841 at least to be landed and stabilized,
but stay tuned if you're interested in that aspect!
|
|
|
|
pnkfelix:debugflags-borrowckmir-implies-emitendregions, r=arielb1
Make `-Z borrowck-mir` imply that `EndRegion`'s should be emitted.
Before this change, the `-Z borrowck-mir` flag is useless if you do not also pass `-Z emit-end-regions`.
So, in the same spirit as f2892ad281cb11421ebae741d698e0af14d3ecf6, make `-Z borrowck-mir` also emit `EndRegion` statements. (This will hopefully avoid some initial speed bumps for new-comers helping out with NLL.)
|
|
`--cap-lints allow` switches off `can_emit_warnings`
This boolean field on the error `Handler` is toggled to silence
warnings when `-A warnings` is passed. (This is actually a separate
mechanism from the global lint level—whether there's some redundancy
to be factored away here is an important question, but not one we
concern ourselves with in this commit.) But the same rationale
applies for `--cap-lints allow`. In particular, this makes the "soft"
feature-gate warning introduced in 8492ad24 (which is not a lint, but
just calls `struct_span_warn`) not pollute the builds of dependent
crates.
Thanks to @kennytm for pointing out the potential of
`can_emit_warnings` for this purpose.
Resolves #44213.
|
|
|
|
Change from native-static-deps to native-static-libs.
|
|
Add proper help line for `-C inline threshold`
Looks like someone accidentally some words when adding this.
This also remove a period on a different help line for consistency, as no options have a period.
|
|
|
|
This boolean field on the error `Handler` is toggled to silence
warnings when `-A warnings` is passed. (This is actually a separate
mechanism from the global lint level—whether there's some redundancy
to be factored away here is an important question, but not one we
concern ourselves with in this commit.) But the same rationale
applies for `--cap-lints allow`. In particular, this makes the "soft"
feature-gate warning introduced in 8492ad24 (which is not a lint, but
just calls `struct_span_warn`) not pollute the builds of dependent
crates.
Thanks to @kennytm for pointing out the potential of
`can_emit_warnings` for this purpose.
Resolves #44213.
|
|
This commit removes the `dep_graph` field from the `Session` type according to
issue #44390. Most of the fallout here was relatively straightforward and the
`prepare_session_directory` function was rejiggered a bit to reuse the results
in the later-called `load_dep_graph` function.
Closes #44390
|
|
|
|
Also remove a period on a different help line for consistency
|
|
CrateStore access in tcx.
|
|
|
|
This way the miri test suite does not have to be updated to explcitly
request `-Z emit-end-regions`.
|
|
The main intent is to fix cases where EndRegion emission is believed
to be causing excess peak memory pressure.
It may also be a welcome change to people inspecting the MIR output
who find the EndRegions to be a distraction.
|
|
Compact display of static lib dependencies
Fixes #33173
Instead of displaying one dependency per line, I've changed the format to display them all in one line.
As a bonus they're in format of linker flags (`-lfoo`), so the output can be copy&pasted if one is actually going to link as suggested.
|