| Age | Commit message (Collapse) | Author | Lines |
|
This commit is an implementation of adding custom sections to wasm artifacts in
rustc. The intention here is to expose the ability of the wasm binary format to
contain custom sections with arbitrary user-defined data. Currently neither our
version of LLVM nor LLD supports this so the implementation is currently custom
to rustc itself.
The implementation here is to attach a `#[wasm_custom_section = "foo"]`
attribute to any `const` which has a type like `[u8; N]`. Other types of
constants aren't supported yet but may be added one day! This should hopefully
be enough to get off the ground with *some* custom section support.
The current semantics are that any constant tagged with `#[wasm_custom_section]`
section will be *appended* to the corresponding section in the final output wasm
artifact (and this affects dependencies linked in as well, not just the final
crate). This means that whatever is interpreting the contents must be able to
interpret binary-concatenated sections (or each constant needs to be in its own
custom section).
To test this change the existing `run-make` test suite was moved to a
`run-make-fulldeps` folder and a new `run-make` test suite was added which
applies to all targets by default. This test suite currently only has one test
which only runs for the wasm target (using a node.js script to use `WebAssembly`
in JS to parse the wasm output).
|
|
|
|
|
|
This commit tweaks the behavior of inlining functions into multiple codegen
units when rustc is compiling in debug mode. Today rustc will unconditionally
treat `#[inline]` functions by translating them into all codegen units that
they're needed within, marking the linkage as `internal`. This commit changes
the behavior so that in debug mode (compiling at `-O0`) rustc will instead only
translate `#[inline]` functions into *one* codegen unit, forcing all other
codegen units to reference this one copy.
The goal here is to improve debug compile times by reducing the amount of
translation that happens on behalf of multiple codegen units. It was discovered
in #44941 that increasing the number of codegen units had the adverse side
effect of increasing the overal work done by the compiler, and the suspicion
here was that the compiler was inlining, translating, and codegen'ing more
functions with more codegen units (for example `String` would be basically
inlined into all codegen units if used). The strategy in this commit should
reduce the cost of `#[inline]` functions to being equivalent to one codegen
unit, which is only translating and codegen'ing inline functions once.
Collected [data] shows that this does indeed improve the situation from [before]
as the overall cpu-clock time increases at a much slower rate and when pinned to
one core rustc does not consume significantly more wall clock time than with one
codegen unit.
One caveat of this commit is that the symbol names for inlined functions that
are only translated once needed some slight tweaking. These inline functions
could be translated into multiple crates and we need to make sure the symbols
don't collideA so the crate name/disambiguator is mixed in to the symbol name
hash in these situations.
[data]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334880911
[before]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334583384
|
|
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
|
|
|
|
|
|
This commit shuffles around some CLI flags of the compiler to some more stable
locations with some renamings. The changes made were:
* The `-v` flag has been repurposes as the "verbose" flag. The version flag has
been renamed to `-V`.
* The `-h` screen has been split into two parts. Most top-level options (not
all) show with `-h`, and the remaining options (generally obscure) can be
shown with `--help -v` which is a "verbose help screen"
* The `-V` flag (version flag now) has lost its argument as it is now requested
with `rustc -vV` "verbose version".
* The `--emit` option has had its `ir` and `bc` variants renamed to `llvm-ir`
and `llvm-bc` to emphasize that they are LLVM's IR/bytecode.
* The `--emit` option has grown a new variant, `dep-info`, which subsumes the
`--dep-info` CLI argument. The `--dep-info` flag is now deprecated.
* The `--parse-only`, `--no-trans`, and `--no-analysis` flags have
moved behind the `-Z` family of flags.
* The `--debuginfo` and `--opt-level` flags were moved behind the top-level `-C`
flag.
* The `--print-file-name` and `--print-crate-name` flags were moved behind one
global `--print` flag which now accepts one of `crate-name`, `file-names`, or
`sysroot`. This global `--print` flag is intended to serve as a mechanism for
learning various metadata about the compiler itself.
No warnings are currently enabled to allow tools like Cargo to have time to
migrate to the new flags before spraying warnings to all users.
|
|
|