| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
|
|
A lot of places had special handling just in case they would get an
allocator module even though most of these places could never get one or
would have a trivial implementation for the allocator module. Moving all
handling of the allocator module to a single place simplifies things a
fair bit.
|
|
Making it participate in LTO would be incorrect if you compile a crate
as both a dylib (which needs it) and rlib (which must not include it) in
the same rustc invocation. With linker plugin LTO, the allocator shim
will still participate in LTO as it is safe to do so in that case.
|
|
|
|
Couple of codegen_fn_attrs improvements
As noted in https://github.com/rust-lang/rust/pull/144678#discussion_r2245060329 here is no need to keep link_name and export_name separate, which the third commit fixes by merging them. The second commit removes some dead code and the first commit merges two ifs with equivalent conditions. The last commit is an unrelated change which removes an unused `feature(autodiff)`.
|
|
|
|
Couple of minor cleanups
|
|
|
|
|
|
|
|
Remove usage of Any, reduce visibility of fields and remove unused
backend arguments.
|
|
Various refactors to the LTO handling code
In particular reducing the sharing of code paths between fat and thin-LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
|
|
default data address space
|
|
As opposed to sending a message to the coordinator thread.
|
|
|
|
rustc_std_internal_symbol is meant to call functions from crates where
there is no direct dependency on said crate. As they either have to be
added to symbols.o or rustc has to introduce an implicit dependency on
them to avoid linker errors. The latter is done for some things like the
panic runtime, but adding these symbols to symbols.o allows removing
those implicit dependencies.
|
|
|
|
r=workingjubilee,saethlin
Move metadata object generation for dylibs to the linker code
This deduplicates some code between codegen backends and may in the future allow adding extra metadata that is only known at link time.
Prerequisite of https://github.com/rust-lang/rust/issues/96708.
|
|
|
|
This deduplicates some code between codegen backends and may in the
future allow adding extra metadata that is only known at link time.
|
|
And move passing it to the linker to the driver code.
|
|
|
|
make `rustc_attr_parsing` less dominant in the rustc crate graph
It has/had a glob re-export of `rustc_attr_data_structures`, which is a crate much lower in the graph, and a lot of crates were using it *just* (or *mostly*) for that re-export, while they can rely on `rustc_attr_data_structures` directly.
Previous graph:

Graph with this PR:

The first commit keeps the re-export, and just changes the dependency if possible. The second commit is the "breaking change" which removes the re-export, and "explicitly" adds the `rustc_attr_data_structures` dependency where needed. It also switches over some src/tools/*.
The second commit is actually a lot more involved than I expected. Please let me know if it's a better idea to back it out and just keep the first commit.
|
|
|
|
Unfortunately, multiple people are reporting linker warnings related to
`__rust_no_alloc_shim_is_unstable` after this change. The solution isn't
quite clear yet, let's revert to green for now, and try a reland with a
determined solution for `__rust_no_alloc_shim_is_unstable`.
This reverts commit c8b7f32434c0306db5c1b974ee43443746098a92, reversing
changes made to 667247db71ea18c4130dd018d060e7f09d589490.
|
|
|
|
|
|
|
|
|
|
Prepend temp files with per-invocation random string to avoid temp filename conflicts
https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management.
Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name:
```
#[inline(never)]
#[cfg(any(rpass1, rpass3))]
fn a() -> i32 {
0
}
#[cfg(any(cfail2))]
fn a() -> i32 {
1
}
fn main() {
evil::evil();
assert_eq!(a(), 0);
}
mod evil {
#[cfg(any(rpass1, rpass3))]
pub fn evil() {
unsafe {
std::arch::asm!("/* */");
}
}
#[cfg(any(cfail2))]
pub fn evil() {
unsafe {
std::arch::asm!("missing");
}
}
}
```
Session 1 (`rpass1`):
* Type-check, borrow-check, etc.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd.
* Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`.
* Save-temps option means we don't delete `XXX.rgcu.o`.
* Link the binary and stuff.
* Finalize[^2] the working incremental session by renaming `.../s-...-working` to ` s-...-asjkdhsjakd` (some other finalized incr comp session dir name).
Session 2 (`cfail2`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph.
* Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o`. **HERE IS THE PROBLEM**: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory.
* Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the *previous* session is the last finalized session.
Session 3 (`rpass3`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph. NOTE that this is from session 1.
* All the dep graph nodes are green since we are basically replaying session 1.
* codegen object file `XXX.o`, which is detected as *reused* from session 1 since dep nodes were green. That means we **reuse** `XXX.o` which had been dirtied from session 2.
* Link the binary and stuff.
This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1.
At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst **unsound**.
This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example).
---
This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not *deterministic*, but temporary files are transient anyways, so I don't believe this is a problem.
That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc.
Fixes https://github.com/rust-lang/rust/issues/139407
[^1]: https://github.com/rust-lang/rust/blob/175dcc7773d65c1b1542c351392080f48c05799f/compiler/rustc_fs_util/src/lib.rs#L60
[^2]: https://github.com/rust-lang/rust/blob/175dcc7773d65c1b1542c351392080f48c05799f/compiler/rustc_incremental/src/persist/fs.rs#L1-L40
|
|
|
|
|
|
|
|
Only use the new node hashmap for anonymous nodes
This is a rebase of https://github.com/rust-lang/rust/pull/112469.
cc `@cjgillot`
|
|
Avoid no-op unlink+link dances in incr comp
Incremental compilation scales quite poorly with the number of CGUs. This PR improves one reason for that.
The incr comp process hard-links all the files from an old session into a new one, then it runs the backend, which may just hard-link the new session files into the output directory. Then codegen hard-links all the output files back to the new session directory.
This PR (perhaps unimaginatively) fixes the silliness that ensues in the last step. The old `link_or_copy` implementation would be passed pairs of paths which are already the same inode, then it would blindly delete the destination and re-create the hard-link that it just deleted. This PR lets us skip both those operations. We don't skip the other two hard-links.
`cargo +stage1 b && touch crates/core/main.rs && strace -cfw -elink,linkat,unlink,unlinkat cargo +stage1 b` before and then after on `ripgrep-13.0.0`:
```
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
52.56 0.024950 25 978 485 unlink
34.38 0.016318 22 727 linkat
13.06 0.006200 24 249 unlinkat
------ ----------- ----------- --------- --------- ----------------
100.00 0.047467 24 1954 485 total
```
```
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
42.83 0.014521 57 252 unlink
38.41 0.013021 26 486 linkat
18.77 0.006362 25 249 unlinkat
------ ----------- ----------- --------- --------- ----------------
100.00 0.033904 34 987 total
```
This reduces the number of hard-links that are causing perf troubles, noted in https://github.com/rust-lang/rust/issues/64291 and https://github.com/rust-lang/rust/issues/137560
|
|
|
|
|
|
Continuing the work from #137350.
Removes the unused methods: `expect_variant`, `expect_field`,
`expect_foreign_item`.
Every method gains a `hir_` prefix.
|
|
The embedded bitcode should always be prepared for LTO/ThinLTO
Fixes #115344. Fixes #117220.
There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.
When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module.
This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.
r? nikic
|
|
|
|
note: compiler compiles but librustdoc and clippy don't
|
|
|
|
Pass through of target features to llvm-bitcode-linker and handling them
When using the llvm-bitcode-linker (`linker-flavor=llbc`) target-features are not passed through and are not handled by it.
The llvm-bitcode-linker is mainly used as a self contained linker to link llvm bitcode for the nvptx64 target. It uses `llvm-link`, `opt` and `llc` internally. To produce a `.ptx` file of a specific ptx-version it is necessary to pass the version to llc with the `--mattr` option. Without explicitly setting it, the emitted `.ptx`-version is the minimum supported version of the `--target-cpu`.
I would like to be able to explicitly set the ptx version as [some llvm problems only occur in earlier `.ptx`-versions](https://github.com/llvm/llvm-project/issues/112998).
Therefore this pull request adds support for passing target features to llvm-bitcode-linker and handling them.
I was not quite sure if adding these features to `rustc_target/src/target_features.rs` is necessary or not. If so I will gladly add these.
r? ``@kjetilkjeka``
|
|
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
|
|
Bitcode linkers like llvm-bitcode-linker or bpf linker hand over the target features to llvm during link stage. During link stage the `TyCtxt` is already gone so it is not possible to create a query for the global backend features any longer. The features preserved in `Session.target_features` only incorporate target features known to rustc. This would contradict with the behaviour during codegen stage which also passes target features to llvm which are unknown to rustc.
This commit adds target features as a field to the `CrateInfo` struct and queries the target features in its new function. This way the target features are preserved beyond tcx and available at link stage.
To make sure the `global_backend_features` query is always registered even if the CodegenBackend does not register it, this registration is added to the `provide`function of the `rustc_codegen_ssa` crate.
|
|
|
|
|
|
|