Introduce `-Zvirtual-function-elimination` codegen flag
Fixes #68262
This PR adds a codegen flag `-Zvirtual-function-elimination` to enable the VFE optimization in LLVM. To make this work, additional information has to be added to vtables ([`!vcall_visibility` metadata](https://llvm.org/docs/TypeMetadata.html#vcall-visibility-metadata) and a `typeid` of the trait). Furthermore, instead of just `load`ing functions, the [`llvm.type.checked.load` intrinsic](https://llvm.org/docs/LangRef.html#llvm-type-checked-load-intrinsic) has to be used to map functions to vtables.
For technical details of the changes, see the commit messages.
I also tested this flag on https://github.com/tock/tock on different boards to verify that this fixes the issue https://github.com/tock/tock/issues/2594. This flag reduces the size of the resulting binary by about 8-9 kB by removing the unused debug print functions.
[Rendered documentation update](https://github.com/flip1995/rust/blob/pk-vfe/src/doc/rustc/src/codegen-options/index.md#virtual-function-elimination)
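To illustrate the kind of code this targets, here is a made-up example (not from the PR or from tock). The build invocation is a sketch: `-Z` flags require a nightly toolchain, and VFE needs whole-program visibility, hence LTO.

```rust
// Hypothetical example: compiled with something like
// `RUSTFLAGS="-Clto -Zvirtual-function-elimination" cargo +nightly build --release`,
// a vtable slot that is never loaded anywhere in the program becomes a
// candidate for removal from the binary.
trait Console {
    fn write(&self, s: &str);
    fn debug_dump(&self); // never called through any vtable below
}

struct Uart;

impl Console for Uart {
    fn write(&self, s: &str) {
        println!("{s}");
    }
    fn debug_dump(&self) {
        println!("<uart state>");
    }
}

fn main() {
    let console: &dyn Console = &Uart;
    // Only the `write` slot is ever loaded, so under VFE the
    // `<Uart as Console>::debug_dump` code can be eliminated.
    console.write("hello");
}
```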
|
|
Add the intrinsic
    declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type)
This is used in the VFE optimization when lowering loading functions
from vtables to LLVM IR. The `metadata` is used to map the function to
all vtables this function could belong to. This ensures that functions
from vtables that might be used somewhere won't get removed.
|
|
And likewise for the `Const::val` method.
Because its type is called `ConstKind`. Also `val` is a confusing name
because `ConstKind` is an enum with seven variants, one of which is
called `Value`. Also, this gives consistency with `TyS` and `PredicateS`,
which have `kind` fields.
The commit also renames a few `Const` variables from `val` to `c`, to
avoid confusion with the `ConstKind::Value` variant.
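A toy sketch of the naming collision (deliberately simplified; these are not the actual rustc definitions):

```rust
// Toy versions of the types involved, showing why `val` was confusing:
// the field's type is `ConstKind`, and one of its variants is `Value`,
// so `c.val` was easy to misread as "the `Value` variant".
enum ConstKind {
    Value(u128),
    Param,
    // ...the real enum has seven variants in total
}

struct Const {
    kind: ConstKind, // previously named `val`
}

fn describe(c: &Const) -> &'static str {
    match c.kind {
        ConstKind::Value(_) => "a fully evaluated value",
        _ => "not (yet) a value",
    }
}

fn main() {
    let c = Const { kind: ConstKind::Value(7) };
    println!("{}", describe(&c));
}
```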
|
|
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
|
|
Pointer-to-address casts are often special-cased.
Introduce a dedicated cast kind to make them easily distinguishable.
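A sketch of what a dedicated kind buys, with invented names rather than the actual MIR enum: passes can match on the kind directly instead of re-deriving it from the operand and target types.

```rust
// Invented names, not rustc's MIR types: with a dedicated variant, a
// pass that cares about pointer-to-address casts matches on the kind
// itself rather than inspecting both sides' types again.
enum CastKind {
    PointerToAddress, // e.g. `p as usize` for `p: *const T`
    Misc,             // all other casts
}

fn needs_special_handling(kind: &CastKind) -> bool {
    matches!(kind, CastKind::PointerToAddress)
}

fn main() {
    let x = 0u8;
    let p: *const u8 = &x;
    let addr = p as usize; // the cast shape this kind describes
    assert_ne!(addr, 0);
    assert!(needs_special_handling(&CastKind::PointerToAddress));
}
```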
|
|
clone_on_copy
useless_format
bind_instead_of_map
filter_map_identity
useless_conversion
map_flatten
unnecessary_unwrap
|
|
rustc_codegen_ssa: clean up `AtomicOrdering`
* Remove the unused `NotAtomic` ordering.
* Rename `Monotonic` to `Relaxed`, the Rust-specific name.
* Derive `Copy` and `Clone`.
|
|
* Remove the unused `NotAtomic` ordering.
* Rename `Monotonic` to `Relaxed`, the Rust-specific name.
|
|
Like we have `add`/`sub`, which are the `usize` versions of `offset`, this adds the `usize` equivalent of `offset_from`. Like how `.add(d)` replaced a whole bunch of `.offset(d as isize)`, you can see from the changes here that it's fairly common that code actually knows the order between the pointers and *wants* a `usize`, not an `isize`.
As a bonus, this can do `sub nuw`+`udiv exact`, rather than `sub`+`sdiv exact`, which can be optimized slightly better because it doesn't have to worry about negatives. That's why the slice iterators weren't using `offset_from`, though I haven't updated that code in this PR because slices are so perf-critical that I'll do it as its own change.
This is an intrinsic, like `offset_from`, so that it can eventually be allowed in CTFE. It also allows checking the extra safety condition -- see the test confirming that CTFE catches it if you pass the pointers in the wrong order.
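A minimal usage sketch; the excerpt doesn't name the new method, so the `sub_ptr` name and `ptr_sub_ptr` feature gate below are assumptions:

```rust
// Assumed (placeholder) names: `ptr_sub_ptr` feature and a `sub_ptr`
// method, the unsigned counterpart of `offset_from`.
#![feature(ptr_sub_ptr)]

fn main() {
    let buf = [0u8; 8];
    let start = buf.as_ptr();
    // SAFETY: `end` stays within the same allocation as `start`.
    let end = unsafe { start.add(5) };
    // SAFETY: `end >= start`, which is exactly the extra precondition
    // this intrinsic checks (CTFE catches a wrong order).
    let len = unsafe { end.sub_ptr(start) };
    assert_eq!(len, 5); // already a `usize`, no `as usize` cast needed
}
```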
|
|
The reverse postorder, unlike preorder, is now cached inside the MIR
body. Code generation uses reverse postorder anyway, so it might be
a small perf improvement to use it here as well.
|
|
Reduce duplication of RPO calculation of mir
Computing the RPO of MIR is not cheap, but it is duplicated in many places, in particular in the `iterate_to_fixpoint` method, which is called multiple times when computing the data flow.
This PR reduces the number of times the RPO is recalculated as much as possible, which should save some compile time.
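A simplified sketch of the caching idea (not the rustc implementation): compute the reverse postorder lazily, once, and hand out a shared slice on later calls.

```rust
// Simplified sketch: the body caches its reverse postorder so repeated
// dataflow runs don't recompute the traversal.
use std::cell::OnceCell;

struct BasicBlock {
    successors: Vec<usize>,
}

struct Body {
    blocks: Vec<BasicBlock>,
    rpo_cache: OnceCell<Vec<usize>>,
}

impl Body {
    fn reverse_postorder(&self) -> &[usize] {
        self.rpo_cache.get_or_init(|| {
            let mut visited = vec![false; self.blocks.len()];
            let mut postorder = Vec::new();
            self.dfs(0, &mut visited, &mut postorder);
            postorder.reverse(); // postorder reversed = RPO
            postorder
        })
    }

    fn dfs(&self, bb: usize, visited: &mut Vec<bool>, out: &mut Vec<usize>) {
        if visited[bb] {
            return;
        }
        visited[bb] = true;
        for &succ in &self.blocks[bb].successors {
            self.dfs(succ, visited, out);
        }
        out.push(bb);
    }
}

fn main() {
    // CFG: 0 -> {1, 2}, 1 -> 2. RPO visits a block before its successors.
    let body = Body {
        blocks: vec![
            BasicBlock { successors: vec![1, 2] },
            BasicBlock { successors: vec![2] },
            BasicBlock { successors: vec![] },
        ],
        rpo_cache: OnceCell::new(),
    };
    assert_eq!(body.reverse_postorder(), &[0, 1, 2]);
    // The second call returns the cached traversal.
    assert_eq!(body.reverse_postorder(), &[0, 1, 2]);
}
```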
|
|
add `postorder_cache` to mir Body
add `ReversePostorderCache` struct
correct struct name and comments
|
|
Eliminate duplicated code for building panic lang item calls during codegen
Addresses the FIXME in the `codegen_panic_intrinsic` function.
|
|
level of struct nesting.
|
|
initialized scalars can special case them.
|
|
There are a few places where we have to construct it, though, and a few
places that are more invasive to change. To do this, we create a
constructor with a long, obvious name.
|
|
Fix ICE when using Box<T, A>, again
Sequel to #94043, fixes #94835.
|
|
This commit makes `AdtDef` use `Interned`. Much of the commit is tedious
changes to introduce getter functions. The interesting changes are in
`compiler/rustc_middle/src/ty/adt.rs`.
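For context, a much-simplified sketch of the interning pattern (nothing like rustc's arena-based machinery, but the same idea): one allocation per distinct value, so handles can compare by pointer.

```rust
// One allocation per distinct value; handles compare by pointer.
// `AdtDefData` here is a stand-in, not the real rustc struct.
use std::collections::HashSet;

#[derive(PartialEq, Eq, Hash)]
struct AdtDefData {
    name: String,
}

#[derive(Default)]
struct Interner {
    set: HashSet<&'static AdtDefData>,
}

impl Interner {
    /// Returns the unique `&'static` handle for `data`.
    fn intern(&mut self, data: AdtDefData) -> &'static AdtDefData {
        if let Some(&existing) = self.set.get(&data) {
            return existing;
        }
        // Leaking stands in for rustc's arena allocation.
        let fresh: &'static AdtDefData = Box::leak(Box::new(data));
        self.set.insert(fresh);
        fresh
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern(AdtDefData { name: "Foo".to_string() });
    let b = interner.intern(AdtDefData { name: "Foo".to_string() });
    // Interning makes equality a pointer comparison.
    assert!(std::ptr::eq(a, b));
}
```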
|
|
Revert "Auto merge of #92419 - erikdesjardins:coldland, r=nagisa"
Should fix (untested) #94390
Reopens #46515, #87055
r? `@ehuss`
|
|
This reverts commit 4f49627c6fe2a32d1fed6310466bb0e1c535c0c0, reversing
changes made to 028c6f1454787c068ff5117e9000a1de4fd98374.
|
|
Partially move cg_ssa towards using a single builder
Not all codegen backends can handle hopping between blocks well. For example, Cranelift requires blocks to be terminated before switching to building a new block. Rust-gpu requires a `RefCell` to allow hopping between blocks, and cg_gcc currently has a buggy implementation of it. This PR reduces the number of cases where cg_ssa switches between blocks before they are finished and mostly fixes the block hopping in cg_gcc. (~~only `scalar_to_backend` doesn't handle it correctly yet in cg_gcc~~ fixed that one.)
`@antoyo` please review the cg_gcc changes.
|
|
Move ty::print methods to Drop-based scope guards
The primary goal is reducing codegen of the TLS access for each closure, which shaves ~3 seconds of bootstrap time for rustc as a whole.
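The guard pattern in miniature, with a made-up thread-local instead of the actual ty::print state: the guard sets the TLS value on construction and restores the previous one in `Drop`, so no closure has to be threaded through.

```rust
// Made-up thread-local state (the real code guards ty::print's
// pretty-printing configuration): an RAII guard replaces
// `with_verbose(|| ...)`-style closures.
use std::cell::Cell;

thread_local! {
    static VERBOSE: Cell<bool> = Cell::new(false);
}

/// Enables verbose printing for the current thread until dropped.
struct VerboseGuard {
    prev: bool,
}

impl VerboseGuard {
    fn new() -> Self {
        // Set the new value and remember the old one.
        let prev = VERBOSE.with(|v| v.replace(true));
        VerboseGuard { prev }
    }
}

impl Drop for VerboseGuard {
    fn drop(&mut self) {
        // Restore the previous value, even during unwinding.
        let prev = self.prev;
        VERBOSE.with(|v| v.set(prev));
    }
}

fn main() {
    assert!(!VERBOSE.with(Cell::get));
    {
        let _verbose = VerboseGuard::new();
        assert!(VERBOSE.with(Cell::get));
    }
    assert!(!VERBOSE.with(Cell::get));
}
```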
|
|
Rollup of 14 pull requests
Successful merges:
- #93580 (Stabilize pin_static_ref.)
- #93639 (Release notes for 1.59)
- #93686 (core: Implement ASCII trim functions on byte slices)
- #94002 (rustdoc: Avoid duplicating macros in sidebar)
- #94019 (removing architecture requirements for RustyHermit)
- #94023 (adapt static-nobundle test to use llvm-nm)
- #94091 (Fix rustdoc const computed value)
- #94093 (Fix pretty printing of enums without variants)
- #94097 (Add module-level docs for `rustc_middle::query`)
- #94112 (Optimize char_try_from_u32)
- #94113 (document rustc_middle::mir::Field)
- #94122 (Fix miniz_oxide types showing up in std docs)
- #94142 (rustc_typeck: adopt let else in more places)
- #94146 (Adopt let else in more places)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
|
|
Adopt let else in more places
Continuation of #89933, #91018, #91481, #93046, #93590, #94011.
I have extended my clippy lint to also recognize tuple passing and match statements. The diff caused by fixing its findings is well over a thousand lines, so I split it up into multiple pull requests to make reviewing easier. This is the biggest of these PRs and handles the changes outside of rustdoc, rustc_typeck, rustc_const_eval, and rustc_trait_selection, which were handled in PRs #94139, #94142, #94143, #94144.
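For reference, the shape of rewrite the lint drives (an invented example, not a diff from the PR):

```rust
// Invented example of the transformation: a match that extracts one
// pattern and diverges otherwise becomes a `let else`.
fn parse_port(s: &str) -> Option<u16> {
    // Before:
    // let port = match s.parse::<u16>() {
    //     Ok(p) => p,
    //     Err(_) => return None,
    // };
    // After:
    let Ok(port) = s.parse::<u16>() else {
        return None;
    };
    Some(port)
}

fn main() {
    assert_eq!(parse_port("8080"), Some(8080));
    assert_eq!(parse_port("nope"), None);
}
```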
|
|
Guard against unwinding in cleanup code
Currently the only safeguard we have against double unwind is the panic count (which is local to Rust). When a double unwind does happen (e.g. C++ exception + Rust panic, or two C++ exceptions), the second unwind actually goes through and the first unwind is leaked. This can cause UB. cc rust-lang/project-ffi-unwind#6
E.g. given the following C++ code:
```c++
extern "C" void foo() {
    throw "A";
}

extern "C" void execute(void (*fn)()) {
    try {
        fn();
    } catch(...) {
    }
}
```
This program is well-defined to terminate:
```c++
struct dtor {
    ~dtor() noexcept(false) {
        foo();
    }
};

void a() {
    dtor a;
    dtor b;
}

int main() {
    execute(a);
    return 0;
}
```
But this Rust code doesn't catch the double unwind:
```rust
extern "C-unwind" {
    fn foo();
    fn execute(f: unsafe extern "C-unwind" fn());
}

struct Dtor;

impl Drop for Dtor {
    fn drop(&mut self) {
        unsafe { foo(); }
    }
}

extern "C-unwind" fn a() {
    let _a = Dtor;
    let _b = Dtor;
}

fn main() {
    unsafe { execute(a) };
}
```
To address this issue, this PR adds an unwind edge to an abort block, so that the Rust example aborts. This is similar to how clang guards against double unwind (except clang calls terminate per C++ spec and we abort).
The cost should be very small; it's an additional trap instruction (well, two for now, since we use TrapUnreachable, but that's a different issue) for each function with landing pads; if LLVM gains support to encode "abort/terminate" info directly in LSDA like GCC does, then it'll be free. It is an additional basic block, though, so compile time may be worse; I'd like a perf run.
r? `@ghost`
`@rustbot` label: F-c_unwind
|
|