| Age | Commit message (Collapse) | Author | Lines |
|
`#[collapse_debuginfo]`
`-Z debug-macros` is "stabilized" by enabling it by default and removing.
`-Z collapse-macro-debuginfo` is stabilized as `-C collapse-macro-debuginfo`.
It now supports all typical boolean values (`parse_opt_bool`) in addition to just yes/no.
Default value of `collapse_debuginfo` was changed from `false` to `external` (i.e. collapsed if external, not collapsed if local).
`#[collapse_debuginfo]` attribute without a value is no longer supported to avoid guessing the default.
|
|
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
Split out from #121577.
r? `@ghost`
|
|
Add simple async drop glue generation
This is a prototype of the async drop glue generation for some simple types. Async drop glue is intended to behave very similar to the regular drop glue except for being asynchronous. Currently it does not execute synchronous drops but only calls user implementations of `AsyncDrop::async_drop` associative function and awaits the returned future. It is not complete as it only recurses into arrays, slices, tuples, and structs and does not have same sensible restrictions as the old `Drop` trait implementation like having the same bounds as the type definition, while code assumes their existence (requires a future work).
This current design uses a workaround as it does not create any custom async destructor state machine types for ADTs, but instead uses types defined in the std library called future combinators (deferred_async_drop, chain, ready_unit).
Also I recommend reading my [explainer](https://zetanumbers.github.io/book/async-drop-design.html).
This is a part of the [MCP: Low level components for async drop](https://github.com/rust-lang/compiler-team/issues/727) work.
Feature completeness:
- [x] `AsyncDrop` trait
- [ ] `async_drop_in_place_raw`/async drop glue generation support for
- [x] Trivially destructible types (integers, bools, floats, string slices, pointers, references, etc.)
- [x] Arrays and slices (array pointer is unsized into slice pointer)
- [x] ADTs (enums, structs, unions)
- [x] tuple-like types (tuples, closures)
- [ ] Dynamic types (`dyn Trait`, see explainer's [proposed design](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#async-drop-glue-for-dyn-trait))
- [ ] coroutines (https://github.com/rust-lang/rust/pull/123948)
- [x] Async drop glue includes sync drop glue code
- [x] Cleanup branch generation for `async_drop_in_place_raw`
- [ ] Union rejects non-trivially async destructible fields
- [ ] `AsyncDrop` implementation requires same bounds as type definition
- [ ] Skip trivially destructible fields (optimization)
- [ ] New [`TyKind::AdtAsyncDestructor`](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#adt-async-destructor-types) and get rid of combinators
- [ ] [Synchronously undroppable types](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#exclusively-async-drop)
- [ ] Automatic async drop at the end of the scope in async context
|
|
|
|
|
|
|
|
do not add prolog for variadic naked functions
fixes #99858
|
|
fixes #99858
|
|
|
|
|
|
|
|
`OperandRef` does)
|
|
I added this back in 111999, but I no longer think it's a good idea
- It had to get scaled back to only power-of-two things to not break a bunch of targets
- LLVM seems to be getting better at memcpy removal anyway
- Introducing vector instructions has seemed to sometimes (115515) make autovectorization worse
So this removes it from the codegen crates entirely, and instead just tries to use <https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/builder/trait.BuilderMethods.html#method.typed_place_copy> instead of direct `memcpy` so things will still use load/store for immediates.
|
|
|
|
|
|
Rename `expose_addr` to `expose_provenance`
`expose_addr` is a bad name, an address is just a number and cannot be exposed. The operation is actually about the provenance of the pointer.
This PR thus changes the name of the method to `expose_provenance` without changing its return type. There is sufficient precedence for returning a useful value from an operation that does something else without the name indicating such, e.g. [`Option::insert`](https://doc.rust-lang.org/nightly/std/option/enum.Option.html#method.insert) and [`MaybeUninit::write`](https://doc.rust-lang.org/nightly/std/mem/union.MaybeUninit.html#method.write).
Returning the address is merely convenient, not a fundamental part of the operation. This is implied by the fact that integers do not have provenance since
```rust
let addr = ptr.addr();
ptr.expose_provenance();
let new = ptr::with_exposed_provenance(addr);
```
must behave exactly like
```rust
let addr = ptr.expose_provenance();
let new = ptr::with_exposed_provenance(addr);
```
as the result of `ptr.expose_provenance()` and `ptr.addr()` is the same integer. Therefore, this PR removes the `#[must_use]` annotation on the function and updates the documentation to reflect the important part.
~~An alternative name would be `expose_provenance`. I'm not at all opposed to that, but it makes a stronger implication than we might want that the provenance of the pointer returned by `ptr::with_exposed_provenance`[^1] is the same as that what was exposed, which is not yet specified as such IIUC. IMHO `expose` does not make that connection.~~
A previous version of this PR suggested `expose` as name, libs-api [decided on](https://github.com/rust-lang/rust/pull/122964#issuecomment-2033194319) `expose_provenance` to keep the symmetry with `with_exposed_provenance`.
CC `@RalfJung`
r? libs-api
[^1]: I'm using the new name for `from_exposed_addr` suggested by #122935 here.
|
|
Fix some unsoundness with PassMode::Cast ABI
Fixes #122617
Reviewable commit-by-commit. More info in each commit message.
|
|
|
|
rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
|
|
Rollup of 8 pull requests
Successful merges:
- #123198 (Add fn const BuildHasherDefault::new)
- #123226 (De-LLVM the unchecked shifts [MCP#693])
- #123302 (Make sure to insert `Sized` bound first into clauses list)
- #123348 (rustdoc: add a couple of regression tests)
- #123362 (Check that nested statics in thread locals are duplicated per thread.)
- #123368 (CFI: Support non-general coroutines)
- #123375 (rustdoc: synthetic auto trait impls: accept unresolved region vars for now)
- #123378 (Update sysinfo to 0.30.8)
Failed merges:
- #123349 (Fix capture analysis for by-move closure bodies)
r? `@ghost`
`@rustbot` modify labels: rollup
|
|
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with https://github.com/rust-lang/rust/commit/8811efa88b25b5e41d63850e6047e8257c677858#diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113 -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
|
|
This is just one part of the MCP, but it's the one that IMHO removes the most noise from the standard library code.
Seems net simpler this way, since MIR already supported heterogeneous shifts anyway, and thus it's not more work for backends than before.
|
|
Codegen const panic messages as function calls
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting shared library (see [perf]).
A sample improvement from nightly:
```
leaq str.0(%rip), %rdi
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdx
movl $25, %esi
callq *_ZN4core9panicking5panic17h17cabb89c5bcc999E@GOTPCREL(%rip)
```
to this PR:
```
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdi
callq *_RNvNtNtCsduqIKoij8JB_4core9panicking11panic_const23panic_const_div_by_zero@GOTPCREL(%rip)
```
[perf]: https://perf.rust-lang.org/compare.html?start=a7e4de13c1785819f4d61da41f6704ed69d5f203&end=64fbb4f0b2d621ff46d559d1e9f5ad89a8d7789b&stat=instructions:u
|
|
Unbox and unwrap the contents of `StatementKind::Coverage`
The payload of coverage statements was historically a structure with several fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace `Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid accidentally bloating it in the future.
``@rustbot`` label +A-code-coverage
|
|
|
|
CFI: Strip auto traits off Virtual calls
We already use `Instance` at declaration sites when available to glean additional information about possible abstractions of the type in use. This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it generates type information for indirect calls through `Virtual` instances.
This is needed for the "separate machinery" version of my approach to the vtable issues (#122573), because we need to respond differently to a `Virtual` call to the same type as a non-virtual call, specifically [stripping auto traits off the receiver's `Self`](https://github.com/rust-lang/rust/pull/122573/commits/54b15b0c36d4638167732a0542ed0e34ecf17d7c) because there isn't a separate vtable for `Foo` vs `Foo + Send`.
This would also make a more general underlying mechanism that could be used by rcvalle's [proposed drop detection / encoding](https://github.com/rust-lang/rust/pull/116404/commits/edcd1e20a1a69a8590d8ca47b31634854a40e3fb) if we end up using his approach, as we could condition out on the `def_id` in the CFI code rather than requiring the generating code to explicitly note whether it was calling drop.
|
|
We already use `Instance` at declaration sites when available to glean
additional information about possible abstractions of the type in use.
This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it
generates type information for indirect calls through `Virtual`
instances.
|
|
library
|
|
Let codegen decide when to `mem::swap` with immediates
Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.
Thus this PR introduces a new `typed_swap` intrinsic with a fallback body, and replaces that fallback implementation when swapping immediates or scalar pairs.
r? oli-obk
Replaces #111744, and means we'll never need more libs PRs like #111803 or #107140
|
|
|
|
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
|
|
Remove `TypeAndMut` from `ty::RawPtr` variant, make it take `Ty` and `Mutability`
Pretty much mechanically converting `ty::RawPtr(ty::TypeAndMut { ty, mutbl })` to `ty::RawPtr(ty, mutbl)` and its fallout.
r? lcnr
cc rust-lang/types-team#124
|
|
"Handle" calls to upstream monomorphizations in compiler_builtins
This is pretty cooked, but I think it works.
compiler-builtins has a long-standing problem that at link time, its rlib cannot contain any calls to `core`. And yet, in codegen we _love_ inserting calls to symbols in `core`, generally from various panic entrypoints.
I intend this PR to attack that problem as completely as possible. When we generate a function call, we now check if we are generating a function call from `compiler_builtins` and whether the callee is a function which was not lowered in the current crate, meaning we will have to link to it.
If those conditions are met, actually generating the call is asking for a linker error. So we don't. If the callee diverges, we lower to an abort with the same behavior as `core::intrinsics::abort`. If the callee does not diverge, we produce an error. This means that compiler-builtins can contain panics, but they'll SIGILL instead of panicking. I made non-diverging calls a compile error because I'm guessing that they'd mostly get into compiler-builtins by someone making a mistake while working on the crate, and compile errors are better than linker errors. We could turn such calls into aborts as well if that's preferred.
|
|
|
|
|
|
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting binary.
|
|
|
|
implementing it
|
|
Remove fixme about LLVM basic block naming
~This may be a small perf win.~
Originally, this PR implemented the fixme, but it didn't have any measurable perf improvement.
r? ``@ghost``
|
|
cases that used `None`
|
|
Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.
So introduce a new `typed_swap` intrinsic with a fallback body, but replace that implementation for immediates and scalar pairs.
|
|
Previously, we did this slightly incorrectly for return values, and
didn't do it at all for arguments.
|
|
|
|
Avoiding the naming didn't have any meaningful perf impact.
|
|
|
|
Copy byval argument to alloca if alignment is insufficient
Fixes #122211
"Ignore whitespace" recommended.
|
|
add test ensuring simd codegen checks don't run when a static assertion failed
stdarch relies on this to ensure that SIMD indices are in bounds.
I would love to know why this works, but I can't figure out where codegen decides to not codegen a function if a required-const does not evaluate. `@oli-obk` `@bjorn3` do you have any idea?
|
|
Avoid lowering code under dead SwitchInt targets
The objective of this PR is to detect and eliminate code which is guarded by an `if false`, even if that `false` is a constant which is not known until monomorphization, or is `intrinsics::debug_assertions()`.
The effect of this is that we generate no LLVM IR the standard library's unsafe preconditions, when they are compiled in a build where they should be immediately optimized out. This mono-time optimization ensures that builds which disable debug assertions do not grow a linkage requirement against `core`, which compiler-builtins currently needs: https://github.com/rust-lang/rust/issues/121552
This revives the codegen side of https://github.com/rust-lang/rust/pull/91222 as planned in https://github.com/rust-lang/rust/issues/120848.
|
|
Only generate a ptrtoint in AtomicPtr codegen when absolutely necessary
This special case was added in this PR: https://github.com/rust-lang/rust/pull/77611 in response to this error message:
```
Intrinsic has incorrect argument type!
void ({}*)* `@llvm.ppc.cfence.p0sl_s`
in function rust_oom
LLVM ERROR: Broken function found, compilation aborted!
[RUSTC-TIMING] std test:false 20.161
error: could not compile `std`
```
But when I tried searching for more information about that intrinsic I found this: https://github.com/llvm/llvm-project/issues/55983 which is a report of someone hitting this same error and a fix was landed in LLVM, 2 years after the above Rust PR.
|
|
|