| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
Check `x86_64` size assertions on `aarch64`, too
(Context: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Checking.20size.20assertions.20on.20aarch64.3F)
Currently the compiler has around 30 sets of `static_assert_size!` for various size-critical data structures (e.g. various IR nodes), guarded by `#[cfg(all(target_arch = "x86_64", target_pointer_width = "64"))]`.
(Presumably this cfg avoids having to maintain separate size values for 32-bit targets and unusual 64-bit targets. Apparently it may have been necessary before the i128/u128 alignment changes, too.)
This is slightly incovenient for people on aarch64 workstations (e.g. Macs), because the assertions normally aren't checked until we push to a PR. So this PR adds `aarch64` to the `#[cfg(..)]` guarding all of those assertions in the compiler.
---
Implemented with a simple find/replace. Verified by manually inspecting each `static_assert_size!` in `compiler/`, and checking that either the replacement succeeded, or adding aarch64 wouldn't have been appropriate.
|
|
|
|
This makes it easier for contributors on aarch64 workstations (e.g. Macs) to
notice when these assertions have been violated.
|
|
rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
|
|
Rollup of 8 pull requests
Successful merges:
- #123198 (Add fn const BuildHasherDefault::new)
- #123226 (De-LLVM the unchecked shifts [MCP#693])
- #123302 (Make sure to insert `Sized` bound first into clauses list)
- #123348 (rustdoc: add a couple of regression tests)
- #123362 (Check that nested statics in thread locals are duplicated per thread.)
- #123368 (CFI: Support non-general coroutines)
- #123375 (rustdoc: synthetic auto trait impls: accept unresolved region vars for now)
- #123378 (Update sysinfo to 0.30.8)
Failed merges:
- #123349 (Fix capture analysis for by-move closure bodies)
r? `@ghost`
`@rustbot` modify labels: rollup
|
|
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with https://github.com/rust-lang/rust/commit/8811efa88b25b5e41d63850e6047e8257c677858#diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113 -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
|
|
|
|
|
|
|
|
|
|
refactor check_{lang,library}_ub: use a single intrinsic
This enacts the plan I laid out [here](https://github.com/rust-lang/rust/pull/122282#issuecomment-1996917998): use a single intrinsic, called `ub_checks` (in aniticpation of https://github.com/rust-lang/compiler-team/issues/725), that just exposes the value of `debug_assertions` (consistently implemented in both codegen and the interpreter). Put the language vs library UB logic into the library.
This makes it easier to do something like https://github.com/rust-lang/rust/pull/122282 in the future: that just slightly alters the semantics of `ub_checks` (making it more approximating when crates built with different flags are mixed), but it no longer affects whether these checks can happen in Miri or compile-time.
The first commit just moves things around; I don't think these macros and functions belong into `intrinsics.rs` as they are not intrinsics.
r? `@saethlin`
|
|
library
|
|
Rollup of 11 pull requests
Successful merges:
- #120577 (Stabilize slice_split_at_unchecked)
- #122698 (Cancel `cargo update` job if there's no updates)
- #122780 (Rename `hir::Local` into `hir::LetStmt`)
- #122915 (Delay a bug if no RPITITs were found)
- #122916 (docs(sync): normalize dot in fn summaries)
- #122921 (Enable more mir-opt tests in debug builds)
- #122922 (-Zprint-type-sizes: print the types of awaitees and unnamed coroutine locals.)
- #122927 (Change an ICE regression test to use the original reproducer)
- #122930 (add panic location to 'panicked while processing panic')
- #122931 (Fix some typos in the pin.rs)
- #122933 (tag_for_variant follow-ups)
r? `@ghost`
`@rustbot` modify labels: rollup
|
|
Let codegen decide when to `mem::swap` with immediates
Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.
Thus this PR introduces a new `typed_swap` intrinsic with a fallback body, and replaces that fallback implementation when swapping immediates or scalar pairs.
r? oli-obk
Replaces #111744, and means we'll never need more libs PRs like #111803 or #107140
|
|
|
|
|
|
Rollup of 8 pull requests
Successful merges:
- #114009 (compiler: allow transmute of ZST arrays with generics)
- #122195 (Note that the caller chooses a type for type param)
- #122651 (Suggest `_` for missing generic arguments in turbofish)
- #122784 (Add `tag_for_variant` query)
- #122839 (Split out `PredicatePolarity` from `ImplPolarity`)
- #122873 (Merge my contributor emails into one using mailmap)
- #122885 (Adjust better spastorino membership to triagebot's adhoc_groups)
- #122888 (add a couple more tests)
r? `@ghost`
`@rustbot` modify labels: rollup
|
|
Add `tag_for_variant` query
This query allows for sharing code between `rustc_const_eval` and `rustc_transmutability`. It's a precursor to a PR I'm working on to entirely replace the bespoke layout computations in `rustc_transmutability`.
r? `@compiler-errors`
|
|
This query allows for sharing code between `rustc_const_eval` and
`rustc_transmutability`.
Also moves `DummyMachine` to `rustc_const_eval`.
|
|
|
|
|
|
interpret/allocation: fix aliasing issue in interpreter and refactor getters a bit
That new raw getter will be needed to let Miri pass pointers to natively executed FFI code ("extern-so" mode).
While doing that I realized our get_bytes_mut are named less scary than get_bytes_unchecked so I rectified that. Also I realized `mem_copy_repeatedly` would break if we called it for multiple overlapping copies so I made sure this does not happen.
And I realized that we are actually [violating Stacked Borrows in the interpreter](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/I.20think.20Miri.20violates.20Stacked.20Borrows.20.F0.9F.99.88).^^ That was introduced in https://github.com/rust-lang/rust/pull/87777.
r? ```@oli-obk```
|
|
various clippy fixes
We need to keep the order of the given clippy lint rules before passing them.
Since clap doesn't offer any useful interface for this purpose out of the box,
we have to handle it manually.
Additionally, this PR makes `-D` rules work as expected. Previously, lint rules were limited to `-W`. By enabling `-D`, clippy began to complain numerous lines in the tree, all of which have been resolved in this PR as well.
Fixes #121481
cc `@matthiaskrgr`
|
|
Signed-off-by: onur-ozkan <work@onurozkan.dev>
|
|
|
|
clean up `Sized` checking
This PR cleans up `sized_constraint` and related functions to make them simpler and faster. This should not make more or less code compile, but it can change error output in some rare cases.
## enums and unions are `Sized`, even if they are not WF
The previous code has some special handling for enums, which made them sized if and only if the last field of each variant is sized. For example given this definition (which is not WF)
```rust
enum E<T1: ?Sized, T2: ?Sized, U1: ?Sized, U2: ?Sized> {
A(T1, T2),
B(U1, U2),
}
```
the enum was sized if and only if `T2` and `U2` are sized, while `T1` and `T2` were ignored for `Sized` checking. After this PR this enum will always be sized.
Unsized enums are not a thing in Rust and removing this special case allows us to return an `Option<Ty>` from `sized_constraint`, rather than a `List<Ty>`.
Similarly, the old code made an union defined like this
```rust
union Union<T: ?Sized, U: ?Sized> {
head: T,
tail: U,
}
```
sized if and only if `U` is sized, completely ignoring `T`. This just makes no sense at all and now this union is always sized.
## apply the "perf hack" to all (non-error) types, instead of just type parameters
This "perf hack" skips evaluating `sized_constraint(adt): Sized` if `sized_constraint(adt): Sized` exactly matches a predicate defined on `adt`, for example:
```rust
// `Foo<T>: Sized` iff `T: Sized`, but we know `T: Sized` from a predicate of `Foo`
struct Foo<T /*: Sized */>(T);
```
Previously this was only applied to type parameters and now it is applied to every type. This means that for example this type is now always sized:
```rust
// Note that this definition is WF, but the type `S<T>` not WF in the global/empty ParamEnv
struct S<T>([T]) where [T]: Sized;
```
I don't anticipate this to affect compile time of any real-world program, but it makes the code a bit nicer and it also makes error messages a bit more consistent if someone does write such a cursed type.
## tuples are sized if the last type is sized
The old solver already has this behavior and this PR also implements it for the new solver and `is_trivially_sized`. This makes it so that tuples work more like a struct defined like this:
```rust
struct TupleN<T1, T2, /* ... */ Tn: ?Sized>(T1, T2, /* ... */ Tn);
```
This might improve the compile time of programs with large tuples a little, but is mostly also a consistency fix.
## `is_trivially_sized` for more types
This function is used post-typeck code (borrowck, const eval, codegen) to skip evaluating `T: Sized` in some cases. It will now return `true` in more cases, most notably `UnsafeCell<T>` and `ManuallyDrop<T>` where `T.is_trivially_sized`.
I'm anticipating that this change will improve compile time for some real world programs.
|
|
cases that used `None`
|
|
Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.
So introduce a new `typed_swap` intrinsic with a fallback body, but replace that implementation for immediates and scalar pairs.
|
|
|
|
a bit
- rename mutating functions to be more scary
- add a new raw bytes getter
|
|
|
|
|
|
know the `Machine` type
|
|
|
|
|
|
interpret: ensure that Place is never used for a different frame
We store the address where the stack frame stores its `locals`. The idea is that even if we pop and push, or switch to a different thread with a larger number of frames, then the `locals` address will most likely change so we'll notice that problem. This is made possible by some recent changes by `@WaffleLapkin,` where we no longer use `Place` across things that change the number of stack frames.
I made these debug assertions for now, just to make sure this can't cost us any perf.
The first commit is unrelated but it's a one-line comment change so it didn't warrant a separate PR...
r? `@oli-obk`
|
|
|
|
to address issue 121610.
|
|
miri: add some chance to reuse addresses of previously freed allocations
The hope is that this can help us find ABA issues.
Unfortunately this needs rustc changes so I can't easily run the regular benchmark suite. I used `src/tools/miri/tests/pass/float_nan.rs` as a substitute:
```
Before:
Benchmark 1: ./x.py run miri --stage 0 --args src/tools/miri/tests/pass/float_nan.rs --args --edition=2021
Time (mean ± σ): 9.570 s ± 0.013 s [User: 9.279 s, System: 0.290 s]
Range (min … max): 9.561 s … 9.579 s 2 runs
After:
Benchmark 1: ./x.py run miri --stage 0 --args src/tools/miri/tests/pass/float_nan.rs --args --edition=2021
Time (mean ± σ): 9.698 s ± 0.046 s [User: 9.413 s, System: 0.279 s]
Range (min … max): 9.666 s … 9.731 s 2 runs
```
That's a ~1.3% slowdown, which seems fine to me. I have seen a lot of noise in this style of benchmarking so I don't quite trust this anyway; we can make further experiments in the Miri repo after this migrated there.
r? `@oli-obk`
|
|
|
|
|
|
|
|
Will be used in the next commit
|
|
|
|
interpret: do not call machine read hooks during validation
Fixes https://github.com/rust-lang/miri/issues/3347
r? ``@oli-obk``
|
|
|
|
Distinguish between library and lang UB in assert_unsafe_precondition
As described in https://github.com/rust-lang/rust/pull/121583#issuecomment-1963168186, `assert_unsafe_precondition` now explicitly distinguishes between language UB (conditions we explicitly optimize on) and library UB (things we document you shouldn't do, and maybe some library internals assume you don't do).
`debug_assert_nounwind` was originally added to avoid the "only at runtime" aspect of `assert_unsafe_precondition`. Since then the difference between the macros has gotten muddied. This totally revamps the situation.
Now _all_ preconditions shall be checked with `assert_unsafe_precondition`. If you have a precondition that's only checkable at runtime, do a `const_eval_select` hack, as done in this PR.
r? RalfJung
|
|
|