| Age | Commit message (Collapse) | Author | Lines |
|
|
|
See conversation in <https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/Is.20transmuting.20a.20.60T.60.20to.20.60Tx1.60.20.28one-element.20SIMD.20vector.29.20UB.3F/near/526262799>.
|
|
FCP completed in tracking issue #133962.
|
|
|
|
|
|
|
|
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics.
These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19.
I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something.
r? `@scottmcm`
|
|
Today we're making LLVM do a bunch of extra work for every enum you match on, even trivial stuff like `Option<bool>`. Let's not.
|
|
Don't re-`assume` in `transmute`s that don't change niches
I noticed in nightly 2025-02-21 that `transmute` is emitting way more `assume`s than necessary for newtypes.
For example, the three transmutes in <https://rust.godbolt.org/z/fW1KaTc4o> emits
```rust
define noundef range(i32 1, 0) i32 `@repeatedly_transparent_transmute(i32` noundef range(i32 1, 0) %_1) unnamed_addr {
start:
%0 = sub i32 %_1, 1
%1 = icmp ule i32 %0, -2
call void `@llvm.assume(i1` %1)
%2 = sub i32 %_1, 1
%3 = icmp ule i32 %2, -2
call void `@llvm.assume(i1` %3)
%4 = sub i32 %_1, 1
%5 = icmp ule i32 %4, -2
call void `@llvm.assume(i1` %5)
%6 = sub i32 %_1, 1
%7 = icmp ule i32 %6, -2
call void `@llvm.assume(i1` %7)
%8 = sub i32 %_1, 1
%9 = icmp ule i32 %8, -2
call void `@llvm.assume(i1` %9)
%10 = sub i32 %_1, 1
%11 = icmp ule i32 %10, -2
call void `@llvm.assume(i1` %11)
ret i32 %_1
}
```
But those are all just newtypes that don't change size or niches, so none of it's needed.
After this PR it's down to just
```rust
define noundef range(i32 1, 0) i32 `@repeatedly_transparent_transmute(i32` noundef range(i32 1, 0) %_1) unnamed_addr {
start:
ret i32 %_1
}
```
because none of those `assume`s in the original actually did anything.
(Transmuting to something with a difference niche, though, still has the assumes -- the other tests continue to pass checking that.)
|
|
Rather than needing to use `switch` for them to include the `unreachable` arm
|
|
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding
LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.
|
|
minicore makes it much easier to add new language items to all of the
existing `no_core` tests.
|
|
|
|
Emit getelementptr inbounds nuw for pointer::add()
Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative.
Fixes https://github.com/rust-lang/rust/issues/137217.
|
|
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic
LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that.
Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35`
try-job: test-various
|
|
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
|
|
|
|
x86: use SSE2 to pass float and SIMD types
This builds on the new X86Sse2 ABI landed in https://github.com/rust-lang/rust/pull/137037 to actually make it a separate ABI from the default x86 ABI, and use SSE2 registers. Specifically, we use it in two ways: to return `f64` values in a register rather than by-ptr, and to pass vectors of size up to 128bit in a register (or, well, whatever LLVM does when passing `<4 x float>` by-val, I don't actually know if this ends up in a register).
Cc `@workingjubilee`
Fixes #133611
try-job: aarch64-apple
try-job: aarch64-gnu
try-job: aarch64-gnu-debug
try-job: test-various
try-job: x86_64-gnu-nopt
try-job: dist-i586-gnu-i586-i686-musl
try-job: x86_64-msvc-1
|
|
|
|
improve cold_path()
#120370 added a new instrinsic `cold_path()` and used it to fix `likely` and `unlikely`
However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits usefulness of `cold_path` for idiomatic rust. For example, code like this:
```
if let Some(x) = y { ... }
```
may generate 3-target switch:
```
switch y.discriminator:
0 => true branch
1 = > false branch
_ => unreachable
```
and therefore marking a branch as cold will have no effect.
This PR improves `cold_path()` to work with arbitrary switch instructions.
Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. The Clang's weight calculation is more complex that this PR, which I believe is mainly because `switch` in `C/C++` can have multiple cases going to the same target.
|
|
|
|
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
|
|
|
|
intrinsic
|
|
|
|
|
|
|
|
For ZSTs there is no selection that needs to take place, so assert that
no `select` statement is emitted.
|
|
[1] mentions that having a single test with `-Zmerge-functions=disabled`
is preferable to having two separate tests. Apply that to the new
`select_unpredicatble` test here.
[1]: https://github.com/rust-lang/rust/pull/133964#issuecomment-2569693325
|
|
LLVM 20 adds nuw to GEP operations in this code, tolerate them.
|
|
stabilize const_swap
libs-api FCP passed in https://github.com/rust-lang/rust/issues/83163.
However, I only just realized that this actually involves an intrinsic. The intrinsic could be implemented entirely with existing stable const functionality, but we choose to make it a primitive to be able to detect more UB. So nominating for `@rust-lang/lang` to make sure they are aware; I leave it up to them whether they want to FCP this.
While at it I also renamed the intrinsic to make the "nonoverlapping" constraint more clear.
Fixes #83163
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Add `select_unpredictable` to force LLVM to use CMOV
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs into branches if it comes from a `select` marked with an `unpredictable` metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits such a `select` and uses it in the implementation of `binary_search_by`.
|
|
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
|
|
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs
into branches if it comes from a `select` marked with an `unpredictable`
metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits
such a `select` and uses it in the implementation of `binary_search_by`.
|
|
Except for `simd-intrinsic/`, which has a lot of files containing
multiple types like `u8x64` which really are better when hand-formatted.
There is a surprising amount of two-space indenting in this directory.
Non-trivial changes:
- `rustfmt::skip` needed in `debug-column.rs` to preserve meaning of the
test.
- `rustfmt::skip` used in a few places where hand-formatting read more
nicely: `enum/enum-match.rs`
- Line number adjustments needed for the expected output of
`debug-column.rs` and `coroutine-debug.rs`.
|
|
|
|
|
|
|
|
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
Split out from #121577.
r? `@ghost`
|
|
Dellvmize some intrinsics (use `u32` instead of `Self` in some integer intrinsics)
This implements https://github.com/rust-lang/compiler-team/issues/693 minus what was implemented in #123226.
Note: I decided to _not_ change `shl`/... builder methods, as it just doesn't seem worth it.
r? ``@scottmcm``
|
|
|
|
|