| Age | Commit message (Collapse) | Author | Lines |
|
subtree-update_cg_gcc_2025-05-14
|
|
|
|
|
|
simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors
It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first...
Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`.
I have extremely low confidence in the GCC part of this PR.
See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)
|
|
|
|
subtree-update_cg_gcc_2025-04-18
|
|
|
|
remove `simd_fpow` and `simd_fpowi`
Discussed in https://github.com/rust-lang/rust/issues/137555
These functions are not exposed from `std::intrinsics::simd`, and not used anywhere outside of the compiler. They also don't lower to particularly good code at least on the major ISAs (I checked x86_64, aarch64, s390x, powerpc), where the vector is just spilled to the stack and scalar functions are used for the actual logic.
r? `@RalfJung`
|
|
|
|
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic
LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that.
Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35`
try-job: test-various
|
|
field in `LayoutData`.
Also update comments that refered to BackendRepr::Uninhabited.
|
|
things which are already immediates
That means it stops trying to truncate things that are already `i1`s.
|
|
|
|
|
|
intrinsic
|
|
Combined with [1], this will change the overflowing multiplication
operations to return an `extern "C"`-safe type.
Link: https://github.com/rust-lang/compiler-builtins/pull/735 [1]
|
|
|
|
subtree-update_cg_gcc_2025_01_12
|
|
Signed-off-by: acceptacross <csqcqs@gmail.com>
|
|
|
|
the behavior of the type system not only depends on the current
assumptions, but also the currentnphase of the compiler. This is
mostly necessary as we need to decide whether and how to reveal
opaque types. We track this via the `TypingMode`.
|
|
|
|
|
|
|
|
Add intrinsics `fmuladd{f16,f32,f64,f128}`. This computes `(a * b) +
c`, to be fused if the code generator determines that (i) the target
instruction set has support for a fused operation, and (ii) that the
fused operation is more efficient than the equivalent, separate pair
of `mul` and `add` instructions.
https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
MIRI support is included for f32 and f64.
The codegen_cranelift uses the `fma` function from libc, which is a
correct implementation, but without the desired performance semantic. I
think this requires an update to cranelift to expose a suitable
instruction in its IR.
I have not tested with codegen_gcc, but it should behave the same
way (using `fma` from libc).
|
|
|
|
|
|
|
|
|
|
Supertraits of `BuilderMethods` are all called `XyzBuilderMethods`.
Supertraits of `CodegenMethods` are all called `XyzMethods`. This commit
changes the latter to `XyzCodegenMethods`, for consistency.
|
|
They both are part of `BuilderMethods`, and so should have `Builder` in
their name like all the other traits in `BuilderMethods`.
|
|
|
|
array)
|
|
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
|
|
|
|
|
|
|
|
|
|
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
Split out from #121577.
r? `@ghost`
|
|
|
|
|
|
`OperandRef` does)
|
|
We already use `Instance` at declaration sites when available to glean
additional information about possible abstractions of the type in use.
This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it
generates type information for indirect calls through `Virtual`
instances.
|
|
|
|
subtree-update_cg_gcc_2024-03-05
|
|
|
|
|
|
fast-math flags
|
|
|
|
|