path: root/tests/codegen
Age  Commit message  Author  Lines
2023-03-14  ICE when checking LocalInfo on runtime MIR.  (Camille GILLOT, -4/+4)
2023-03-13  Rollup merge of #109081 - krasimirgg:llvm-17-simd-wide-sum, r=nikic  (Matthias Krüger, -1/+1)
simd-wide-sum test: adapt for LLVM 17 codegen change

After https://github.com/llvm/llvm-project/commit/0d4a709bb876824a0afa5f86e138e8ffdcaf7661 LLVM becomes more clever and turns `@wider_reduce_loop` into an alias: https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/17806#0186da6b-582c-46bf-a227-1565fa0859ac/743-766

This adapts the test to prevent this.
2023-03-13  simd-wide-sum test: adapt for LLVM 17 codegen change  (Krasimir Georgiev, -1/+1)
After https://github.com/llvm/llvm-project/commit/0d4a709bb876824a0afa5f86e138e8ffdcaf7661 LLVM becomes more clever and turns `@wider_reduce_loop` into an alias: https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/17806#0186da6b-582c-46bf-a227-1565fa0859ac/743-766

This adapts the test to prevent this.
2023-03-13  Auto merge of #108623 - scottmcm:try-different-as-slice-impl, r=the8472  (bors, -5/+13)
Move `Option::as_slice` to an always-sound implementation

This approach depends on CSE to not have any branches or selects when the guessed offset is correct -- which it always will be right now -- but to also be *sound* (just less efficient) if the layout algorithms change such that the guess is incorrect.

The codegen test confirms that CSE handles this as expected, leaving the optimal codegen.

cc JakobDegen #108545
2023-03-11  Move `Option::as_slice` to an always-sound implementation  (Scott McMurray, -5/+13)
This approach depends on CSE to not have any branches or selects when the guessed offset is correct -- which it always will be right now -- but to also be *sound* (just less efficient) if the layout algorithms change such that the guess is incorrect.
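A minimal sketch of the "sound even if the guess is wrong" shape, not the actual core source (the real code derives its guessed offset differently; here the guess is simply offset 0). Both match arms produce the same pointer whenever the payload really sits at offset 0, so CSE folds the select away and the function is branch-free; if the layout ever changes, the code stays sound, just with a select.

```rust
use core::slice;

pub fn as_slice_sketch<T>(opt: &Option<T>) -> &[T] {
    let data: *const T = match opt {
        Some(payload) => payload, // the real payload address
        // For the empty case any non-null, well-aligned pointer will do.
        None => (opt as *const Option<T>).cast(),
    };
    let len = opt.is_some() as usize; // 0 or 1
    // SAFETY: for `Some` the pointer addresses one valid `T`; for `None`
    // the slice is empty and the pointer is non-null and aligned.
    unsafe { slice::from_raw_parts(data, len) }
}
```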
2023-03-07  Auto merge of #108763 - scottmcm:indexing-nuw-lengths, r=cuviper  (bors, -0/+35)
Use `nuw` when calculating slice lengths from `Range`s

An `assume` would definitely not be worth it, but since the flag is almost free we might as well tell LLVM this, especially on `_unchecked` calls where there's no obvious way for it to deduce it. (Today neither safe nor unsafe indexing gets it: <https://rust.godbolt.org/z/G1jYT548s>)
2023-03-05  Use `nuw` when calculating slice lengths from `Range`s  (Scott McMurray, -0/+35)
An `assume` would definitely not be worth it, but since the flag is almost free we might as well tell LLVM this, especially on `_unchecked` calls where there's no obvious way for it to deduce it. (Today neither safe nor unsafe indexing gets it: <https://rust.godbolt.org/z/G1jYT548s>)
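Illustrative of the pattern affected rather than the compiler change itself: the length of a `Range`-indexed slice is `end - start`, and a `Range` that validly indexes a slice always has `start <= end`, which is the fact the `nuw` (no unsigned wrap) flag communicates to LLVM.

```rust
use std::ops::Range;

pub fn checked(xs: &[u8], r: Range<usize>) -> &[u8] {
    &xs[r] // bounds-checked; panics if the range is out of bounds
}

/// # Safety
/// `r` must be in bounds for `xs`. This is the `_unchecked` case from the
/// commit message: with no bounds check, LLVM has no obvious way to deduce
/// that the subtraction cannot wrap.
pub unsafe fn unchecked(xs: &[u8], r: Range<usize>) -> &[u8] {
    xs.get_unchecked(r)
}
```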
2023-03-05  Auto merge of #108157 - scottmcm:tuple-gt-via-partialcmp, r=dtolnay  (bors, -0/+121)
Use `partial_cmp` to implement tuple `lt`/`le`/`ge`/`gt`

In today's implementation, `(A, B)::gt` contains calls to *both* `A::eq` *and* `A::gt`. That's fine for primitives, but for things like `String`s it's kinda weird -- `(String, usize)::gt` has a call to both `bcmp` and `memcmp` (<https://rust.godbolt.org/z/7jbbPMesf>) because when `bcmp` says the `String`s aren't equal, it turns around and calls `memcmp` to find out which one's bigger.

This PR changes the implementation to instead implement `(A, …, C, Z)::gt` using `A::partial_cmp`, `…::partial_cmp`, `C::partial_cmp`, and `Z::gt`. (And analogously for `lt`, `le`, and `ge`.) That way expensive comparisons don't need to be repeated.

Technically this is an observable change on stable, so I've marked it `needs-fcp` + `T-libs-api` and will r? rust-lang/libs-api

I'm hoping that this will be non-controversial, however, since it's very similar to the observable changes that were made to the derives (#81384 #98655) -- like those, this only changes behaviour if a type overrode behaviour in a way inconsistent with the rules for the various traits involved.

(The first commit here is #108156, adding the codegen test, which I used to make sure this doesn't regress behaviour for primitives.)

Zulip conversation about this change: <https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/.60.3E.60.20on.20Tuples/near/328392927>.
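A hand-written sketch of the strategy for a 2-tuple (not the generated core impl): the leading element is compared once with `partial_cmp`, and only the last element uses the operator itself, so an expensive comparison like `String` ordering is not done twice.

```rust
use std::cmp::Ordering;

fn tuple2_gt<A: PartialOrd, Z: PartialOrd>(lhs: &(A, Z), rhs: &(A, Z)) -> bool {
    match lhs.0.partial_cmp(&rhs.0) {
        Some(Ordering::Equal) => lhs.1 > rhs.1,
        Some(Ordering::Greater) => true,
        _ => false, // Less, or incomparable (None)
    }
}

fn main() {
    assert!(tuple2_gt(&("b".to_string(), 0), &("a".to_string(), 9)));
    assert!(!tuple2_gt(&("a".to_string(), 1), &("a".to_string(), 1)));
}
```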
2023-03-02  Auto merge of #106673 - flba-eb:add_qnx_nto_stdlib, r=workingjubilee  (bors, -0/+1)
Add support for QNX Neutrino to standard library

This change:
- adds standard library support for QNX Neutrino (7.1).
- upgrades `libc` to version `0.2.139`, which supports QNX Neutrino

`@gh-tr`

⚠️ Backtraces on QNX require https://github.com/rust-lang/backtrace-rs/pull/507, which is not yet merged! (But everything else works without these changes.)

⚠️ Tested mainly with an x86_64 virtual machine (see qnx-nto.md) and partially with aarch64 hardware (some tests fail due to constrained resources).
2023-03-01  Auto merge of #108483 - scottmcm:unify-bytewise-eq-traits, r=the8472  (bors, -2/+84)
Merge two different equality specialization traits in `core`

Arrays and slices each had their own version of this, without a matching set of `impl`s. Merge them into one (still-`pub(crate)`) `cmp::BytewiseEq` trait, so we can stop doing all these things twice. And that means that the `[T]::eq` → `memcmp` specialization picks up a bunch of types where that previously only worked for arrays, so examples like <https://rust.godbolt.org/z/KjsG8MGGT> will use it now instead of emitting loops.

r? the8472
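A conceptual sketch of the idea only; the real `core::cmp::BytewiseEq` is `pub(crate)` and is wired up through specialization inside the slice/array `PartialEq` impls. The point is a marker for types whose `PartialEq` is equivalent to comparing their raw bytes, so slice equality can become a single memcmp-style comparison. Trait and function names below are illustrative.

```rust
use core::{mem, slice};

/// Informal contract: `PartialEq` for `Self` must agree with byte-for-byte
/// comparison of the value's representation (no padding, no "equal but
/// differently encoded" values).
unsafe trait BytewiseEqSketch: PartialEq + Sized {}

unsafe impl BytewiseEqSketch for u8 {}
unsafe impl BytewiseEqSketch for u32 {}

fn as_bytes<T: BytewiseEqSketch>(s: &[T]) -> &[u8] {
    // SAFETY: the trait's contract says the bytes fully determine equality.
    unsafe { slice::from_raw_parts(s.as_ptr() as *const u8, mem::size_of_val(s)) }
}

/// One length check plus one bulk byte comparison, like memcmp.
fn slice_eq_bytewise<T: BytewiseEqSketch>(a: &[T], b: &[T]) -> bool {
    a.len() == b.len() && as_bytes(a) == as_bytes(b)
}
```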
2023-03-01  Merge two different equality specialization traits in `core`  (Scott McMurray, -2/+84)
2023-03-01  Auto merge of #108446 - Zoxc:named-allocs, r=oli-obk  (bors, -3/+3)
Name LLVM anonymous constants by a hash of their contents

This makes the names stable between different versions of a crate unlike the `AllocId` naming, making LLVM IR comparisons with `llvm-diff` more practical.
2023-03-01  Add `Option::as_slice`(`_mut`)  (Andre Bogus, -0/+28)
This adds the following functions:

* `Option<T>::as_slice(&self) -> &[T]`
* `Option<T>::as_slice_mut(&mut self) -> &mut [T]`

The `as_slice` and `as_slice_mut` functions benefit from an optimization that makes them completely branch-free. Note that the optimization's soundness hinges on the fact that either the niche optimization makes the offset of the `Some(_)` contents zero, or the memory layout of `Option<T>` is equal to that of `Option<MaybeUninit<T>>`.
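A quick usage sketch of the accessors described above. The methods were unstable at this point; note that on today's stable the mutable variant ended up spelled `as_mut_slice` rather than `as_slice_mut`.

```rust
fn main() {
    let some = Some(7);
    assert_eq!(some.as_slice(), &[7]); // one-element slice for `Some`

    let none: Option<i32> = None;
    assert!(none.as_slice().is_empty()); // empty slice for `None`
}
```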
2023-02-28  Add QNX Neutrino support to libstd  (Florian Bartels, -0/+1)
Co-authored-by: gh-tr <troach@qnx.com>
2023-02-25  Update tests  (John Kåre Alsaker, -3/+3)
2023-02-19  Auto merge of #107921 - cjgillot:codegen-overflow-check, r=tmiasko  (bors, -0/+14)
Make codegen choose whether to emit overflow checks

ConstProp and DataflowConstProp currently have a specific code path not to propagate constants when they overflow. This is meant to have the correct behaviour when inlining from a crate with overflow checks (like `core`) into a crate compiled without.

This PR shifts the behaviour change to the `Assert(Overflow*)` MIR terminators: if the crate is compiled without overflow checks, just skip emitting the assertions. This is already what happens with `OverflowNeg`.

This allows ConstProp and DataflowConstProp to transform `CheckedBinaryOp(Add, u8::MAX, 1)` into `const (0, true)`, and let codegen ignore the `true`. The interpreter is modified to conform to this behaviour.

Fixes #35310
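An illustration of the user-visible behaviour being unified, not of the compiler change itself: with overflow checks enabled this addition hits the `Assert(Overflow*)` path and panics; with checks disabled the assertion is simply not emitted and the value wraps.

```rust
fn bump(x: u8) -> u8 {
    x + 1
}

fn main() {
    // Panics with `-C overflow-checks=on` (the debug default);
    // prints 0 (wrapping) when overflow checks are off.
    println!("{}", bump(u8::MAX));
}
```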
2023-02-18  Fix codegen test.  (Camille GILLOT, -1/+1)
2023-02-18  Add codegen test.  (Camille GILLOT, -0/+14)
2023-02-18  Make dyn* have the same scalar pair ABI as corresponding fat pointer  (Michael Goulet, -1/+3)
2023-02-18  Add codegen test  (Michael Goulet, -0/+7)
2023-02-18  Auto merge of #99679 - repnop:kernel-address-sanitizer, r=cuviper  (bors, -0/+47)
Add `kernel-address` sanitizer support for freestanding targets

This PR adds support for KASan (kernel address sanitizer) instrumentation in freestanding targets. I included the minimal set of `x86_64-unknown-none`, `riscv64{imac, gc}-unknown-none-elf`, and `aarch64-unknown-none`, but there's likely other targets it can be added to. (`linux_kernel_base.rs`?)

KASan uses the address sanitizer attributes but has the `CompileKernel` parameter set to `true` in the pass creation.
2023-02-16  Use `partial_cmp` to implement tuple `lt`/`le`/`ge`/`gt`  (Scott McMurray, -2/+5)
2023-02-16  Add a codegen test for comparisons of 2-tuples of primitives  (Scott McMurray, -0/+118)
The operators are all overridden in full for tuples, so those parts pass easily, but they're worth pinning. Going via `Ord::cmp`, though, doesn't optimize away for anything but `cmp`+`is_le`. So this leaves `FIXME`s in the tests for the others.
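The shape those `FIXME`s refer to, in a small standalone example: only `cmp(..).is_le()` currently optimizes down to the same code as the `<=` operator, while the other `is_*` helpers do not yet.

```rust
pub fn via_operator(a: (u32, u16), b: (u32, u16)) -> bool {
    a <= b
}

pub fn via_cmp(a: (u32, u16), b: (u32, u16)) -> bool {
    // The only `Ord::cmp` + `is_*` combination that currently folds to the
    // same code as the operator above.
    Ord::cmp(&a, &b).is_le()
}
```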
2023-02-16  Auto merge of #107449 - saethlin:enable-copyprop, r=oli-obk  (bors, -97/+94)
Enable CopyProp

r? `@tmiasko`

`@rustbot` label +A-mir-opt
2023-02-14  Add `kernel-address` sanitizer support for freestanding targets  (Wesley Norris, -0/+47)
2023-02-14  Try to fix codegen tests for ??? LLVM 14 ???  (Ben Kimock, -5/+5)
2023-02-14  Fix codegen tests  (Ben Kimock, -94/+94)
2023-02-14  Rollup merge of #107573 - cuviper:drop-llvm-13, r=nagisa  (Matthias Krüger, -5/+0)
Update the minimum external LLVM to 14

With this change, we'll have stable support for LLVM 14 through 16 (pending release). For reference, the previous increase to LLVM 13 was #100460.
2023-02-13  Auto merge of #107634 - scottmcm:array-drain, r=thomcc  (bors, -1/+61)
Improve the `array::map` codegen

The `map` method on arrays [is documented as sometimes performing poorly](https://doc.rust-lang.org/std/primitive.array.html#note-on-performance-and-stack-usage), and after [a question on URLO](https://users.rust-lang.org/t/try-trait-residual-o-trait-and-try-collect-into-array/88510?u=scottmcm) prompted me to take another look at the core [`try_collect_into_array`](https://github.com/rust-lang/rust/blob/7c46fb2111936ad21a8e3aa41e9128752357f5d8/library/core/src/array/mod.rs#L865-L912) function, I had some ideas that ended up working better than I'd expected.

There's three main ideas in here, split over three commits:
1. Don't use `array::IntoIter` when we can avoid it, since that seems to not get SRoA'd, meaning that every step writes things like loop counters into the stack unnecessarily
2. Don't return arrays in `Result`s unnecessarily, as that doesn't seem to optimize away even with `unwrap_unchecked` (perhaps because it needs to get moved into a new LLVM type to account for the discriminant)
3. Don't distract LLVM with all the `Option` dances when we know for sure we have enough items (like in `map` and `zip`). This one's a larger commit as to do it I ended up adding a new `pub(crate)` trait, but hopefully those changes are still straight-forward. (No libs-api changes; everything should be completely implementation-detail-internal.)

It's still not completely fixed -- I think it needs pcwalton's `memcpy` optimizations still (#103830) to get further -- but this seems to go much better than before. And the remaining `memcpy`s are just `transmute`-equivalent (`[T; N] -> ManuallyDrop<[T; N]>` and `[MaybeUninit<T>; N] -> [T; N]`), so hopefully those will be easier to remove with LLVM16 than the previous subobject copies 🤞

r? `@thomcc`

As a simple example, this test
```rust
pub fn long_integer_map(x: [u32; 64]) -> [u32; 64] {
    x.map(|x| 13 * x + 7)
}
```
On nightly <https://rust.godbolt.org/z/xK7548TGj> takes `sub rsp, 808`
```llvm
start:
  %array.i.i.i.i = alloca [64 x i32], align 4
  %_3.sroa.5.i.i.i = alloca [65 x i32], align 4
  %_5.i = alloca %"core::iter::adapters::map::Map<core::array::iter::IntoIter<u32, 64>, [closure@/app/example.rs:2:11: 2:14]>", align 8
```
(and yes, that's a 6**5**-element array `alloca` despite 6**4**-element input and output)

But with this PR it's only `sub rsp, 520`
```llvm
start:
  %array.i.i.i.i.i.i = alloca [64 x i32], align 4
  %array1.i.i.i = alloca %"core::mem::manually_drop::ManuallyDrop<[u32; 64]>", align 4
```
Similarly, the loop it emits on nightly is scalar-only and horrifying
```nasm
.LBB0_1:
        mov esi, 64
        mov edi, 0
        cmp rdx, 64
        je .LBB0_3
        lea rsi, [rdx + 1]
        mov qword ptr [rsp + 784], rsi
        mov r8d, dword ptr [rsp + 4*rdx + 528]
        mov edi, 1
        lea edx, [r8 + 2*r8]
        lea r8d, [r8 + 4*rdx]
        add r8d, 7
.LBB0_3:
        test edi, edi
        je .LBB0_11
        mov dword ptr [rsp + 4*rcx + 272], r8d
        cmp rsi, 64
        jne .LBB0_6
        xor r8d, r8d
        mov edx, 64
        test r8d, r8d
        jne .LBB0_8
        jmp .LBB0_11
.LBB0_6:
        lea rdx, [rsi + 1]
        mov qword ptr [rsp + 784], rdx
        mov edi, dword ptr [rsp + 4*rsi + 528]
        mov r8d, 1
        lea esi, [rdi + 2*rdi]
        lea edi, [rdi + 4*rsi]
        add edi, 7
        test r8d, r8d
        je .LBB0_11
.LBB0_8:
        mov dword ptr [rsp + 4*rcx + 276], edi
        add rcx, 2
        cmp rcx, 64
        jne .LBB0_1
```
whereas with this PR it's unrolled and vectorized
```nasm
        vpmulld ymm1, ymm0, ymmword ptr [rsp + 64]
        vpaddd ymm1, ymm1, ymm2
        vmovdqu ymmword ptr [rsp + 328], ymm1
        vpmulld ymm1, ymm0, ymmword ptr [rsp + 96]
        vpaddd ymm1, ymm1, ymm2
        vmovdqu ymmword ptr [rsp + 360], ymm1
```
(though sadly still stack-to-stack)
2023-02-12  Enable CopyProp by default, tune the impl a bit  (Ben Kimock, -4/+1)
2023-02-10  Update the minimum external LLVM to 14  (Josh Stone, -5/+0)
2023-02-10  Rollup merge of #107043 - Nilstrieb:true-and-false-is-false, r=wesleywiser  (Matthias Krüger, -1/+1)
Support `true` and `false` as boolean flag params

Implements [MCP 577](https://github.com/rust-lang/compiler-team/issues/577).
2023-02-09  Test XRay only for supported targets  (Oleksii Lozovskyi, -0/+3)
Now that the compiler accepts "-Z instrument-xray" option only when targeting one of the supported targets, make sure to not run the codegen tests where the compiler will fail. Like with other compiletests, we don't have access to internals, so simply hardcode a list of supported architectures here.
2023-02-09  Codegen tests for -Z instrument-xray  (Oleksii Lozovskyi, -0/+29)
Let's add at least some tests to verify that this option is accepted and produces expected LLVM attributes. More tests can be added later with attribute support.
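A rough sketch of what such a codegen test can look like. The compiletest directive spellings, the `=always` sub-option, and the exact LLVM attribute string below are assumptions for illustration, not copied from the actual tests/codegen files.

```rust
// Hypothetical test header (directive spellings assumed):
// only-x86_64
// compile-flags: -Z instrument-xray=always

#![crate_type = "lib"]

#[no_mangle]
pub fn instrumented() {}

// Assumed attribute spelling; FileCheck verifies it appears on the function.
// CHECK: attributes #{{.*}} "function-instrument"="xray-always"
```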
2023-02-06  also do not add noalias on not-Unpin Box  (Ralf Jung, -2/+8)
2023-02-06  make &mut !Unpin not dereferenceable  (Ralf Jung, -1/+15)
See https://github.com/rust-lang/unsafe-code-guidelines/issues/381 for discussion.
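Illustrative of what `!Unpin` means here: opting out of `Unpin` via `PhantomPinned` is the usual pattern for self-referential types, and it is exactly such references that these two commits affect (`&mut NotUnpin` loses `dereferenceable`, `Box<NotUnpin>` loses `noalias`).

```rust
use std::marker::PhantomPinned;

pub struct NotUnpin {
    pub data: u8,
    _pin: PhantomPinned, // makes the type `!Unpin`
}

pub fn through_mut(x: &mut NotUnpin) -> u8 {
    x.data // `&mut` to a `!Unpin` type: no longer marked `dereferenceable`
}

pub fn through_box(b: Box<NotUnpin>) -> u8 {
    b.data // `Box` of a `!Unpin` type: no longer marked `noalias`
}
```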
2023-02-06  make PointerKind directly reflect pointer types  (Ralf Jung, -0/+6)
The code that consumes PointerKind (`adjust_for_rust_scalar` in rustc_ty_utils) ended up using PointerKind variants to talk about Rust reference types (& and &mut) anyway, making the old code structure quite confusing: one always had to keep in mind which PointerKind corresponds to which type. So this changes PointerKind to directly reflect the type. This does not change behavior.
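A conceptual sketch of "directly reflect the type" (the variant shapes below are illustrative, not the exact rustc enum): the kind names the Rust pointer type, and consumers such as `adjust_for_rust_scalar` derive the attribute set from it instead of the other way around.

```rust
#[allow(dead_code)]
enum PointerKindSketch {
    /// `&T` (read-only unless `T` has interior mutability)
    SharedRef { frozen: bool },
    /// `&mut T`
    MutableRef { unpin: bool },
    /// `Box<T>`
    OwnedBox { unpin: bool },
}
```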
2023-02-04  Add another autovectorization codegen test using array zip-map  (Scott McMurray, -1/+12)
2023-02-04  Allow canonicalizing the `array::map` loop in trusted cases  (Scott McMurray, -1/+3)
2023-02-04  Stop forcing `array::map` through an unnecessary `Result`  (Scott McMurray, -3/+2)
2023-02-04  Stop using `into_iter` in `array::map`  (Scott McMurray, -0/+48)
2023-01-28  Rollup merge of #107373 - michaelwoerister:dont-merge-vtables-when-debuginfo, r=WaffleLapkin  (Matthias Krüger, -0/+8)
Don't merge vtables when full debuginfo is enabled.

This PR makes the compiler not emit the `unnamed_addr` attribute for vtables when full debuginfo is enabled, so that they don't get merged even if they have the same contents. This allows debuggers to more reliably map from a dyn pointer to the self-type of a trait object by looking at the vtable's debuginfo.

The PR only changes the behavior of the LLVM backend as other backends don't emit vtable debuginfo (as far as I can tell).

The performance impact of this change should be small as [measured](https://github.com/rust-lang/rust/pull/103514#issuecomment-1290833854) in a previous PR.
2023-01-28  Rollup merge of #107022 - scottmcm:ordering-option-eq, r=m-ou-se  (Matthias Krüger, -0/+10)
Implement `SpecOptionPartialEq` for `cmp::Ordering`

Noticed as I continue to explore options for having code using `partial_cmp` optimize better.

Before:
```llvm
; Function Attrs: mustprogress nofree nosync nounwind willreturn uwtable
define noundef zeroext i1 @ordering_eq(i8 noundef %0, i8 noundef %1) unnamed_addr #0 {
start:
  %2 = icmp eq i8 %0, 2
  br i1 %2, label %bb1.i, label %bb3.i

bb1.i:                                            ; preds = %start
  %3 = icmp eq i8 %1, 2
  br label %"_ZN55_$LT$T$u20$as$u20$core..option..SpecOptionPartialEq$GT$2eq17hb7e7beacecde585fE.exit"

bb3.i:                                            ; preds = %start
  %.not.i = icmp ne i8 %1, 2
  %4 = icmp eq i8 %0, %1
  %spec.select.i = and i1 %.not.i, %4
  br label %"_ZN55_$LT$T$u20$as$u20$core..option..SpecOptionPartialEq$GT$2eq17hb7e7beacecde585fE.exit"

"_ZN55_$LT$T$u20$as$u20$core..option..SpecOptionPartialEq$GT$2eq17hb7e7beacecde585fE.exit": ; preds = %bb1.i, %bb3.i
  %.0.i = phi i1 [ %3, %bb1.i ], [ %spec.select.i, %bb3.i ]
  ret i1 %.0.i
}
```

After:
```llvm
; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone willreturn uwtable
define noundef zeroext i1 @ordering_eq(i8 noundef %0, i8 noundef %1) unnamed_addr #1 {
start:
  %2 = icmp eq i8 %0, %1
  ret i1 %2
}
```

(Which <https://alive2.llvm.org/ce/z/-rop5r> says LLVM *could* just do itself, but there's probably an issue already open for that problem from when this was originally looked at for `Option<NonZeroU8>` and friends.)
2023-01-27  Don't merge vtables when full debuginfo is enabled.  (Michael Woerister, -0/+8)
2023-01-22  abi: add `AddressSpace` field to `Primitive::Pointer`  (Erik Desjardins, -0/+25)
...and remove it from `PointeeInfo`, which isn't meant for this. There are still various places (marked with FIXMEs) that assume all pointers have the same size and alignment. Fixing this requires parsing non-default address spaces in the data layout string, which will be done in a followup.
2023-01-19  Auto merge of #106989 - clubby789:is-zero-num, r=scottmcm  (bors, -0/+17)
Implement `alloc::vec::IsZero` for `Option<$NUM>` types

Fixes #106911

Mirrors the `NonZero$NUM` implementations with an additional `assert_zero_valid`. `None::<i32>` doesn't strictly satisfy `IsZero`, but for the purpose of allocating we can produce more efficient codegen.
2023-01-18  Implement `SpecOptionPartialEq` for `cmp::Ordering`  (Scott McMurray, -0/+10)
2023-01-18  Support `true` and `false` as boolean flag params  (Nilstrieb, -1/+1)
Implements MCP 577.
2023-01-18  Implement `alloc::vec::IsZero` for `Option<$NUM>` types  (clubby789, -0/+17)
2023-01-18  Rollup merge of #106995 - lukas-code:align_offset_assembly_test, r=cuviper  (Matthias Krüger, -1/+1)
bump failing assembly & codegen tests from LLVM 14 to LLVM 15

These tests need LLVM 15. Found by `@Robert-Cunningham` in https://github.com/rust-lang/rust/pull/100601#issuecomment-1385400008

Passed tests at 006506e93fc80318ebfd7939fe1fd4dc19ecd8cb in https://github.com/rust-lang/rust/actions/runs/3942442730/jobs/6746104740.