path: root/tests/codegen
Age  Commit message  [Author, -deleted/+added]
2024-03-03  Auto merge of #121665 - erikdesjardins:ptradd, r=nikic  [bors, -6/+49]
Always generate GEP i8 / ptradd for struct offsets This implements #98615, and goes a bit further to remove `struct_gep` entirely. Upstream LLVM is in the beginning stages of [migrating to `ptradd`](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699). LLVM 19 will [canonicalize](https://github.com/llvm/llvm-project/pull/68882) all constant-offset GEPs to i8, which has roughly the same effect as this change. Fixes #121719. Split out from #121577. r? `@nikic`
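To illustrate what a constant-offset `ptradd` encodes, the sketch below (not part of the PR; the `Pair` type is invented for this example) checks the byte offsets that a struct field access now lowers to, i.e. `getelementptr inbounds i8, ptr %base, i64 <offset>` instead of a struct-typed GEP:

```rust
use std::mem::offset_of;

#[repr(C)]
struct Pair {
    a: u8,
    b: u32,
}

fn main() {
    // A field access like `p.b` is now emitted as a plain byte offset from the
    // base pointer (GEP i8 / ptradd), so the only thing that matters is the
    // field's byte offset within the layout.
    assert_eq!(offset_of!(Pair, a), 0);
    // u8 at offset 0, then 3 bytes of padding, then the 4-aligned u32.
    assert_eq!(offset_of!(Pair, b), 4);
    println!("b is at byte offset {}", offset_of!(Pair, b));
}
```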
2024-03-01  Add initial support for DataFlowSanitizer  [Ramon de C Valle, -0/+10]
Adds initial support for DataFlowSanitizer to the Rust compiler. It currently supports `-Zsanitizer-dataflow-abilist`. Additional options for it can be passed to LLVM command line argument processor via LLVM arguments using `llvm-args` codegen option (e.g., `-Cllvm-args=-dfsan-combine-pointer-labels-on-load=false`).
2024-02-29  Rollup merge of #120820 - CKingX:cpu-base-minimum, r=petrochenkov,ChrisDenton  [Guillaume Gomez, -1/+1]
Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics (in nightly) in Windows x64 As Rust plans to set Windows 10 as the minimum supported OS for target x86_64-pc-windows-msvc, I have added the cmpxchg16b and sse3 features. Windows 10 requires CMPXCHG16B, LAHF/SAHF, and PrefetchW as stated in the requirements [here](https://download.microsoft.com/download/c/1/5/c150e1ca-4a55-4a7e-94c5-bfc8c2e785c5/Windows%2010%20Minimum%20Hardware%20Requirements.pdf). Furthermore, CPUs that meet these requirements also have SSE3 ([see](https://walbourn.github.io/directxmath-sse3-and-ssse3/)).
2024-02-29  Rollup merge of #121700 - rcvalle:rust-cfi-dont-compress-user-defined-builtin-types, r=compiler-errors  [Guillaume Gomez, -0/+25]
CFI: Don't compress user-defined builtin types Doesn't compress user-defined builtin types (see https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin and https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-compression).
2024-02-27  test merging of multiple match branches that access fields of the same offset  [Erik Desjardins, -0/+44]
2024-02-27  use non-inbounds GEP for ZSTs, add fixmes  [Erik Desjardins, -3/+3]
2024-02-27  CFI: Don't compress user-defined builtin types  [Ramon de C Valle, -0/+25]
Doesn't compress user-defined builtin types (see https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin and https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-compression).
2024-02-26  always use gep inbounds i8 (ptradd) for field offsets  [Erik Desjardins, -7/+6]
2024-02-27  Auto merge of #121655 - matthiaskrgr:rollup-qpx3kks, r=matthiaskrgr  [bors, -1/+1]
Rollup of 4 pull requests Successful merges: - #121598 (rename 'try' intrinsic to 'catch_unwind') - #121639 (Update books) - #121648 (Update Vec and String `{from,into}_raw_parts`-family docs) - #121651 (Properly emit `expected ;` on `#[attr] expr`) r? `@ghost` `@rustbot` modify labels: rollup
2024-02-27  Rollup merge of #121598 - RalfJung:catch_unwind, r=oli-obk  [Matthias Krüger, -1/+1]
rename 'try' intrinsic to 'catch_unwind' The intrinsic has nothing to do with `try` blocks, and corresponds to the stable `catch_unwind` function, so this makes a lot more sense IMO. Also rename Miri's special function while we are at it, to reflect the level of abstraction it works on: it's an unwinding mechanism, on which Rust implements panics.
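For reference, the stable API this renamed intrinsic backs can be exercised in ordinary Rust:

```rust
use std::panic;

fn main() {
    // Silence the default panic message for this demo.
    panic::set_hook(Box::new(|_| {}));

    // `catch_unwind` (the intrinsic formerly called `try`) stops an unwind at
    // this boundary and reports it as an Err.
    let caught = panic::catch_unwind(|| panic!("boom"));
    assert!(caught.is_err());

    // A non-panicking closure passes its value through unchanged.
    let ok = panic::catch_unwind(|| 42);
    assert_eq!(ok.unwrap(), 42);
}
```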
2024-02-26  Auto merge of #121516 - RalfJung:platform-intrinsics-begone, r=oli-obk  [bors, -49/+49]
remove platform-intrinsics ABI; make SIMD intrinsics be regular intrinsics `@Amanieu` `@workingjubilee` I don't think there is any reason these need to be "special"? The [original RFC](https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html) indicated eventually making them stable, but I think that is no longer the plan, so seems to me like we can clean this up a bit. Blocked on https://github.com/rust-lang/stdarch/pull/1538, https://github.com/rust-lang/rust/pull/121542.
2024-02-26  rename 'try' intrinsic to 'catch_unwind'  [Ralf Jung, -1/+1]
2024-02-26  Rollup merge of #120656 - Zalathar:filecheck-flags, r=wesleywiser  [Guillaume Gomez, -0/+159]
Allow tests to specify a `//@ filecheck-flags:` header This allows individual codegen/assembly/mir-opt tests to pass extra flags to the LLVM `filecheck` tool as needed. --- The original motivation was noticing that `tests/run-make/instrument-coverage` was very close to being an ordinary codegen test, except that it needs some extra logic to set up platform-specific variables to be passed into filecheck. I then saw the comment in `verify_with_filecheck` indicating that a `filecheck-flags` header might be useful for other purposes as well.
2024-02-25  Use generic `NonZero` in tests.  [Markus Reiter, -50/+46]
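The generic `NonZero<T>` these tests now use (stable since Rust 1.79) replaces the per-width aliases like `NonZeroU32`; a small sketch:

```rust
use std::num::NonZero;

fn main() {
    // One generic type instead of NonZeroU8, NonZeroU32, NonZeroUsize, ...
    let n: NonZero<u32> = NonZero::new(8).expect("8 is nonzero");
    assert_eq!(n.get(), 8);

    // The zero niche still makes Option<NonZero<u32>> the same size as u32.
    assert_eq!(
        std::mem::size_of::<Option<NonZero<u32>>>(),
        std::mem::size_of::<u32>()
    );
}
```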
2024-02-25  fix use of platform_intrinsics in tests  [Ralf Jung, -49/+49]
2024-02-25  Auto merge of #120650 - clubby789:switchint-const, r=saethlin  [bors, -0/+67]
Use `br` instead of a conditional when switching on a constant boolean r? `@ghost`
2024-02-23  Ignore less tests in debug builds  [Ben Kimock, -54/+7]
2024-02-23  Use `br` instead of conditional when branching on constant  [clubby789, -0/+67]
2024-02-23  Remove unhelpful `DEFINE_INTERNAL` from filecheck flags  [Zalathar, -3/+2]
This define was copied over from the run-make version of the test, but doesn't seem to serve any useful purpose.
2024-02-23  Convert `tests/run-make/instrument-coverage` to an ordinary codegen test  [Zalathar, -0/+121]
This test was already very close to being an ordinary codegen test, except that it needed some extra logic to set a few variables based on (target) platform characteristics. Now that we have support for `//@ filecheck-flags:`, we can instead set those variables using the normal test revisions mechanism.
2024-02-23  Move existing coverage codegen tests into a subdirectory  [Zalathar, -0/+0]
This makes room for migrating over `tests/run-make/instrument-coverage`, without increasing the number of top-level items in the codegen test directory.
2024-02-23  Allow tests to specify a `//@ filecheck-flags:` header  [Zalathar, -0/+8]
Any flags specified here will be passed to LLVM's `filecheck` tool, in tests that use that tool.
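A hypothetical codegen test using the new header might look like the fragment below (the flag value and the `MYPREFIX` check prefix are invented for illustration; this is a compiletest fragment, not a standalone program):

```rust
//@ compile-flags: -Copt-level=3
//@ filecheck-flags: --check-prefix=MYPREFIX

// The `filecheck-flags` header above passes its arguments to LLVM's
// `filecheck` tool when it verifies this file's emitted IR, so the
// MYPREFIX lines below are checked in addition to the default CHECK lines.
#[no_mangle]
pub fn add(a: i32, b: i32) -> i32 {
    // MYPREFIX: define{{.*}}i32 @add
    a + b
}
```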
2024-02-23  Add some simple meta-tests for the handling of `filecheck` flags  [Zalathar, -0/+31]
2024-02-22  [AUTO_GENERATED] Migrate compiletest to use `ui_test`-style `//@` directives  [许杰友 Jieyou Xu (Joe), -1027/+1027]
2024-02-22  Auto merge of #121225 - RalfJung:simd-extract-insert-const-idx, r=oli-obk,Amanieu  [bors, -47/+0]
require simd_insert, simd_extract indices to be constants As discussed in https://github.com/rust-lang/rust/issues/77477 (see in particular [here](https://github.com/rust-lang/rust/issues/77477#issuecomment-703149102)). This PR doesn't touch codegen yet -- the first step is to ensure that the indices are always constants; the second step is to then make use of this fact in backends. Blocked on https://github.com/rust-lang/stdarch/pull/1530 propagating to the rustc repo.
2024-02-21  remove simd_reduce_{min,max}_nanless  [Ralf Jung, -1/+1]
2024-02-21  Auto merge of #120718 - saethlin:reasonable-fast-math, r=nnethercote  [bors, -0/+22]
Add "algebraic" fast-math intrinsics, based on fast-math ops that cannot return poison Setting all of LLVM's fast-math flags makes our fast-math intrinsics very dangerous, because some inputs are UB. This set of flags permits common algebraic transformations, but according to the [LangRef](https://llvm.org/docs/LangRef.html#fastmath), only the flags `nnan` (no nans) and `ninf` (no infs) can produce poison. And this uses the algebraic float ops to fix https://github.com/rust-lang/rust/issues/120720 cc `@orlp`
2024-02-20  Add "algebraic" versions of the fast-math intrinsics  [Ben Kimock, -0/+22]
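For intuition about why reassociation needs an explicit opt-in, here is a sketch in ordinary stable Rust (not the new intrinsics themselves): float addition is not associative, so an "algebraic" op is precisely one where the compiler may pick either grouping.

```rust
fn main() {
    // The two groupings of the same three values give different answers:
    let (a, b, c) = (1e16f64, -1e16f64, 1.0f64);

    let left = (a + b) + c;  // (1e16 - 1e16) + 1.0 == 1.0
    let right = a + (b + c); // -1e16 + 1.0 rounds back to -1e16, so this is 0.0
    assert_eq!(left, 1.0);
    assert_eq!(right, 0.0);

    // The `f*_algebraic` intrinsics let LLVM choose either association (and
    // similar transforms), without enabling `nnan`/`ninf`, the flags that can
    // turn NaN or infinite inputs into poison.
}
```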
2024-02-20  delete a test that no longer makes sense  [Ralf Jung, -47/+0]
2024-02-19  Updated test to account for added previous features (thanks erikdesjardins!)  [CKingX, -1/+1]
2024-02-18  Auto merge of #118264 - lukas-code:optimized-draining, r=the8472  [bors, -0/+69]
Optimize `VecDeque::drain` for (half-)open ranges The most common use cases of `VecDeque::drain` consume either the entire queue or elements from the front or back.[^1] This PR makes these operations faster by optimizing the generated code of the destructor of the drain: * `.drain(..)` is now the same as `.clear()`. * `.drain(n..)` is now (almost[^2]) the same as `.truncate(n)`. * `.drain(..n)` is now an efficient "advance" function. This operation is not provided by a dedicated function and optimizing it is my main motivation for this PR. Previously, all of these cases generated a function call to the destructor of the `DropGuard`, emitting a lot of unused machine code as well as unnecessary branches and loads/stores of stack variables. There are no algorithmic changes in this PR, but it simplifies the code enough to allow LLVM to recognize the special cases and optimize accordingly. Most notably, it allows elimination of the rather large [`wrap_copy`] function. Some [rudimentary microbenchmarks][benches] show a performance improvement of **~3x-4x** on my machine for the special cases and roughly equal performance for the general case. Best reviewed commit by commit. [^1]: source: GitHub code search: [full range `drain(..)` = 7.5k results][full], [from front `drain(..n)` = 3.2k results][front], [from back `drain(n..)` = 1.6k results][back], [from middle `drain(n..m)` = <500 results][middle] [^2]: `.drain(0..)` and `.clear()` reset the head to 0, but `.truncate(0)` does not. 
[full]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%280%3F%5C.%5C.%5C%29%2F+lang%3ARust
[front]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%280%3F%5C.%5C.%5B%5E%29%5D.*%5C%29%2F+lang%3ARust
[back]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%28%5B%5E0%5D.*%5C.%5C.%5C%29%2F+lang%3ARust
[middle]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%28%5B%5E0%5D.*%5C.%5C.%5B%5E%29%5D.*%5C%29%2F+lang%3ARust
[`wrap_copy`]: https://github.com/rust-lang/rust/blob/4fd68eb47bad1c121417ac4450b2f0456150db86/library/alloc/src/collections/vec_deque/mod.rs#L262-L391
[benches]: https://gist.github.com/lukas-code/c97bd707d074c4cc31f241edbc7fd2a2

<details>
<summary>generated assembly</summary>

(Before/after x86-64 listings for `clear`, `truncate`, `advance`, and `remove`, flattened beyond recovery in this mirror. In the "before" code, each function spills a drain guard to the stack and calls `drop_in_place::<DropGuard<i32, Global>>`; in the "after" code, `clear` reduces to a single `movups` store, `truncate` and `advance` become short inline sequences, and only `remove` still calls an out-of-line `copy_data` helper.)

</details>
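The special-cased drain patterns from the PR description, demonstrated in ordinary stable Rust:

```rust
use std::collections::VecDeque;

fn main() {
    let mut dq: VecDeque<i32> = (0..10).collect();

    // `.drain(..n)` is the "advance" case: remove the first n elements.
    dq.drain(..3);
    assert_eq!(dq.front(), Some(&3));

    // `.drain(n..)` now compiles to (almost) the same code as `.truncate(n)`.
    dq.drain(4..);
    assert!(dq.iter().copied().eq([3, 4, 5, 6]));

    // `.drain(..)` is now equivalent to `.clear()`.
    dq.drain(..);
    assert!(dq.is_empty());
}
```

Dropping the returned `Drain` iterator without consuming it, as above, removes the drained range; the optimization is about what the compiler emits for that destructor in each of these cases.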
2024-02-16  Don't use mem::zeroed in vec::IntoIter  [Ben Kimock, -2/+15]
2024-02-16  add codegen test  [Lukas Markeffsky, -0/+69]
2024-02-16  Auto merge of #120500 - oli-obk:intrinsics2.0, r=WaffleLapkin  [bors, -2/+2]
Implement intrinsics with fallback bodies fixes #93145 (though we can port many more intrinsics) cc #63585 The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for * codegen_ssa (so llvm and gcc) * codegen_cranelift other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body). cc `@scottmcm` `@WaffleLapkin` ### todo * [ ] miri support * [x] default intrinsic name to name of function instead of requiring it to be specified in attribute * [x] make sure that the bodies are always available (must be collected for metadata)
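A minimal sketch of the dispatch described above, with hypothetical names (this is not the rustc API): backend intrinsic lowering becomes fallible, and the only failure mode, "unknown intrinsic", degrades to a plain call to the intrinsic's Rust fallback body.

```rust
// Hypothetical model of the fallible lowering: either the backend emits
// custom code for the intrinsic, or it falls back to a regular call.
enum Codegen {
    Custom(&'static str), // backend-specific lowering (e.g. an LLVM intrinsic)
    Call,                 // codegen a normal call to the fallback body
}

fn codegen_intrinsic(name: &str) -> Codegen {
    match name {
        // The backend recognizes this intrinsic and emits specialized code.
        "likely" => Codegen::Custom("llvm.expect"),
        // Unknown to this backend: treat it as an ordinary function whose
        // body is the fallback body provided in the library.
        _ => Codegen::Call,
    }
}

fn main() {
    assert!(matches!(codegen_intrinsic("likely"), Codegen::Custom(_)));
    assert!(matches!(codegen_intrinsic("some_new_intrinsic"), Codegen::Call));
}
```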
2024-02-13  tests: LLVM 18 infers an extra noalias here  [Augie Fackler, -1/+1]
This test started failing on LLVM 18 after change 61118ffd04aa6d1f9ee92daae4deb28bd975d4ab. As far as I can tell, it's just good fortune that LLVM is able to sniff out the new noalias here, and it's correct.
2024-02-12  Support safe intrinsics with fallback bodies  [Oli Scherer, -2/+2]
Turn `is_val_statically_known` into such an intrinsic to demonstrate. It is perfectly safe to call after all.
2024-02-11  Rollup merge of #118307 - scottmcm:tuple-eq-simpler, r=joshtriplett  [Matthias Krüger, -6/+8]
Remove an unneeded helper from the tuple library code Thanks to https://github.com/rust-lang/rust/pull/107022, this is just what `==` does, so we don't need the helper here anymore.
2024-02-09  Build DebugInfo for coroutine-closure  [Michael Goulet, -0/+21]
2024-02-07  Rollup merge of #119162 - heiher:direct-access-external-data, r=petrochenkov  [Guillaume Boisseau, -0/+21]
Add unstable `-Z direct-access-external-data` cmdline flag for `rustc` The new flag has been described in the Major Change Proposal at https://github.com/rust-lang/compiler-team/issues/707 Fixes #118053
2024-02-06  Rollup merge of #120502 - clubby789:remove-ffi-returns-twice, r=compiler-errors  [Matthias Krüger, -11/+0]
Remove `ffi_returns_twice` feature The [tracking issue](https://github.com/rust-lang/rust/issues/58314) and [RFC](https://github.com/rust-lang/rfcs/pull/2633) have been closed for a couple of years. There is also an attribute gate in R-A which should be removed if this lands.
2024-02-04  Auto merge of #120624 - matthiaskrgr:rollup-3gvcl20, r=matthiaskrgr  [bors, -0/+38]
Rollup of 8 pull requests Successful merges: - #120484 (Avoid ICE when is_val_statically_known is not of a supported type) - #120516 (pattern_analysis: cleanup manual impls) - #120517 (never patterns: It is correct to lower `!` to `_`.) - #120523 (Improve `io::Read::read_buf_exact` error case) - #120528 (Store SHOULD_CAPTURE as AtomicU8) - #120529 (Update data layouts in custom target tests for LLVM 18) - #120531 (Remove a bunch of `has_errors` checks that have no meaningful or the wrong effect) - #120533 (Correct paths for hexagon-unknown-none-elf platform doc) r? `@ghost` `@rustbot` modify labels: rollup
2024-02-03  Rollup merge of #120484 - Teapot4195:issue-120480-fix, r=compiler-errors  [Matthias Krüger, -0/+38]
Avoid ICE when is_val_statically_known is not of a supported type 2 ICE with 1 stone! 1. Implement `llvm.is.constant.ptr` to avoid first ICE in linked issue. 2. return `false` when the argument is not one of `i*`/`f*`/`ptr` to avoid second ICE. fixes #120480
2024-02-01  Revert unsound libcore changes of #119911  [Oli Scherer, -55/+0]
2024-01-30  Remove `ffi_returns_twice` feature  [clubby789, -11/+0]
2024-01-30  Add additional test cases for is_val_statically_known  [Alex Huang, -0/+38]
2024-01-30  Rollup merge of #120310 - krasimirgg:jan-v0-sym, r=Mark-Simulacrum  [Guillaume Gomez, -1/+1]
adapt test for v0 symbol mangling No functional changes intended. Adapts the test to also work under `new-symbol-mangling = true`.
2024-01-26  Update codegen test for LLVM 18  [Nikita Popov, -2/+2]
2024-01-25  Auto merge of #119911 - NCGThompson:is-statically-known, r=oli-obk  [bors, -0/+103]
Replacement of #114390: Add new intrinsic `is_val_statically_known` and optimize pow for powers of two This adds a new intrinsic `is_val_statically_known` that lowers to [`@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic). It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See #114390 for more discussion. While I have extended the scope of the power of two optimization from #114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests. Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage). Fixes #47234 Resolves #114390 `@Centri3`
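Behaviorally, the optimized idiom can be checked in plain Rust; the shift below is what the backend is expected to produce when the base is statically known to be 2 (the intrinsic itself is only usable inside the standard library):

```rust
fn main() {
    // When the base 2 is a compile-time constant, `pow` can be lowered to a
    // left shift. The two forms agree for all in-range exponents:
    for x in 0..16u32 {
        assert_eq!(2isize.pow(x), 1isize << x);
    }
    println!("2^x == 1 << x for x in 0..16");
}
```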
2024-01-24  adapt test for v0 symbol mangling  [Krasimir Georgiev, -1/+1]
No functional changes intended. Adapts the test to also work under new-symbol-mangling = true.
2024-01-23  Further Implement Power of Two Optimization  [Nicholas Thompson, -39/+26]