path: root/tests/codegen
Age | Commit message | Author | Lines
2025-04-11 | Auto merge of #139430 - scottmcm:polymorphic-array-into-iter, r=cuviper | bors | -3/+32
Polymorphize `array::IntoIter`'s iterator impl Today we emit all the iterator methods for every different array width. That's wasteful since the actual array length never even comes into it -- the indices used are from the separate `alive: IndexRange` field, not even the `N` const param. This PR switches things so that an `array::IntoIter<T, N>` stores a `PolymorphicIter<[MaybeUninit<T>; N]>`, which we *unsize* to `PolymorphicIter<[MaybeUninit<T>]>` and call methods on that non-`Sized` type for all the iterator methods. That also necessarily makes the layout consistent between the different lengths of arrays, because of the unsizing. Compare that to today <https://rust.godbolt.org/z/Prb4xMPrb>, where different widths can't even be deduped because the offset to the indices is different for different array widths.
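The unsizing idea described above can be sketched in a few lines. This is a minimal illustration, not the standard library's code: `Poly`, `next_sized`, and `sum_all` are hypothetical names, and the element type is fixed to `u32` instead of `MaybeUninit<T>` to keep the sketch safe and short.

```rust
use std::ops::Range;

// Hypothetical stand-in for the PR's `PolymorphicIter`; field names are illustrative.
// The length-dependent data is the last field so the struct can be unsized.
struct Poly<D: ?Sized> {
    alive: Range<usize>,
    data: D,
}

// Written once for the slice form, so it is not instantiated per array length.
impl Poly<[u32]> {
    fn next(&mut self) -> Option<u32> {
        let i = self.alive.next()?;
        Some(self.data[i])
    }
}

// Each array length only needs this thin wrapper, which unsizes and forwards.
fn next_sized<const N: usize>(it: &mut Poly<[u32; N]>) -> Option<u32> {
    let unsized_it: &mut Poly<[u32]> = it; // unsizing coercion
    unsized_it.next()
}

// Drains an array through the wrapper and sums its elements.
fn sum_all<const N: usize>(arr: [u32; N]) -> u32 {
    let mut it = Poly { alive: 0..N, data: arr };
    let mut total = 0;
    while let Some(v) = next_sized(&mut it) {
        total += v;
    }
    total
}
```

Because `alive` sits before the unsized tail, its offset does not depend on `N`, which is the layout property the message relies on.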
2025-04-11 | Auto merge of #139578 - ferrocene:pa-compiletest-edition, r=jieyouxu | bors | -28/+39
Fix breakage when running compiletest with `--test-args=--edition=2015`

Compiletest has an `--edition` flag to change the default edition tests are run with. Unfortunately no test suite successfully executes when that flag is passed. If the edition is set to something greater than 2015 the breakage is expected, since the test suite currently supports only edition 2015 (Ferrous Systems will open an MCP about fixing that soonish). Surprisingly, the test suite is also broken if `--edition=2015` is passed to compiletest. This PR focuses on fixing the latter.

This PR fixes the two categories of failures happening when `--edition=2015` is passed:

* Some edition-specific tests set their edition through `//@ compile-flags` instead of `//@ edition`. Compiletest doesn't parse the compile flags, so it would see no `//@ edition` and add another `--edition` flag, leading to a rustc error.
* Compiletest would add the edition after `//@ compile-flags`, while some tests depend on flags passed to `//@ compile-flags` being the last flags in the rustc invocation.

Note that for the first category, I opted to manually go and replace all `//@ compile-flags` setting an edition with an explicit `//@ edition`. We could've changed compiletest to instead check whether an edition was set in `//@ compile-flags`, but I thought it was better to enforce a consistent way to set the edition in tests.

I also added the edition to the stamp, so that changing `--edition` results in tests being re-executed.

r? `@jieyouxu`
2025-04-11 | didn't catch this test failure, whoops | Pietro Albini | -5/+5
2025-04-11 | Rollup merge of #137447 - folkertdev:simd-extract-insert-dyn, r=scottmcm | Stuart Cook | -0/+75
add `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}` fixes https://github.com/rust-lang/rust/issues/137372 adds `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`, which contrary to their non-dyn counterparts allow a non-const index. Many platforms (but notably not x86_64 or aarch64) have dedicated instructions for this operation, which stdarch can emit with this change. Future work is to also make the `Index` operation on the `Simd` type emit this operation, but the intrinsic can't be used directly. We'll need some MIR shenanigans for that. r? `@ghost`
2025-04-10 | Auto merge of #137412 - scottmcm:redo-swap, r=cuviper | bors | -45/+118
Ensure `swap_nonoverlapping` is really always untyped

This replaces #134954, which was arguably overcomplicated.

## Fixes #134713

Actually using the type passed to `ptr::swap_nonoverlapping` for anything other than its size + align turns out to not work, so this goes back to always erasing the types down to just bytes. (Except in `const`, which keeps doing the same thing as before to preserve `@RalfJung`'s fix from #134689)

## Fixes #134946

I'd previously moved the swapping to use auto-vectorization *on bytes*, but someone pointed out on Discord that the tail loop handling from that left a whole bunch of byte-by-byte swapping around. This goes back to manual tail handling to avoid that, then still triggers auto-vectorization on pointer-width values. (So you'll see `<4 x i64>` on `x86-64-v3` for example.)
2025-04-10 | add `simd_insert_dyn` and `simd_extract_dyn` | Folkert de Vries | -0/+75
2025-04-10 | Auto merge of #139088 - spastorino:ergonomic-ref-counting-2, r=nikomatsakis | bors | -0/+55
Ergonomic ref counting: optimize away clones when possible This PR builds on top of https://github.com/rust-lang/rust/pull/134797. It optimizes codegen of ergonomic ref-counting when the type being `use`d is only known to be `Copy` after monomorphization. We avoid codegenning a clone and generate a bitwise copy instead. RFC: https://github.com/rust-lang/rfcs/pull/3680 Tracking issue: https://github.com/rust-lang/rust/issues/132290 Project goal: https://github.com/rust-lang/rust-project-goals/issues/107 r? `@nikomatsakis` This PR could better sit on top of https://github.com/rust-lang/rust/pull/131650, but as that has not landed yet I've decided to make only minimal changes. It may be that this approach regresses performance and that we need to go the full route of https://github.com/rust-lang/rust/pull/131650. cc `@saethlin` in this regard.
2025-04-10 | replace `//@ compile-flags: --edition` with `//@ edition` | Pietro Albini | -23/+34
2025-04-09 | PR feedback | Scott McMurray | -6/+12
2025-04-09 | skip `tests/codegen/swap-small-types` when debug assertions are on | Scott McMurray | -0/+1
In `swap_nonoverlapping_short` there's a new `debug_assert!`, and if that's enabled then the `alloca`s don't optimize out.
2025-04-09 | Ensure `swap_nonoverlapping` is really always untyped | Scott McMurray | -45/+117
2025-04-09 | Speed up `String::push` and `String::insert` | lincot | -0/+11
Improve performance of `String` methods by avoiding unnecessary memcpy for the character bytes, with added codegen check to ensure compliance.
2025-04-08 | Rollup merge of #139098 - scottmcm:assert-impossible-tags, r=WaffleLapkin | Stuart Cook | -19/+456
Tell LLVM about impossible niche tags

I was trying to find a better way of emitting discriminant calculations, but sadly had no luck. So here's a fairly small PR with the bits that did seem worth bothering:

1. As the [`TagEncoding::Niche` docs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/enum.TagEncoding.html#variant.Niche) describe, it's possible to end up with a dead value in the input that's not already communicated via the range parameter attribute nor the range load metadata attribute. So this adds an `llvm.assume` in non-debug mode to tell LLVM about that. (That way it can tell that the sides of the `select` have disjoint possible values.)
2. I'd written a bunch more tests, or at least made them parameterized, in the process of trying things out, so this checks in those tests to hopefully help future people not trip on the same weird edge cases, like when the tag type is `i8` but yet there's still a variant index and discriminant of `258` which doesn't fit in that tag type because the enum is really weird.
2025-04-07 | Address PR feedback | Scott McMurray | -0/+16
2025-04-07 | Add codegen test to be sure we get rid of unneeded clones after monomorphization | Santiago Pastorino | -0/+55
2025-04-07 | Rollup merge of #139465 - EnzymeAD:autodiff-sret, r=oli-obk | Stuart Cook | -0/+45
add sret handling for scalar autodiff r? `@oli-obk` Fixing one of the TODOs which I left in my previous batching PR. This one handles sret for scalar autodiff. `sret` mostly shows up when we try to return a lot of scalar floats. People often start testing autodiff with toy functions which just use a few scalars as inputs and outputs, and those were the most likely to be affected by this issue. So this fix should hopefully make learning/teaching a bit easier. Tracking: - https://github.com/rust-lang/rust/issues/124509
2025-04-07 | move old tests, add sret test | Manuel Drehwald | -0/+45
2025-04-06 | update/bless tests | Bennet Bleßmann | -24/+19
2025-04-06 | Rollup merge of #139438 - Zalathar:fix-test-122600, r=scottmcm | Stuart Cook | -0/+2
Prevent a test from seeing forbidden numbers in the rustc version The final CHECK-NOT directive in this test was able to see past the end of the enclosing function, and find the substring `753` or `754` in the git hash in the rustc version number, causing false failures in CI whenever the git hash happens to contain those digits in sequence. Adding an explicit check for `ret` prevents the CHECK-NOT directive from seeing past the end of the function. --- Manually tested by adding `// CHECK-NOT: rustc` after the existing CHECK-NOT directives, and demonstrating that the new check prevents it from seeing the rustc version string.
2025-04-05 | LLVM18 compatibility fixes in the tests | Scott McMurray | -3/+6
2025-04-05 | Tell LLVM about impossible niche tags | Scott McMurray | -19/+437
2025-04-06 | Prevent a test from seeing forbidden numbers in the rustc version | Zalathar | -0/+2
The final CHECK-NOT directive in this test was able to see past the end of the enclosing function, and find the substring 753 or 754 in the git hash in the rustc version number, causing false failures in CI. Adding an explicit check for `ret` prevents the CHECK-NOT directive from seeing past the end of the function.
2025-04-05 | Polymorphize `array::IntoIter`'s iterator impl | Scott McMurray | -1/+24
2025-04-05 | Update the minimum external LLVM to 19 | Josh Stone | -111/+14
2025-04-05 | Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco | Matthias Krüger | -0/+19
KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 | KCFI: Add KCFI arity indicator support | Ramon de C Valle | -0/+19
Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 | Rollup merge of #138024 - reitermarkus:unicode-panic-optimization, r=ibraheemdev | Stuart Cook | -0/+14
Allow optimizing out `panic_bounds_check` in Unicode checks. Allow optimizing out `panic_bounds_check` in Unicode checks. For context, see https://github.com/japaric/ufmt/issues/52#issuecomment-2699207241.
2025-04-05 | Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk | Stuart Cook | -2/+118
Autodiff batching

Enzyme supports batching, which is especially known from the ML side when training neural networks. There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights. That's quite inefficient, so what you normally do is pass in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.

Enzyme supports batching in two ways. The first one (which I implemented here) just accepts a batch size, and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of

```rs
for i in 0..100 {
    df(x[i], y[i], 1234);
}
```

You can now do

```rs
for i in (0..100).step_by(4) {
    df(x[i+0], x[i+1], x[i+2], x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```

which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days. I will also add more tests for both modes.

For anyone preferring a more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU

I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:
- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 | Rollup merge of #136457 - calder:master, r=tgross35 | Stuart Cook | -13/+207
Expose algebraic floating point intrinsics

# Problem

A stable Rust implementation of a simple dot product is 8x slower than C++ on modern x86-64 CPUs. The root cause is an inability to let the compiler reorder floating point operations for better vectorization. See https://github.com/calder/dot-bench for benchmarks. Measurements below were performed on an i7-10875H.

### C++: 10us ✅

With Clang 18.1.3 and `-O2 -march=haswell`:

<table>
<tr> <th>C++</th> <th>Assembly</th> </tr>
<tr>
<td>
<pre lang="cc">
float dot(float *a, float *b, size_t len) {
    #pragma clang fp reassociate(on)
    float sum = 0.0;
    for (size_t i = 0; i < len; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
</pre>
</td>
<td> <img src="https://github.com/user-attachments/assets/739573c0-380a-4d84-9fd9-141343ce7e68" /> </td>
</tr>
</table>

### Nightly Rust: 10us ✅

With rustc 1.86.0-nightly (8239a37f9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:

<table>
<tr> <th>Rust</th> <th>Assembly</th> </tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        sum = fadd_algebraic(sum, fmul_algebraic(a[i], b[i]));
    }
    sum
}
</pre>
</td>
<td> <img src="https://github.com/user-attachments/assets/9dcf953a-2cd7-42f3-bc34-7117de4c5fb9" /> </td>
</tr>
</table>

### Stable Rust: 84us ❌

With rustc 1.84.1 (e71f9a9a9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:

<table>
<tr> <th>Rust</th> <th>Assembly</th> </tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        sum += a[i] * b[i];
    }
    sum
}
</pre>
</td>
<td> <img src="https://github.com/user-attachments/assets/936a1f7e-33e4-4ff8-a732-c3cdfe068dca" /> </td>
</tr>
</table>

# Proposed Change

Add `core::intrinsics::f*_algebraic` wrappers to `f16`, `f32`, `f64`, and `f128` gated on a new `float_algebraic` feature.

# Alternatives Considered

https://github.com/rust-lang/rust/issues/21690 has a lot of good discussion of various options for supporting fast math in Rust, but is still open a decade later because any choice that opts in more than individual operations is ultimately contrary to Rust's design principles. In the meantime, processors have evolved and we're leaving major performance on the table by not supporting vectorization. We shouldn't make users choose between an unstable compiler and an 8x performance hit.

# References

* https://github.com/rust-lang/rust/issues/21690
* https://github.com/rust-lang/libs-team/issues/532
* https://github.com/rust-lang/rust/issues/136469
* https://github.com/calder/dot-bench
* https://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231ps

try-job: x86_64-gnu-nopt
try-job: x86_64-gnu-aux
2025-04-04 | Expose algebraic floating point intrinsics | Calder Coalson | -13/+207
2025-04-04 | add new tests for autodiff batching and update old ones | Manuel Drehwald | -2/+118
2025-04-03 | Auto merge of #132527 - DianQK:gvn-stmt-iter, r=oli-obk | bors | -16/+24
gvn: Invalid dereferences for all non-local mutations Fixes #132353. This PR removes the computation of values by traversing SSA locals through `for_each_assignment_mut`. Because the `for_each_assignment_mut` traversal skips statements which have side effects, such as dereference assignments, the computation may be unsound. Instead of `for_each_assignment_mut`, we compute values by traversing in reverse postorder. Because we compute and use the symbolic representation of values on the fly, I invalidate all old values when encountering a dereference assignment. The current approach does not prevent the optimization of a clone to a copy. In the future, we may add an alias model, or dominance information for dereference assignments, or SSA form to help GVN. r? cjgillot cc `@jieyouxu` #132356 cc `@RalfJung` #133474
2025-04-03 | Remove `unsound-mir-opts` for `simplify_aggregate_to_copy` | dianqk | -16/+24
2025-04-03 | Rollup merge of #139145 - okaneco:safe_splits, r=Amanieu | Matthias Krüger | -0/+24
slice: Remove some uses of unsafe in first/last chunk methods

Remove unsafe `split_at_unchecked` and `split_at_mut_unchecked` in some slice `split_first_chunk`/`split_last_chunk` methods. Replace those calls with the safe `split_at` and `split_at_checked` where applicable. Add codegen tests to check for no panics when calculating the last chunk index using `checked_sub` and `split_at`.

Better viewed with whitespace disabled in diff view

---

The unchecked calls are mostly manual implementations of the safe methods, but with the safety condition negated from `mid <= len` to `len < mid`.

```rust
if self.len() < N {
    None
} else {
    // SAFETY: We manually verified the bounds of the split.
    let (first, tail) = unsafe { self.split_at_unchecked(N) };

    // Or for the last_chunk methods
    let (init, last) = unsafe { self.split_at_unchecked(self.len() - N) };
```

Unsafe is still needed for the pointer array casts. Their safety comments are unmodified.
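For comparison, a safe formulation along the lines the message describes might look like the sketch below. These are hypothetical free functions over `u8` slices, not the methods in `core`; in particular, the slice-to-array conversion here uses `try_into`, whereas the message says the real code keeps an unsafe pointer cast for that step.

```rust
use std::convert::TryInto;

// Hypothetical sketch of a safe `split_last_chunk`: `checked_sub` yields `None`
// when the slice is shorter than N, so `split_at` is never called out of bounds.
fn split_last_chunk_sketch<const N: usize>(s: &[u8]) -> Option<(&[u8], &[u8; N])> {
    let index = s.len().checked_sub(N)?;
    let (init, last) = s.split_at(index);
    // `last` has exactly N elements here, so this conversion succeeds.
    let last: &[u8; N] = last.try_into().ok()?;
    Some((init, last))
}

// Hypothetical sketch of a safe `split_first_chunk` using `split_at_checked`,
// which returns `None` instead of panicking when N exceeds the length.
fn split_first_chunk_sketch<const N: usize>(s: &[u8]) -> Option<(&[u8; N], &[u8])> {
    let (first, tail) = s.split_at_checked(N)?;
    let first: &[u8; N] = first.try_into().ok()?;
    Some((first, tail))
}
```

The codegen tests mentioned in the message check that, in the real methods, the optimizer removes the panic path of the safe calls; this sketch makes no claim about the code generated for it.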
2025-04-01 | Rollup merge of #139188 - durin42:llvm-21-LintPass, r=dianqk | Matthias Krüger | -2/+2
PassWrapper: adapt for llvm/llvm-project@94122d58fc77079a291a3d008914006cb509d9db

We also have to remove the LLVM argument in cast-target-abi.rs for LLVM 21. I'm not really sure what the best approach here is since that test already uses revisions. We could also fork the test into a copy for LLVM 19-20 and another for LLVM 21, but what I did for now was drop the lint-abort-on-error flag to LLVM figuring that some coverage was better than none, but I'm happy to change this if that was a bad direction.

r? dianqk
@rustbot label llvm-main
2025-03-31 | PassWrapper: adapt for llvm/llvm-project@94122d58fc77079a291a3d008914006cb509d9db | Augie Fackler | -2/+2
We also have to remove the LLVM argument in cast-target-abi.rs for LLVM 21. I'm not really sure what the best approach here is since that test already uses revisions. We could also fork the test into a copy for LLVM 19-20 and another for LLVM 21, but what I did for now was drop the lint-abort-on-error flag to LLVM figuring that some coverage was better than none, but I'm happy to change this if that was a bad direction. The above also applies for ffi-out-of-bounds-loads.rs.

r? dianqk
@rustbot label llvm-main
2025-03-31 | Add tests for LLVM 20 slice bounds check optimization | reez12g | -0/+37
2025-03-30 | slice: Remove some uses of unsafe in first/last chunk methods | okaneco | -0/+24
Remove unsafe `split_at_unchecked` and `split_at_mut_unchecked` in some slice `split_first_chunk`/`split_last_chunk` methods. Replace those calls with the safe `split_at` and `split_at_checked` where applicable. Add codegen tests to check for no panics when calculating the last chunk index using `checked_sub` and `split_at`
2025-03-28 | Auto merge of #138503 - bjorn3:string_merging, r=tmiasko | bors | -24/+24
Avoid wrapping constant allocations in packed structs when not necessary This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes. try-job: armhf-gnu
2025-03-28 | Avoid wrapping constant allocations in packed structs when not necessary | bjorn3 | -24/+24
This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes.
2025-03-26 | Rollup merge of #138818 - khuey:138198, r=jieyouxu | Stuart Cook | -0/+18
Don't produce debug information for compiler-introduced-vars when desugaring assignments. An assignment such as (a, b) = (b, c); desugars to the HIR { let (lhs, lhs) = (b, c); a = lhs; b = lhs; }; The repeated `lhs` leads to multiple Locals assigned to the same DILocalVariable. Rather than attempting to fix that, get rid of the debug info for these bindings that don't even exist in the program to begin with. Fixes #138198 r? `@jieyouxu`
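As a source-level illustration of the construct involved, here is a small sketch of a destructuring assignment next to a hand-written equivalent. The function names are hypothetical, and the temporaries are called `lhs0` and `lhs1` only because user code cannot repeat a binding name the way the compiler-generated `lhs` bindings do.

```rust
// Destructuring assignment: both targets are existing variables, not new bindings.
fn rotate(mut a: i32, mut b: i32, c: i32) -> (i32, i32) {
    (a, b) = (b, c);
    (a, b)
}

// Hand-written approximation of the desugaring quoted above, with distinct
// temporary names standing in for the compiler's repeated `lhs`.
fn rotate_desugared(mut a: i32, mut b: i32, c: i32) -> (i32, i32) {
    {
        let (lhs0, lhs1) = (b, c);
        a = lhs0;
        b = lhs1;
    }
    (a, b)
}
```

Neither temporary corresponds to anything the user wrote, which is why dropping their debug info loses nothing a debugger user could have named.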
2025-03-25 | Auto merge of #138634 - saethlin:repeated-uninit, r=scottmcm,oli-obk | bors | -0/+21
Lower to a memset(undef) when Rvalue::Repeat repeats uninit Fixes https://github.com/rust-lang/rust/issues/138625. It is technically correct to just do nothing. But if we actually do nothing, we may miss that this is de-initializing something, so instead we just lower to a single memset that writes undef. This is still superior to the memcpy loop, in both quality of code we hand to the backend and LLVM's final output.
2025-03-24 | Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm | bors | -26/+37
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics. These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19. I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something. r? `@scottmcm`
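A short sketch of what a three-way comparison means for signed versus unsigned operands, which is the distinction behind the `s` and `u` variants. It uses `Ord::cmp` on integers; that this is the source construct reaching the `BinOp::Cmp` lowering is an assumption here, and the function names are hypothetical.

```rust
use std::cmp::Ordering;

// Signed three-way comparison: -1 orders below 1.
fn signed_cmp(a: i32, b: i32) -> Ordering {
    a.cmp(&b)
}

// Unsigned three-way comparison: the same bit pattern as -1 (u32::MAX) orders above 1.
fn unsigned_cmp(a: u32, b: u32) -> Ordering {
    a.cmp(&b)
}
```

`Ordering` casts to `-1`, `0`, or `1` as an `i8`, which is the same three-valued shape the message describes for the intrinsic's `i8` result.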
2025-03-21 | Don't produce debug information for compiler-introduced-vars when desugaring assignments. | Kyle Huey | -0/+18
An assignment such as (a, b) = (b, c); desugars to the HIR { let (lhs, lhs) = (b, c); a = lhs; b = lhs; }; The repeated `lhs` leads to multiple Locals assigned to the same DILocalVariable. Rather than attempting to fix that, get rid of the debug info for these bindings that don't even exist in the program to begin with. Fixes #138198
2025-03-19 | Lower to a memset(undef) when Rvalue::Repeat repeats uninit | Ben Kimock | -0/+21
2025-03-19 | Use explicit cpu in some asm and codegen tests. | Jesus Checa Hidalgo | -2/+2
Some tests expect to be compiled for a specific CPU or require certain target features to be present (or absent). These tests work fine with default CPUs but fail in downstream builds for RHEL and Fedora, where we use non-default CPUs such as z13 on s390x, pwr9 on ppc64le, or x86-64-v2/x86-64-v3 on x86_64.
2025-03-17 | Auto merge of #127173 - bjorn3:mangle_rustc_std_internal_symbol, r=wesleywiser,jieyouxu | bors | -6/+6
Mangle rustc_std_internal_symbols functions

This reduces the risk of issues when using a staticlib or rust dylib compiled with a different rustc version in a rust program. Currently this will either (in the case of staticlib) cause a linker error due to duplicate symbol definitions, or (in the case of rust dylibs) cause rustc_std_internal_symbols functions to be silently overridden. As rust gets more commonly used inside the implementation of libraries consumed with a C interface (like Spidermonkey, Ruby YJIT, which currently has to do partial linking of all rust code to hide all symbols not part of the C api, and the Rusticl OpenCL implementation in mesa) this is becoming much more of an issue. With this PR the only symbols remaining with an unmangled name are rust_eh_personality (LLVM doesn't allow renaming it) and `__rust_no_alloc_shim_is_unstable`.

Helps mitigate https://github.com/rust-lang/rust/issues/104707

try-job: aarch64-gnu-debug
try-job: aarch64-apple
try-job: x86_64-apple-1
try-job: x86_64-mingw-1
try-job: i686-mingw-1
try-job: x86_64-msvc-1
try-job: i686-msvc-1
try-job: test-various
try-job: armhf-gnu
2025-03-17 | Rollup merge of #138349 - 1c3t3a:external-weak-cfi, r=rcvalle | Matthias Krüger | -0/+24
Emit function declarations for functions with `#[linkage="extern_weak"]`

Currently, when declaring an extern weak function in Rust, we use the following syntax:

```rust
unsafe extern "C" {
    #[linkage = "extern_weak"]
    static FOO: Option<unsafe extern "C" fn() -> ()>;
}
```

This allows runtime-checking the extern weak symbol through the Option. When emitting LLVM-IR, the Rust compiler currently emits this static as an i8, and a pointer that is initialized with the value of the global i8 and represents the nullability, e.g.

```
@FOO = extern_weak global i8
@_rust_extern_with_linkage_FOO = internal global ptr @FOO
```

This approach does not work well with CFI, where we need to attach CFI metadata to a concrete function declaration, which was pointed out in https://github.com/rust-lang/rust/issues/115199. This change switches to emitting a proper function declaration instead of a global i8. This allows CFI to work for extern_weak functions. Example:

```
@_rust_extern_with_linkage_FOO = internal global ptr @FOO
...
declare !type !61 !type !62 !type !63 !type !64 extern_weak void @FOO(double) unnamed_addr #6
```

We keep initializing the Rust internal symbol with the function declaration, which preserves the correct behavior for runtime checking the Option.

r? `@rcvalle`
cc `@jakos-sec`

try-job: test-various
2025-03-17 | Remove implicit #[no_mangle] for #[rustc_std_internal_symbol] | bjorn3 | -6/+6
2025-03-17 | Stabilize asm_goto | Gary Guo | -1/+1