about summary refs log tree commit diff
path: root/compiler/rustc_codegen_ssa
AgeCommit message (Collapse)AuthorLines
2025-04-25set subsections_via_symbols for ld64 helper sectionsusamoi-0/+6
2025-04-23Make #![feature(let_chains)] bootstrap conditional in compiler/est31-1/+1
2025-04-20Rollup merge of #137953 - RalfJung:simd-intrinsic-masks, r=WaffleLapkinChris Denton-27/+3
simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first... Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`. I have extremely low confidence in the GCC part of this PR. See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)
2025-04-20simd intrinsics with mask: accept unsigned integer masksRalf Jung-27/+3
2025-04-18Rollup merge of #139615 - nnethercote:rm-name_or_empty, r=jdonszelmannMatthias Krüger-14/+19
Remove `name_or_empty` Another step towards #137978. r? ``@jdonszelmann``
2025-04-17Replace infallible `name_or_empty` methods with fallible `name` methods.Nicholas Nethercote-14/+19
I'm removing empty identifiers everywhere, because in practice they always mean "no identifier" rather than "empty identifier". (An empty identifier is impossible.) It's better to use `Option` to mean "no identifier" because you then can't forget about the "no identifier" possibility. Some specifics: - When testing an attribute for a single name, the commit uses the `has_name` method. - When testing an attribute for multiple names, the commit uses the new `has_any_name` method. - When using `match` on an attribute, the match arms now have `Some` on them. In the tests, we now avoid printing empty identifiers by not printing the identifier in the `error:` line at all, instead letting the carets point out the problem.
2025-04-16fix multiple `#[repr(align(N))]` on functionsFolkert de Vries-1/+2
2025-04-14Share part of the global_asm!() implementation between cg_ssa and cg_clifbjorn3-69/+69
2025-04-14Pass &mut self to codegen_global_asmbjorn3-4/+12
2025-04-14Make codegen_naked_asm publicbjorn3-2/+2
This allows it to be reused by codegen backends that don't use cg_ssa like cg_clif.
2025-04-14Pass MonoItemData to MonoItem::definebjorn3-10/+9
2025-04-14Move codegen_naked_asm call up into MonoItem::definebjorn3-8/+14
2025-04-14Make codegen_naked_asm retrieve the MIR Body itselfbjorn3-5/+6
2025-04-14Only require a CodegenCx for codegen_naked_asmbjorn3-9/+17
2025-04-14Don't begin defining a function when codegening a naked functionbjorn3-7/+7
While LLVM is rather permissive in this regards, some other codegen backends demand that once you declare a function for definition you actually define contents of the function, which doesn't happen for naked functions as we actually generate assembly for them.
2025-04-14Handle protected visibility in codegen_naked_asmbjorn3-4/+7
2025-04-14Use START_BLOCK in codegen_naked_asmbjorn3-2/+2
2025-04-14Auto merge of #124141 - ↵bors-1/+0
nnethercote:rm-Nonterminal-and-TokenKind-Interpolated, r=petrochenkov Remove `Nonterminal` and `TokenKind::Interpolated` A third attempt at this; the first attempt was #96724 and the second was #114647. r? `@ghost`
2025-04-11Auto merge of #139453 - compiler-errors:incr, r=jieyouxubors-29/+53
Prepend temp files with per-invocation random string to avoid temp filename conflicts https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management. Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name: ``` #[inline(never)] #[cfg(any(rpass1, rpass3))] fn a() -> i32 { 0 } #[cfg(any(cfail2))] fn a() -> i32 { 1 } fn main() { evil::evil(); assert_eq!(a(), 0); } mod evil { #[cfg(any(rpass1, rpass3))] pub fn evil() { unsafe { std::arch::asm!("/* */"); } } #[cfg(any(cfail2))] pub fn evil() { unsafe { std::arch::asm!("missing"); } } } ``` Session 1 (`rpass1`): * Type-check, borrow-check, etc. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd. * Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`. * Save-temps option means we don't delete `XXX.rgcu.o`. * Link the binary and stuff. * Finalize[^2] the working incremental session by renaming `.../s-...-working` to ` s-...-asjkdhsjakd` (some other finalized incr comp session dir name). Session 2 (`cfail2`): * Load artifacts from the previous *finalized* incremental session, namely the dep graph. * Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o`. **HERE IS THE PROBLEM**: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory. * Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the *previous* session is the last finalized session. Session 3 (`rpass3`): * Load artifacts from the previous *finalized* incremental session, namely the dep graph. NOTE that this is from session 1. * All the dep graph nodes are green since we are basically replaying session 1. * codegen object file `XXX.o`, which is detected as *reused* from session 1 since dep nodes were green. That means we **reuse** `XXX.o` which had been dirtied from session 2. * Link the binary and stuff. This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1. At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst **unsound**. This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example). --- This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not *deterministic*, but temporary files are transient anyways, so I don't believe this is a problem. That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc. Fixes https://github.com/rust-lang/rust/issues/139407 [^1]: https://github.com/rust-lang/rust/blob/175dcc7773d65c1b1542c351392080f48c05799f/compiler/rustc_fs_util/src/lib.rs#L60 [^2]: https://github.com/rust-lang/rust/blob/175dcc7773d65c1b1542c351392080f48c05799f/compiler/rustc_incremental/src/persist/fs.rs#L1-L40
2025-04-10Remove the use of Rayon iteratorsJohn Kåre Alsaker-3/+3
2025-04-10Auto merge of #139088 - spastorino:ergonomic-ref-counting-2, r=nikomatsakisbors-7/+72
Ergonomic ref counting: optimize away clones when possible This PR build on top of https://github.com/rust-lang/rust/pull/134797. It optimizes codegen of ergonomic ref-counting when the type being `use`d is only known to be copy after monomorphization. We avoid codening a clone and generate bitwise copy instead. RFC: https://github.com/rust-lang/rfcs/pull/3680 Tracking issue: https://github.com/rust-lang/rust/issues/132290 Project goal: https://github.com/rust-lang/rust-project-goals/issues/107 r? `@nikomatsakis` This PR could better sit on top of https://github.com/rust-lang/rust/pull/131650 but as it did not land yet I've decided to just do minimal changes. It may be the case that doing what I'm doing regress the performance and we may need to go the full route of https://github.com/rust-lang/rust/pull/131650. cc `@saethlin` in this regard.
2025-04-08Rollup merge of #139098 - scottmcm:assert-impossible-tags, r=WaffleLapkinStuart Cook-1/+31
Tell LLVM about impossible niche tags I was trying to find a better way of emitting discriminant calculations, but sadly had no luck. So here's a fairly small PR with the bits that did seem worth bothering: 1. As the [`TagEncoding::Niche` docs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/enum.TagEncoding.html#variant.Niche) describe, it's possible to end up with a dead value in the input that's not already communicated via the range parameter attribute nor the range load metadata attribute. So this adds an `llvm.assume` in non-debug mode to tell LLVM about that. (That way it can tell that the sides of the `select` have disjoint possible values.) 2. I'd written a bunch more tests, or at least made them parameterized, in the process of trying things out, so this checks in those tests to hopefully help future people not trip on the same weird edge cases, like when the tag type is `i8` but yet there's still a variant index and discriminant of `258` which doesn't fit in that tag type because the enum is really weird.
2025-04-07Address PR feedbackScott McMurray-1/+3
2025-04-07Prepend temp files with a string per invocation of rustcMichael Goulet-16/+47
2025-04-07Simplify temp path creation a bitMichael Goulet-27/+20
2025-04-07Only clone mir body if tcx.features().ergonomic_clones()Santiago Pastorino-8/+10
2025-04-07Optimize codegen of use values that are copy post monomorphizationSantiago Pastorino-4/+66
2025-04-07Use a local var for tcxSantiago Pastorino-4/+5
2025-04-06Auto merge of #138947 - madsmtm:refactor-apple-versions, r=Noratriebbors-159/+13
Refactor Apple version handling in the compiler Move various Apple version handling code in the compiler out `rustc_codegen_ssa` and into a place where it can be accessed by `rustc_attr_parsing`, which I found to be necessary when doing https://github.com/rust-lang/rust/pull/136867. Thought I'd split it out to make it easier to land, and to make further changes like https://github.com/rust-lang/rust/pull/131477 have fewer conflicts / PR dependencies. There should be no functional changes in this PR. `@rustbot` label O-apple r? rust-lang/compiler
2025-04-05Tell LLVM about impossible niche tagsScott McMurray-0/+28
2025-04-05Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obkStuart Cook-4/+28
Autodiff batching Enzyme supports batching, which is especially known from the ML side when training neural networks. There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights. That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations. Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size, and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of ```rs for i in 0..100 { df(x[i], y[i], 1234); } ``` You can now do ```rs for i in 0..100.step_by(4) { df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234); } ``` which will give the same results, but allows better compiler optimizations. See the testcase for details. There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days. I will also add more tests for both modes. For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU I'll also add some other docs to the dev guide and user docs in another PR. r? ghost Tracking: - https://github.com/rust-lang/rust/issues/124509 - https://github.com/rust-lang/rust/issues/135283
2025-04-04refactor: Move env parsing of deployment target to rustc_sessionMads Marquart-70/+5
2025-04-04refactor: Move Apple OSVersion (back) to rustc_targetMads Marquart-97/+16
Also convert OSVersion into a proper struct for better type-safety.
2025-04-04Rollup merge of #138949 - madsmtm:rename-to-darwin, r=WaffleLapkinMatthias Krüger-26/+26
Rename `is_like_osx` to `is_like_darwin` Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.). ``@rustbot`` label O-apple r? compiler
2025-04-03add autodiff batching middle-endManuel Drehwald-4/+28
2025-04-03add the autodiff batch mode frontendManuel Drehwald-1/+1
2025-04-03Make LevelAndSource a structOli Scherer-5/+4
2025-04-02Remove `recursion_limit` increases.Nicholas Nethercote-1/+0
These are no longer needed now that `Nonterminal` is gone.
2025-03-31Store only a metadata stub into `rlibs` and `dylibs` with `-Zembed-metadata=no`Jakub Beránek-3/+3
2025-03-28use `slice::contains` where applicableYotam Ofek-2/+2
2025-03-27Rollup merge of #139010 - madsmtm:parse-xcrun-better, r=wesleywiserJacob Pratt-82/+281
Improve `xcrun` error handling The compiler invokes `xcrun` on macOS when linking Apple targets, to find the Xcode SDK which contain all the necessary linker stubs. The error messages that `xcrun` outputs aren't always that great though, so this PR tries to improve that by providing extra context when an error occurs. Fixes https://github.com/rust-lang/rust/issues/56829. Fixes https://github.com/rust-lang/rust/issues/84534. Part of https://github.com/rust-lang/rust/issues/129432. See also the alternative https://github.com/rust-lang/rust/pull/131433. Tested on: - `x86_64-apple-darwin`, MacBook Pro running Mac OS X 10.12.6 - With no tooling installed - With Xcode 9.2 - With Xcode 9.2 Commandline Tools - `aarch64-apple-darwin`, MacBook M2 Pro running macOS 14.7.4 - With Xcode 13.4.1 - With Xcode 16.2 - Inside `nix-shell -p xcbuild` (nixpkgs' `xcrun` shim) - `aarch64-apple-darwin`, VM running macOS 15.3.1 - With no tooling installed - With Xcode 16.2 Commandline Tools ``@rustbot`` label O-apple r? compiler CC ``@BlackHoleFox`` ``@thomcc``
2025-03-27Emit better error messages when invoking xcrunMads Marquart-49/+263
Also allow the SDK path to be non-UTF-8.
2025-03-27Invoke xcrun inside sess.timeMads Marquart-1/+1
It can be a fairly expensive operation when the output is not cached, so it's nice to get some visibility into the runtime cost.
2025-03-27refactor: Move Apple SDK names to rustc_codegen_ssa::back::appleMads Marquart-36/+21
2025-03-26Always emit native-static-libs note, even if it is emptyMads Marquart-9/+5
2025-03-26Auto merge of #138893 - klensy:thorin-0.9, r=Mark-Simulacrumbors-1/+1
bump thorin to 0.9 to drop duped deps Bumps `thorin`, removing duped deps. This also changes features for hashbrown: ``` hashbrown v0.15.2 `-- indexmap v2.7.0 |-- object v0.36.7 |-- wasmparser v0.219.1 |-- wasmparser v0.223.0 `-- wit-component v0.223.0 |-- indexmap feature "default" |-- indexmap feature "serde" `-- indexmap feature "std" |-- hashbrown feature "default-hasher" | |-- object v0.36.7 (*) | `-- wasmparser v0.223.0 (*) |-- hashbrown feature "nightly" | |-- rustc_data_structures v0.0.0 | `-- rustc_query_system v0.0.0 `-- hashbrown feature "serde" `-- wasmparser feature "serde" ``` to ``` hashbrown v0.15.2 `-- indexmap v2.7.0 |-- object v0.36.7 |-- wasmparser v0.219.1 |-- wasmparser v0.223.0 `-- wit-component v0.223.0 |-- indexmap feature "default" |-- indexmap feature "serde" `-- indexmap feature "std" |-- hashbrown feature "allocator-api2" | `-- hashbrown feature "default" |-- hashbrown feature "default" (*) |-- hashbrown feature "default-hasher" | |-- object v0.36.7 (*) | `-- wasmparser v0.223.0 (*) | `-- hashbrown feature "default" (*) |-- hashbrown feature "equivalent" | `-- hashbrown feature "default" (*) |-- hashbrown feature "inline-more" | `-- hashbrown feature "default" (*) |-- hashbrown feature "nightly" | |-- rustc_data_structures v0.0.0 | `-- rustc_query_system v0.0.0 |-- hashbrown feature "raw-entry" | `-- hashbrown feature "default" (*) `-- hashbrown feature "serde" `-- wasmparser feature "serde" ``` To be safe, as this can be perf-sensitive: `@bors` rollup=never
2025-03-26Auto merge of #138956 - jhpratt:rollup-6g7ppwd, r=jhprattbors-12/+17
Rollup of 11 pull requests Successful merges: - #138128 (Stabilize `#![feature(precise_capturing_in_traits)]`) - #138834 (Group test diffs by stage in post-merge analysis) - #138867 (linker: Fix staticlib naming for UEFI) - #138874 (Batch mark waiters as unblocked when resuming in the deadlock handler) - #138875 (Trusty: Fix build for anonymous pipes and std::sys::process) - #138877 (Ignore doctests only in specified targets) - #138885 (Fix ui pattern_types test for big-endian platforms) - #138905 (Add target maintainer information for powerpc64-unknown-linux-musl) - #138911 (Allow defining opaques in statics and consts) - #138917 (rustdoc: remove useless `Symbol::is_empty` checks.) - #138945 (Override PartialOrd methods for bool) r? `@ghost` `@rustbot` modify labels: rollup
2025-03-25Rollup merge of #138867 - petrochenkov:linkfix, r=nnethercoteJacob Pratt-12/+17
linker: Fix staticlib naming for UEFI And one minor refactoring in the second commit.
2025-03-26Auto merge of #138601 - RalfJung:wasm-abi-fcw, r=alexcrichtonbors-2/+2
add FCW to warn about wasm ABI transition See https://github.com/rust-lang/rust/issues/122532 for context: the "C" ABI on wasm32-unk-unk will change. The goal of this lint is to warn about any function definition and calls whose behavior will be affected by the change. My understanding is the following: - scalar arguments are fine - including 128 bit types, they get passed as two `i64` arguments in both ABIs - `repr(C)` structs (recursively) wrapping a single scalar argument are fine (unless they have extra padding due to over-alignment attributes) - all return values are fine `@bjorn3` `@alexcrichton` `@Manishearth` is that correct? I am making this a "show up in future compat reports" lint to maximize the chances people become aware of this. OTOH this likely means warnings for most users of Diplomat so maybe we shouldn't do this? IIUC, wasm-bindgen should be unaffected by this lint as they only pass scalar types as arguments. Tracking issue: https://github.com/rust-lang/rust/issues/138762 Transition plan blog post: https://github.com/rust-lang/blog.rust-lang.org/pull/1531 try-job: dist-various-2
2025-03-25Rename `is_like_osx` to `is_like_darwin`Mads Marquart-26/+26