about summary refs log tree commit diff
path: root/compiler/rustc_codegen_ssa/src
AgeCommit message (Collapse)AuthorLines
2025-07-27Auto merge of #144347 - scottmcm:ssa-enums-v0, r=WaffleLapkinbors-64/+79
No longer need `alloca`s for consuming `Result<!, i32>` and similar In optimized builds GVN gets rid of these already, but in `opt-level=0` we actually make `alloca`s for this, which particularly impacts `?`-style things that use actually-only-one-variant types like this. While doing so, rewrite `LocalAnalyzer::process_place` to be non-recursive, solving a 6+ year old FIXME. r? codegen
2025-07-26Auto merge of #143860 - scottmcm:transmute-always-rvalue, r=WaffleLapkinbors-96/+66
Let `codegen_transmute_operand` just handle everything When combined with rust-lang/rust#143720, this means `rvalue_creates_operand` can just return `true` for *every* `Rvalue`. (A future PR could consider removing it, though just letting it optimize out is fine for now.) It's nicer anyway, IMHO, because it avoids needing the layout checks to be consistent in the two places, and thus is an overall reduction in code. Plus it's a more helpful building block when used in other places this way. (TBH, it probably would have been better to have it this way the whole time, but I clearly didn't understand `rvalue_creates_operand` when I originally wrote rust-lang/rust#109843.)
2025-07-26Remove support for -Zcombine-cgubjorn3-42/+6
Nobody seems to actually use this, while still adding some extra complexity to the already rather complex codegen coordinator code. It is also not supported by any backend other than the LLVM backend.
2025-07-26Implement support for explicit tail calls in the MIR block builders and the ↵Joel Wejdenstål-11/+74
LLVM codegen backend.
2025-07-25Improve coordinator channel handlingbjorn3-40/+25
Remove usage of Any, reduce visibility of fields and remove unused backend arguments.
2025-07-25Rollup merge of #144209 - scottmcm:assume_less, r=lcnr,dianqkMatthias Krüger-4/+24
Don't emit two `assume`s in transmutes when one is a subset of the other For example, transmuting between `bool` and `Ordering` doesn't need two `assume`s because one range is a superset of the other. Multiple are still used for things like `char` <-> `NonZero<u32>`, which overlap but where neither fully contains the other.
2025-07-24Auto merge of #144398 - fmease:rollup-z6vq7mi, r=fmeasebors-28/+106
Rollup of 15 pull requests Successful merges: - rust-lang/rust#143374 (Unquerify extern_mod_stmt_cnum.) - rust-lang/rust#143838 (std: net: uefi: Add support to query connection data) - rust-lang/rust#144014 (don't link to the nightly version of the Edition Guide in stable lints) - rust-lang/rust#144094 (Ensure we codegen the main fn) - rust-lang/rust#144218 (Use serde for target spec json deserialize) - rust-lang/rust#144221 (generate elf symbol version in raw-dylib) - rust-lang/rust#144240 (Add more test case to check if the false note related to sealed trait suppressed) - rust-lang/rust#144247 (coretests/num: use ldexp instead of hard-coding a power of 2) - rust-lang/rust#144276 (Use less HIR in check_private_in_public.) - rust-lang/rust#144278 (add Rev::into_inner) - rust-lang/rust#144317 (pass build.npm from bootstrap to tidy and use it for npm install) - rust-lang/rust#144320 (rustdoc: avoid allocating a temp String for aliases in search index) - rust-lang/rust#144334 (rustc_resolve: get rid of unused rustdoc::span_of_fragments_with_expansion) - rust-lang/rust#144335 (Don't suggest assoc ty bound on non-angle-bracketed problematic assoc ty binding) - rust-lang/rust#144358 (Stop using the old `validate_attr` logic for stability attributes) r? `@ghost` `@rustbot` modify labels: rollup
2025-07-24Rollup merge of #144221 - usamoi:versym, r=bjorn3León Orell Valerian Liehr-28/+106
generate elf symbol version in raw-dylib For link names like `aaa@bbb`, it generates a symbol named `aaa` and a version named `bbb`. For link names like `aaa\0bbb`, `aaa@`@bbb`` or `aa@bb@cc`, it emits errors. It adds a test that the executable is linked with glibc using raw-dylib. cc rust-lang/rust#135694
2025-07-24Auto merge of #144062 - bjorn3:lto_refactors2, r=davidtwcobors-106/+193
Various refactors to the LTO handling code (part 2) Continuing from https://github.com/rust-lang/rust/pull/143388 this removes a bit of dead code and moves the LTO symbol export calculation from individual backends to cg_ssa.
2025-07-24generate elf symbol version in raw-dylibusamoi-28/+106
2025-07-23Remove useless lifetime parameter.Camille GILLOT-4/+4
2025-07-23Give an AllocId to ConstValue::Slice.Camille GILLOT-5/+2
2025-07-23Don't emit two `assume`s in transmutes when one is a subset of the otherScott McMurray-4/+24
For example, transmuting between `bool` and `Ordering` doesn't need two `assume`s because one range is a superset of the other. Multiple are still used for things like `char` <-> `NonZero<u32>`, which overlap but where neither fully contains the other.
2025-07-23Remove `rvalue_creates_operand` entirelyScott McMurray-47/+2
Split to a separate commit to it could be reverted later if necessary, should we get new `Rvalue`s where we can't handle it this way.
2025-07-23re-enable direct `bitcast`s for Int/Float vector transmutes (but not ones ↵Scott McMurray-0/+21
involving pointers)
2025-07-23Let `codegen_transmute_operand` just handle everythingScott McMurray-49/+43
When combined with 143720, this means `rvalue_creates_operand` can just return `true` for *every* `Rvalue`. (A future PR could consider removing it, though just letting it optimize out is fine for now.) It's nicer anyway, IMHO, because it avoids needing the layout checks to be consistent in the two places, and thus is an overall reduction in code. Plus it's a more helpful building block when used in other places this way.
2025-07-23No longer need `alloca`s for consuming `Result<!, i32>` and similarScott McMurray-64/+79
In optimized builds GVN gets rid of these already, but in `opt-level=0` we actually make `alloca`s for this, which particularly impacts `?`-style things that use actually-only-one-variant types like this.
2025-07-23atomicrmw on pointers: move integer-pointer cast hacks into backendRalf Jung-16/+77
2025-07-23Don't special-case llvm.* as nounwindAlisa Sireneva-9/+0
Certain LLVM intrinsics, such as `llvm.wasm.throw`, can unwind. Marking them as nounwind causes us to skip cleanup of locals and optimize out `catch_unwind` under inlining or when `llvm.wasm.throw` is used directly by user code. The motivation for forcibly marking llvm.* as nounwind is no longer present: most intrinsics are linked as `extern "C"` or other non-unwinding ABIs, so we won't codegen `invoke` for them anyway.
2025-07-22Rename `tests/codegen` into `tests/codegen-llvm`Guillaume Gomez-1/+1
2025-07-22Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk许杰友 Jieyou Xu (Joe)-0/+2
gpu offload host code generation r? ghost This will generate most of the host side code to use llvm's offload feature. The first PR will only handle automatic mem-transfers to and from the device. So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch. Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what openmp offloa generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening. A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU. A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues. I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing it for the autodiff work. This work will also be compatible with std::autodiff, so one can differentiate GPU kernels. Tracking: - https://github.com/rust-lang/rust/issues/131513
2025-07-21Remove each_linked_rlib_for_lto from CodegenContextbjorn3-12/+33
2025-07-21Move exported_symbols_for_lto out of CodegenContextbjorn3-8/+23
2025-07-21Merge exported_symbols computation into exported_symbols_for_ltobjorn3-80/+62
And move exported_symbols_for_lto call from backends to cg_ssa.
2025-07-21Move LTO symbol export calculation from backends to cg_ssabjorn3-0/+100
2025-07-21Remove worker idbjorn3-56/+19
It isn't used anywhere. Also inline free_worker into the only call site.
2025-07-21Merge modules and cached_modules for fat LTObjorn3-5/+11
The modules vec can already contain serialized modules and there is no need to distinguish between cached and non-cached cgus at LTO time.
2025-07-20Ban projecting into SIMD types [MCP838]Scott McMurray-14/+5
2025-07-20Rollup merge of #144143 - Gelbpunkt:target-features-crt-static, r=RalfJungGuillaume Gomez-2/+2
Fix `-Ctarget-feature`s getting ignored after `crt-static` The current behaviour introduced by commit a50a3b8e318594c41783294e440d864763e412ef would discard any target features specified after `crt-static` (the only member of `RUSTC_SPECIFIC_FEATURES`). This is because it returned instead of continuing processing the next feature. I wasn't entirely sure how the regression test should look like, but this one should do. If anyone has some suggestions, I'm happy to learn, it's my first test :) I've confirmed that the test fails without the fix on `powerpc64le-unknown-linux-musl` and `x86_64-unknown-linux-gnu`. cc ``@RalfJung``
2025-07-19Allow `Rvalue::Repeat` to return true in `rvalue_creates_operand` tooScott McMurray-4/+12
The conversation in 143502 made be realize how easy this is to handle, since the only possibilty is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all. This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true. I'll try out such a world next :)
2025-07-19Auto merge of #143784 - scottmcm:enums-again-new-ex2, r=dianqkbors-29/+96
Simplify discriminant codegen for niche-encoded variants which don't wrap across an integer boundary Inspired by rust-lang/rust#139729, this attempts to be a much-simpler and more-localized change while still making a difference. (Specifically, this does not try to solve the problem with select-sinking, leaving that to be fixed by https://github.com/llvm/llvm-project/issues/134024 -- once it gets released -- instead of in rustc's codegen.) What this *does* improve is checking for the variant in a 3+ variant enum when that variant is the type providing the niche. Something like `if let Foo::WithBool(_) = ...` previously compiled to `ugt(add(x, -2), 2)`, which is non-trivial to think about because it's depending on the unsigned wrapping to shift the 0/1 up above 2. With this PR it compiles to just `ult(x, 2)`, which is probably what you'd have written yourself if you were doing it by hand to look for "is this byte a bool?". That's done by leaving most of the codegen alone, but adding a couple new special cases to the `is_niche` check. The default looks at the relative discriminant, but in the common cases where there's no wraparound involved, we can just check the original value, rather than the offsetted one. The first commit just adds some tests, so the best way to see the effect of this change is to look at the second commit and how it updates the test expectations.
2025-07-18add -Zoffload=Enable flag behind -Zunstable-options, to enable gpu (host) ↵Manuel Drehwald-0/+2
code generation
2025-07-18rustc_codegen_ssa: Don't skip target-features after crt-staticJens Reidel-2/+2
The current behaviour introduced by commit a50a3b8e318594c41783294e440d864763e412ef would discard any target features specified after crt-static (the only member of RUSTC_SPECIFIC_FEATURES). This is because it returned instead of continuing processing the next flag. Signed-off-by: Jens Reidel <adrian@travitia.xyz>
2025-07-18Rollup merge of #143846 - usamoi:gc, r=bjorn3Matthias Krüger-39/+1
pass --gc-sections if -Zexport-executable-symbols is enabled and improve tests Exported symbols are added as GC roots in linking, so `--gc-sections` won't hurt `-Zexport-executable-symbols`. Fixes the run-make test to work on Linux. Enable the ui test on more targets. cc rust-lang/rust#84161
2025-07-18Rollup merge of #143293 - folkertdev:naked-function-kcfi, r=compiler-errorsMatthias Krüger-14/+17
fix `-Zsanitizer=kcfi` on `#[naked]` functions fixes https://github.com/rust-lang/rust/issues/143266 With `-Zsanitizer=kcfi`, indirect calls happen via generated intermediate shim that forwards the call. The generated shim preserves the attributes of the original, including `#[unsafe(naked)]`. The shim is not a naked function though, and violates its invariants (like having a body that consists of a single `naked_asm!` call). My fix here is to match on the `InstanceKind`, and only use `codegen_naked_asm` when the instance is not a `ReifyShim`. That does beg the question whether there are other `InstanceKind`s that could come up. As far as I can tell the answer is no: calling via `dyn` seems to work find, and `#[track_caller]` is disallowed in combination with `#[naked]`. r? codegen ````@rustbot```` label +A-naked cc ````@maurer```` ````@rcvalle````
2025-07-17remove no_gc_sectionsusamoi-35/+0
2025-07-17Rollup merge of #143388 - bjorn3:lto_refactors, r=compiler-errorsLeón Orell Valerian Liehr-161/+106
Various refactors to the LTO handling code In particular reducing the sharing of code paths between fat and thin-LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
2025-07-16use `codegen_instance_attrs` where an instance is (easily) availableFolkert de Vries-5/+12
2025-07-16add `codegen_instance_attrs` queryFolkert de Vries-11/+6
and use it for naked functions
2025-07-16fix `-Zsanitizer=kcfi` on `#[naked]` functionsFolkert de Vries-6/+7
And more broadly only codegen `InstanceKind::Item` using the naked function codegen code. Other instance kinds should follow the normal path.
2025-07-16Rollup merge of #143920 - oli-obk:cg-llvm-safety, r=jieyouxuSamuel Tardieu-4/+5
Make more of codegen_llvm safe Best reviewed commit-by-commit.
2025-07-15Improve comments inside `codegen_get_discr`Scott McMurray-2/+46
2025-07-14Eliminate all direct uses of LLVMMDStringInContext2Oli Scherer-4/+5
2025-07-13Port `#[link_ordinal]` to the new attribute parsing infrastructure.Anne Stijns-61/+4
2025-07-13pass --gc-sections if -Zexport-executable-symbols is enabled and improve testsusamoi-6/+3
2025-07-12Simplify codegen for niche-encoded variant testsScott McMurray-27/+50
2025-07-12Auto merge of #143810 - matthiaskrgr:rollup-iw7a23z, r=matthiaskrgrbors-2/+29
Rollup of 9 pull requests Successful merges: - rust-lang/rust#143403 (Port several trait/coherence-related attributes the new attribute system) - rust-lang/rust#143633 (fix: correct assertion to check for 'noinline' attribute presence before removal) - rust-lang/rust#143647 (Clarify and expand documentation for std::sys_common dependency structure) - rust-lang/rust#143716 (compiler: doc/comment some codegen-for-functions interfaces) - rust-lang/rust#143747 (Add target maintainer information for aarch64-unknown-linux-musl) - rust-lang/rust#143759 (Fix typos in function names in the `target_feature` test) - rust-lang/rust#143767 (Bump `src/tools/x` to Edition 2024 and some cleanups) - rust-lang/rust#143769 (Remove support for SwitchInt edge effects in backward dataflow) - rust-lang/rust#143770 (build-helper: clippy fixes) r? `@ghost` `@rustbot` modify labels: rollup
2025-07-12Auto merge of #143766 - matthiaskrgr:rollup-0x7t69s, r=matthiaskrgrbors-10/+14
Rollup of 8 pull requests Successful merges: - rust-lang/rust#142391 (rust: library: Add `setsid` method to `CommandExt` trait) - rust-lang/rust#143302 (`tests/ui`: A New Order [27/N]) - rust-lang/rust#143303 (`tests/ui`: A New Order [28/28] FINAL PART) - rust-lang/rust#143568 (std: sys: net: uefi: tcp4: Add timeout support) - rust-lang/rust#143611 (Mention more APIs in `ParseIntError` docs) - rust-lang/rust#143661 (chore: Improve how the other suggestions message gets rendered) - rust-lang/rust#143708 (fix: Include frontmatter in -Zunpretty output ) - rust-lang/rust#143718 (Make UB transmutes really UB in LLVM) r? `@ghost` `@rustbot` modify labels: rollup try-job: i686-gnu-nopt-1 try-job: test-various
2025-07-11Rollup merge of #143716 - ↵Matthias Krüger-2/+29
workingjubilee:document-some-codegen-backend-stuff, r=bjorn3,fee1-dead compiler: doc/comment some codegen-for-functions interfaces An out-of-date comment gets updated and some underdocumented functions get documented.
2025-07-11compiler: comment on some call-related codegen fn in cg_ssaJubilee Young-2/+29
Partially documents the situation due to LLVM CFI.