path: root/compiler/rustc_codegen_llvm/src
Age | Commit message | Author | Lines
2025-07-22Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk许杰友 Jieyou Xu (Joe)-11/+577
gpu offload host code generation r? ghost This will generate most of the host side code to use llvm's offload feature. The first PR will only handle automatic mem-transfers to and from the device. So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch. Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what OpenMP offload generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening. A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU. A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to override our default handling due to performance issues. I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, as I'm already doing for the autodiff work. This work will also be compatible with std::autodiff, so one can differentiate GPU kernels. Tracking: - https://github.com/rust-lang/rust/issues/131513
2025-07-20Rollup merge of #144116 - nikic:llvm-21-fixes, r=dianqkMatthias Krüger-0/+12
Fixes for LLVM 21 This fixes compatibility issues with LLVM 21 without performing the actual upgrade. Split out from https://github.com/rust-lang/rust/pull/143684. This fixes three issues: * Updates the AMDGPU data layout for address space 8. * Makes emit-arity-indicator.rs a no_core test, so it doesn't fail on non-x86 hosts. * Explicitly sets the exception model for wasm, as this is no longer implied by `-wasm-enable-eh`.
2025-07-18gpu host code generationManuel Drehwald-7/+464
2025-07-18add various wrappers for gpu code generationManuel Drehwald-2/+103
2025-07-18add -Zoffload=Enable flag behind -Zunstable-options, to enable gpu (host) code generationManuel Drehwald-0/+6
2025-07-18make more builder functions genericManuel Drehwald-2/+4
2025-07-18Pass wasm exception model to TargetOptionsNikita Popov-0/+7
This is no longer implied by -wasm-enable-eh.
2025-07-18Update AMDGPU data layoutNikita Popov-0/+5
2025-07-18Rollup merge of #143293 - folkertdev:naked-function-kcfi, r=compiler-errorsMatthias Krüger-4/+4
fix `-Zsanitizer=kcfi` on `#[naked]` functions fixes https://github.com/rust-lang/rust/issues/143266 With `-Zsanitizer=kcfi`, indirect calls happen via a generated intermediate shim that forwards the call. The generated shim preserves the attributes of the original, including `#[unsafe(naked)]`. The shim is not a naked function though, and violates its invariants (like having a body that consists of a single `naked_asm!` call). My fix here is to match on the `InstanceKind`, and only use `codegen_naked_asm` when the instance is not a `ReifyShim`. That does raise the question of whether there are other `InstanceKind`s that could come up. As far as I can tell the answer is no: calling via `dyn` seems to work fine, and `#[track_caller]` is disallowed in combination with `#[naked]`. r? codegen `@rustbot` label +A-naked cc `@maurer` `@rcvalle`
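The dispatch described in this message can be sketched roughly as follows. This is a minimal illustration, not the compiler's code: the `InstanceKind` and `CodegenPath` enums and the `choose_codegen_path` function are hypothetical stand-ins that only model the decision "use the naked-asm path unless the instance is a reify shim".

```rust
/// Hypothetical stand-in for the compiler's instance kinds (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InstanceKind {
    /// An ordinary function item.
    Item,
    /// A generated shim that forwards an indirect call to the real function.
    ReifyShim,
}

/// Hypothetical label for the two code generation paths.
#[derive(Debug, PartialEq, Eq)]
enum CodegenPath {
    /// Emit the body as a single block of naked assembly.
    NakedAsm,
    /// Emit a regular function body.
    RegularBody,
}

/// Picks the path for an instance. The shim inherits the naked attribute
/// from the original function, so the attribute alone is not enough to
/// decide: the instance kind is checked first.
fn choose_codegen_path(kind: InstanceKind, has_naked_attr: bool) -> CodegenPath {
    match kind {
        InstanceKind::ReifyShim => CodegenPath::RegularBody,
        _ if has_naked_attr => CodegenPath::NakedAsm,
        _ => CodegenPath::RegularBody,
    }
}
```

Calling `choose_codegen_path(InstanceKind::ReifyShim, true)` yields `CodegenPath::RegularBody`, which is the case the fix is about: the attribute is present, but the shim is not itself a naked function.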
2025-07-17Rollup merge of #143388 - bjorn3:lto_refactors, r=compiler-errorsLeón Orell Valerian Liehr-47/+29
Various refactors to the LTO handling code In particular reducing the sharing of code paths between fat and thin-LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
2025-07-16use `codegen_instance_attrs` where an instance is (easily) availableFolkert de Vries-4/+4
2025-07-16Rollup merge of #143920 - oli-obk:cg-llvm-safety, r=jieyouxuSamuel Tardieu-249/+216
Make more of codegen_llvm safe Best reviewed commit-by-commit.
2025-07-14Eliminate all direct uses of LLVMMDStringInContext2Oli Scherer-24/+21
2025-07-14Use context methods instead of directly calling FFIOli Scherer-25/+9
2025-07-14Merge `typeid_metadata` and `create_metadata`Oli Scherer-18/+17
2025-07-14Shrink some `unsafe` blocks in cg_llvmOli Scherer-139/+137
2025-07-14Avoid a bunch of unnecessary `unsafe` blocks in cg_llvmOli Scherer-65/+54
2025-07-12Port `#[omit_gdb_pretty_printer_section]` to the new attribute parsing infrastructureJonathan Brouwer-3/+2
Signed-off-by: Jonathan Brouwer <jonathantbrouwer@gmail.com>
2025-07-11Rollup merge of #143633 - dillona:noinline-assert, r=fee1-deadMatthias Krüger-1/+1
fix: correct assertion to check for 'noinline' attribute presence before removal
2025-07-11Auto merge of #142911 - mejrs:unsized, r=compiler-errorsbors-16/+0
Remove support for dynamic allocas Followup to rust-lang/rust#141811
2025-07-10Rollup merge of #143722 - oli-obk:sound-llvm, r=dianqkTrevor Gross-7/+9
Make some "safe" llvm ops actually sound Noticed while doing other refactorings. It may cause some extra unnecessary allocations, but the current use sites are rare ones anyway.
2025-07-10Rollup merge of #143632 - dillona:ffi-param-names, r=jieyouxuMatthias Krüger-2/+2
fix: correct parameter names in LLVMRustBuildMinNum and LLVMRustBuildMaxNum FFI declarations
2025-07-10Rollup merge of #143599 - folkertdev:x86-asm-syntax-global-naked-asm, r=AmanieuMatthias Krüger-7/+15
emit `.att_syntax` when global/naked asm use that option fixes https://github.com/rust-lang/rust/issues/143542 LLVM would error when using `-Cllvm-args=-x86-asm-syntax=intel` in combination with global/naked assembly with `att_syntax`. It turns out that in this case LLVM does need `.att_syntax` to be emitted. r? `@Amanieu`
2025-07-10Make some "safe" llvm ops actually soundOli Scherer-7/+9
2025-07-09Add opaque TypeId handles for CTFEOli Scherer-10/+13
2025-07-09emit `.att_syntax` when global/naked asm use that optionFolkert de Vries-7/+15
2025-07-08fix: correct assertion to check for 'noinline' attribute presence before removalDillon Amburgey-1/+1
2025-07-08fix: correct parameter names in LLVMRustBuildMinNum and LLVMRustBuildMaxNum FFI declarationsDillon Amburgey-2/+2
2025-07-07Remove support for dynamic allocasmejrs-16/+0
2025-07-07Auto merge of #143601 - matthiaskrgr:rollup-9iw2sqk, r=matthiaskrgrbors-1/+0
Rollup of 9 pull requests Successful merges: - rust-lang/rust#132469 (Do not suggest borrow that is already there in fully-qualified call) - rust-lang/rust#143340 (awhile -> a while where appropriate) - rust-lang/rust#143438 (Fix the link in `rustdoc.md`) - rust-lang/rust#143539 (Regression tests for repr ICEs) - rust-lang/rust#143566 (Fix `x86_64-unknown-netbsd` platform support page) - rust-lang/rust#143572 (Remove unused allow attrs) - rust-lang/rust#143583 (`loop_match`: fix 'no terminator on block') - rust-lang/rust#143584 (make `Machine::load_mir` infallible) - rust-lang/rust#143591 (Fix missing words in future tracking issue) r? `@ghost` `@rustbot` modify labels: rollup
2025-07-07Rollup merge of #143572 - yotamofek:pr/unused-allow-attrs, r=fee1-deadMatthias Krüger-1/+0
Remove unused allow attrs These `#[allow]`s seem to be unused (at least according to `x check`, didn't run `x test` locally). Let's clean them up! 🧹
2025-07-07Auto merge of #143182 - xdoardo:more-addrspace, r=workingjubileebors-50/+59
Allow custom default address spaces and parse `p-` specifications in the datalayout string Some targets, such as CHERI, use a default address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow specifying custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout). This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related information, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications. The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run. r? workingjubilee
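To make the `p-` parsing concrete, here is a simplified sketch of collecting per-address-space pointer information from a datalayout string into a map. It is not the `rustc_abi` implementation: `PointerSpec`, `parse_pointer_specs`, and `pointer_spec_in` are hypothetical names, a standard `HashMap` stands in for `FxHashMap`, and only the `p[n]:<size>:<abi>` prefix of each specification is read (the optional preferred-alignment and index-size fields are ignored).

```rust
use std::collections::HashMap;

/// Pointer size and ABI alignment for one address space, in bits.
/// Hypothetical type for illustration; not the `rustc_abi` definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PointerSpec {
    size_bits: u64,
    abi_align_bits: u64,
}

/// Collects every `p[n]:<size>:<abi>` specification of a datalayout string,
/// keyed by address space `n`. Specifications that do not fit this shape
/// are skipped rather than reported.
fn parse_pointer_specs(datalayout: &str) -> HashMap<u32, PointerSpec> {
    let mut specs = HashMap::new();
    for spec in datalayout.split('-') {
        let mut fields = spec.split(':');
        let Some(head) = fields.next() else { continue };
        // Only lowercase `p` introduces a pointer specification.
        let Some(space) = head.strip_prefix('p') else { continue };
        // A bare `p` refers to address space 0.
        let space: u32 = if space.is_empty() {
            0
        } else {
            match space.parse() {
                Ok(n) => n,
                Err(_) => continue,
            }
        };
        let (Some(size), Some(abi)) = (fields.next(), fields.next()) else { continue };
        let (Ok(size_bits), Ok(abi_align_bits)) = (size.parse::<u64>(), abi.parse::<u64>())
        else {
            continue;
        };
        specs.insert(space, PointerSpec { size_bits, abi_align_bits });
    }
    specs
}

/// Looks up an address space, falling back to 64-bit pointers with 64-bit
/// alignment, which is the default LLVM documents when no `p` specification
/// is given.
fn pointer_spec_in(specs: &HashMap<u32, PointerSpec>, space: u32) -> PointerSpec {
    specs
        .get(&space)
        .copied()
        .unwrap_or(PointerSpec { size_bits: 64, abi_align_bits: 64 })
}
```

Keying the map by address space is what lets a target pick a non-zero default: the lookup function takes the address space as an argument instead of reading dedicated fields for address space `0`.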
2025-07-07Remove unused allow attrsYotam Ofek-1/+0
2025-07-07compiler: Deduplicate `must_emit_unwind_tables()` commentsMartin Nordholts-15/+0
There is one comment at a call site and one comment in the function definition that are mostly saying the same thing. Fold the call site comment into the function definition comment to reduce duplication. There are actually some inaccuracies in the comments but let's deduplicate before we address the inaccuracies.
2025-07-07rustc_codegen_llvm: Remove reference to non-existing `no_landing_pads()`Martin Nordholts-6/+5
Removing this reference was forgotten in eb4725fc54056. Grepping for no_landing_pads returns no hits after this.
2025-07-07compiler: Parse `p-` specs in datalayout string, allow definition of custom default data address spaceEdoardo Marangoni-50/+59
2025-07-05use `div_ceil` instead of manual logicFolkert de Vries-2/+2
2025-07-05use `is_multiple_of` instead of manual moduloFolkert de Vries-1/+1
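As a rough illustration of what these two cleanups replace, the sketch below compares a common hand-written ceiling division with the standard `div_ceil` method, and spells out the documented behavior of the standard `is_multiple_of` method for a zero divisor. The log does not show the original code, so the "manual" versions here are assumptions about typical patterns, and `div_ceil_manual` and `is_multiple_of_sketch` are hypothetical helper names.

```rust
/// A common hand-written ceiling division (assumed pattern, not taken from
/// the commit). Requires `d > 0`. Returns `None` when `n + d - 1` would
/// overflow, which is the weak spot of this formulation.
fn div_ceil_manual(n: u64, d: u64) -> Option<u64> {
    n.checked_add(d - 1).map(|sum| sum / d)
}

/// The standard library method; it has no intermediate overflow.
fn div_ceil_std(n: u64, d: u64) -> u64 {
    n.div_ceil(d)
}

/// Hand-written equivalent of the documented behavior of the standard
/// `is_multiple_of` method on unsigned integers: same as `n % d == 0`,
/// except that a zero divisor does not panic (only 0 is a multiple of 0).
fn is_multiple_of_sketch(n: u64, d: u64) -> bool {
    if d == 0 { n == 0 } else { n % d == 0 }
}
```

For example, `div_ceil_std(u64::MAX, 2)` succeeds, while the manual form reports overflow for the same inputs; `is_multiple_of_sketch(5, 0)` is `false`, whereas `5 % 0` would panic.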
2025-07-04Rollup merge of #143387 - dpaoliello:shouldpanicfn, r=bjorn3Matthias Krüger-8/+37
Make __rust_alloc_error_handler_should_panic a function Fixes rust-lang/rust#143253 `__rust_alloc_error_handler_should_panic` is a static but was being exported as a function. For most targets this doesn't matter, but Arm64EC Windows uses different decorations for exported variables vs functions, hence it fails to link when `-Z oom=abort` is enabled. We've had issues in the past with statics like this (see rust-lang/rust#141061), but the tl;dr is that Arm64EC needs symbols correctly exported as either a function or data, and data MUST and MUST ONLY be marked `dllimport` when the symbol is being imported from another binary, which is non-trivial to calculate for these compiler-generated statics. So, instead, the easiest thing to do is to make `__rust_alloc_error_handler_should_panic` a function. Since `__rust_alloc_error_handler_should_panic` isn't involved in any linking shenanigans, I've marked it as `AlwaysInline` in the hope that the various backends will see that it is just returning a constant and perform the same optimizations as the previous implementation. r? `@bjorn3`
2025-07-03Always use the pure Rust fallback instead of `llvm.{maximum,minimum}`Urgau-12/+14
2025-07-03Make __rust_alloc_error_handler_should_panic a functionDaniel Paoliello-8/+37
2025-07-03Merge run_fat_lto, optimize_fat and autodiff into run_and_optimize_fat_ltobjorn3-28/+15
2025-07-03Remove unused config param from WriteBackendMethods::autodiffbjorn3-4/+1
2025-07-03Move dcx creation into WriteBackendMethods::codegenbjorn3-3/+4
2025-07-03Remove LtoModuleCodegenbjorn3-13/+10
Most uses of it either contain a fat or thin lto module. Only WorkItem::LTO could contain both, but splitting that enum variant doesn't complicate things much.
2025-07-03setup CI and tidy to use typos for spellchecking and fix few typosklensy-4/+4
2025-07-01Rollup merge of #143125 - tgross35:aarch64-neon-llvm19-f16, r=cuviperMatthias Krüger-0/+8
Disable f16 on Aarch64 without neon for llvm < 20.1.1 This check was added unconditionally in c51b229140 ("Disable f16 on Aarch64 without `neon`") and reverted in 4a8d35709e ("Revert "Disable `f16` on Aarch64 without `neon`"") since it did not fail in Rust's build. However, it is still possible to hit this crash if using LLVM 19 built with assertions, so disable the type conditionally based on version here. Note that for these builds, a similar patch is needed in the build script for `compiler-builtins` since it does not yet use `cfg(target_has_reliable_f16)` (hopefully to be resolved in the near future). Report: https://github.com/rust-lang/rust/pull/139276#issuecomment-3014781652 Original LLVM issue: https://github.com/llvm/llvm-project/issues/129394
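The version-conditional check described here can be sketched as a small predicate. This is only an illustration of the condition in the message: `f16_is_reliable` and its parameters are hypothetical, and the real check lives in the compiler's target-feature configuration rather than in a free function like this.

```rust
/// Hypothetical predicate: should `f16` be treated as reliable for this
/// configuration? It is disabled only when all three conditions hold:
/// the architecture is AArch64, `neon` is unavailable, and LLVM is older
/// than 20.1.1. Versions are (major, minor, patch) tuples, which Rust
/// compares lexicographically.
fn f16_is_reliable(arch: &str, has_neon: bool, llvm_version: (u32, u32, u32)) -> bool {
    let affected = arch == "aarch64" && !has_neon && llvm_version < (20, 1, 1);
    !affected
}
```

With this shape, `f16_is_reliable("aarch64", false, (19, 1, 7))` is `false`, while the same target with LLVM `(20, 1, 1)` or with `neon` enabled is `true`.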
2025-06-30Rollup merge of #143140 - RalfJung:ptr-into-parts, r=oli-obkMatthias Krüger-1/+1
give Pointer::into_parts a more scary name and offer a safer alternative `into_parts` is a bit too innocent of a name for a somewhat subtle operation. r? `@oli-obk`
2025-06-30Disable f16 on Aarch64 without neon for llvm < 20.1.1Trevor Gross-0/+8
This check was added unconditionally in c51b229140 ("Disable f16 on Aarch64 without `neon`") and reverted in 4a8d35709e ("Revert "Disable `f16` on Aarch64 without `neon`"") since it did not fail in Rust's build. However, it is still possible to hit this crash if using LLVM 19 built with assertions, so disable the type conditionally based on version here. Note that for these builds, a similar patch is needed in the build script for `compiler-builtins` since it does not yet use `cfg(target_has_reliable_f16)` (hopefully to be resolved in the near future). Report: https://www.github.com/rust-lang/rust/pull/139276#issuecomment-3014781652 Original LLVM issue: https://www.github.com/llvm/llvm-project/issues/129394
2025-06-29Rollup merge of #142078 - sayantn:more-intrinsics, r=workingjubileeGuillaume Gomez-1/+16
Add SIMD funnel shift and round-to-even intrinsics This PR adds 3 new SIMD intrinsics - `simd_funnel_shl` - funnel shift left - `simd_funnel_shr` - funnel shift right - `simd_round_ties_even` (vector version of `round_ties_even_fN`) TODO (future PR): implement `simd_fsh{l,r}` in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches; the common tricks that rotate uses don't work because we have 2 elements now, e.g., the `-n&31` trick used by cg_gcc to implement rotate doesn't work with this because then `fshl(a, b, 0)` will be `a | b`) [#t-compiler > More SIMD intrinsics](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/More.20SIMD.20intrinsics/with/522130286) `@rustbot` label T-compiler T-libs A-intrinsics F-core_intrinsics r? `@workingjubilee`
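The remark about the `-n&31` trick can be checked on scalars. The sketch below is not the intrinsic implementation: it models a 32-bit funnel shift on plain `u32` values via a 64-bit concatenation, and `fshl_reference`, `fshr_reference`, `rotl_trick`, and `fshl_naive` are hypothetical names introduced for the comparison.

```rust
/// Reference funnel shift left: concatenate `a` (high half) and `b` (low
/// half), shift left by `n mod 32`, and keep the high 32 bits.
fn fshl_reference(a: u32, b: u32, n: u32) -> u32 {
    let wide = ((a as u64) << 32) | (b as u64);
    ((wide << (n % 32)) >> 32) as u32
}

/// Reference funnel shift right: same concatenation, shift right by
/// `n mod 32`, and keep the low 32 bits.
fn fshr_reference(a: u32, b: u32, n: u32) -> u32 {
    let wide = ((a as u64) << 32) | (b as u64);
    (wide >> (n % 32)) as u32
}

/// Branch-free rotate using the `-n & 31` trick. For `n == 0` both halves
/// are `a`, so OR-ing them is harmless.
fn rotl_trick(a: u32, n: u32) -> u32 {
    (a << (n & 31)) | (a >> (n.wrapping_neg() & 31))
}

/// The same trick applied naively to two operands. For `n == 0` the second
/// term is `b >> 0`, so the result is `a | b` instead of `a`.
fn fshl_naive(a: u32, b: u32, n: u32) -> u32 {
    (a << (n & 31)) | (b >> (n.wrapping_neg() & 31))
}
```

For nonzero shift amounts the naive version agrees with the reference; the two only diverge at `n == 0` (and multiples of 32), which is the case a branch-free implementation has to handle some other way.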