about summary refs log tree commit diff
path: root/compiler/rustc_codegen_llvm
AgeCommit message (Collapse)AuthorLines
2025-04-05Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwcoMatthias Krüger-0/+22
KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05KCFI: Add KCFI arity indicator supportRamon de C Valle-0/+22
Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obkStuart Cook-46/+196
Autodiff batching Enzyme supports batching, which is especially known from the ML side when training neural networks. There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights. That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations. Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size, and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of ```rs for i in 0..100 { df(x[i], y[i], 1234); } ``` You can now do ```rs for i in 0..100.step_by(4) { df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234); } ``` which will give the same results, but allows better compiler optimizations. See the testcase for details. There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days. I will also add more tests for both modes. For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU I'll also add some other docs to the dev guide and user docs in another PR. r? ghost Tracking: - https://github.com/rust-lang/rust/issues/124509 - https://github.com/rust-lang/rust/issues/135283
2025-04-04add new flag to print the module post-AD, before optsManuel Drehwald-2/+10
2025-04-04add autodiff batching backendManuel Drehwald-44/+186
2025-04-04Rollup merge of #138949 - madsmtm:rename-to-darwin, r=WaffleLapkinMatthias Krüger-5/+5
Rename `is_like_osx` to `is_like_darwin` Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.). ``@rustbot`` label O-apple r? compiler
2025-04-02Rollup merge of #138003 - sayantn:new-amx, r=AmanieuStuart Cook-0/+7
Add the new `amx` target features and the `movrs` target feature Adds 5 new `amx` target features included in LLVM20. These are guarded under `x86_amx_intrinsics` (#126622) - `amx-avx512` - `amx-fp8` - `amx-movrs` - `amx-tf32` - `amx-transpose` Adds the `movrs` target feature (from #137976). `@rustbot` label O-x86_64 O-x86_32 T-compiler A-target-feature r? `@Amanieu`
2025-03-30Auto merge of #138742 - taiki-e:riscv-vector, r=Amanieubors-1/+3
rustc_target: Add more RISC-V vector-related features and use zvl*b target features in vector ABI check Currently, we have only unstable `v` target feature, but RISC-V have more vector-related extensions. The first commit of this PR adds them to unstable `riscv_target_feature`. - `unaligned-vector-mem`: Has reasonably performant unaligned vector - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L1379) - Similar to currently unstable `unaligned-scalar-mem` target feature, but for vector instructions. - `zvfh`: Vector Extension for Half-Precision Floating-Point - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvfh-vector-extension-for-half-precision-floating-point) - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L668) - This implies `zvfhmin` and `zfhmin` - `zvfhmin`: Vector Extension for Minimal Half-Precision Floating-Point - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvfhmin-vector-extension-for-minimal-half-precision-floating-point) - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L662) - This implies `zve32f` - `zve32x`, `zve32f`, `zve64x`, `zve64f`, `zve64d`: Vector Extensions for Embedded Processors - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zve-vector-extensions-for-embedded-processors) - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L612-L641) - `zve32x` implies `zvl32b` - `zve32f` implies `zve32x` and `f` - `zve64x` implies `zve32x` and `zvl64b` - `zve64f` implies `zve32f` and `zve64x` - `zve64d` implies `zve64f` and `d` - `v` implies `zve64d` - `zvl*b`: Minimum Vector Length Standard Extensions - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvl-minimum-vector-length-standard-extensions) - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L600-L610) - `zvl{N}b` implies `zvl{N>>1}b` - `v` implies `zvl128b` - Vector Cryptography and Bit-manipulation Extensions - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/vector-crypto.adoc) - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L679-L807) - `zvkb`: Vector Bit-manipulation used in Cryptography - This implies `zve32x` - `zvbb`: Vector basic bit-manipulation instructions - This implies `zvkb` - `zvbc`: Vector Carryless Multiplication - This implies `zve64x` - `zvkg`: Vector GCM instructions for Cryptography - This implies `zve32x` - `zvkned`: Vector AES Encryption & Decryption (Single Round) - This implies `zve32x` - `zvknha`: Vector SHA-2 (SHA-256 only)) - This implies `zve32x` - `zvknhb`: Vector SHA-2 (SHA-256 and SHA-512) - This implies `zve64x` - This is superset of `zvknha`, but doesn't imply that feature at least in LLVM - `zvksed`: SM4 Block Cipher Instructions - This implies `zve32x` - `zvksh`: SM3 Hash Function Instructions - This implies `zve32x` - `zvkt`: Vector Data-Independent Execution Latency - Similar to already stabilized scalar cryptography extension `zkt`. - `zvkn`: Shorthand for 'Zvkned', 'Zvknhb', 'Zvkb', and 'Zvkt' - Similar to already stabilized scalar cryptography extension `zkn`. - `zvknc`: Shorthand for 'Zvkn' and 'Zvbc' - `zvkng`: shorthand for 'Zvkn' and 'Zvkg' - `zvks`: shorthand for 'Zvksed', 'Zvksh', 'Zvkb', and 'Zvkt' - Similar to already stabilized scalar cryptography extension `zks`. - `zvksc`: shorthand for 'Zvks' and 'Zvbc' - `zvksg`: shorthand for 'Zvks' and 'Zvkg' Also, our vector ABI check wants `zvl*b` target features, the second commit of this PR updates vector ABI check to use them. https://github.com/rust-lang/rust/blob/4e2b096ed6c8a1400624a54f6c4fd0c0ce48a579/compiler/rustc_target/src/target_features.rs#L707-L708 --- r? `@Amanieu` `@rustbot` label +O-riscv +A-target-feature
2025-03-28Auto merge of #138503 - bjorn3:string_merging, r=tmiaskobors-1/+6
Avoid wrapping constant allocations in packed structs when not necessary This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes. try-job: armhf-gnu
2025-03-28Add test and commentbjorn3-0/+5
2025-03-28Avoid wrapping constant allocations in packed structs when not necessarybjorn3-1/+1
This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes.
2025-03-26Auto merge of #138893 - klensy:thorin-0.9, r=Mark-Simulacrumbors-1/+1
bump thorin to 0.9 to drop duped deps Bumps `thorin`, removing duped deps. This also changes features for hashbrown: ``` hashbrown v0.15.2 `-- indexmap v2.7.0 |-- object v0.36.7 |-- wasmparser v0.219.1 |-- wasmparser v0.223.0 `-- wit-component v0.223.0 |-- indexmap feature "default" |-- indexmap feature "serde" `-- indexmap feature "std" |-- hashbrown feature "default-hasher" | |-- object v0.36.7 (*) | `-- wasmparser v0.223.0 (*) |-- hashbrown feature "nightly" | |-- rustc_data_structures v0.0.0 | `-- rustc_query_system v0.0.0 `-- hashbrown feature "serde" `-- wasmparser feature "serde" ``` to ``` hashbrown v0.15.2 `-- indexmap v2.7.0 |-- object v0.36.7 |-- wasmparser v0.219.1 |-- wasmparser v0.223.0 `-- wit-component v0.223.0 |-- indexmap feature "default" |-- indexmap feature "serde" `-- indexmap feature "std" |-- hashbrown feature "allocator-api2" | `-- hashbrown feature "default" |-- hashbrown feature "default" (*) |-- hashbrown feature "default-hasher" | |-- object v0.36.7 (*) | `-- wasmparser v0.223.0 (*) | `-- hashbrown feature "default" (*) |-- hashbrown feature "equivalent" | `-- hashbrown feature "default" (*) |-- hashbrown feature "inline-more" | `-- hashbrown feature "default" (*) |-- hashbrown feature "nightly" | |-- rustc_data_structures v0.0.0 | `-- rustc_query_system v0.0.0 |-- hashbrown feature "raw-entry" | `-- hashbrown feature "default" (*) `-- hashbrown feature "serde" `-- wasmparser feature "serde" ``` To be safe, as this can be perf-sensitive: `@bors` rollup=never
2025-03-25Rename `is_like_osx` to `is_like_darwin`Mads Marquart-5/+5
2025-03-25Rollup merge of #137247 - dpaoliello:cleanllvm, r=ZalatharMatthias Krüger-136/+144
cg_llvm: Reduce the visibility of types, modules and using declarations in `rustc_codegen_llvm`. Final part of #135502 Reduces the visibility of types, modules and using declarations in the `rustc_codegen_llvm` to private or `pub(crate)` where possible, and marks unused fields and enum entries with `#[expect(dead_code)]`. r? Zalathar
2025-03-25Reduce visibility of most items in `rustc_codegen_llvm`Daniel Paoliello-136/+144
2025-03-24Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcmbors-0/+42
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics. These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19. I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something. r? `@scottmcm`
2025-03-24bump thorin to drop duped depsklensy-1/+1
2025-03-21Rollup merge of #138627 - EnzymeAD:autodiff-cleanups, r=oli-obkMatthias Krüger-104/+110
Autodiff cleanups Splitting out some cleanups to reduce the size of my batching PR and simplify ``@haenoe`` 's [PR](https://github.com/rust-lang/rust/pull/138314). r? ``@oli-obk`` Tracking: - https://github.com/rust-lang/rust/issues/124509
2025-03-20rustc_target: Add more RISC-V vector-related featuresTaiki Endo-1/+3
2025-03-20coverage: Convert and check span coordinates without a local file IDZalathar-27/+40
For expansion region support, we will want to be able to convert and check spans before creating a corresponding local file ID. If we create local file IDs eagerly, but some expansion turns out to have no successfully-converted spans, LLVM will complain about that expansion's file ID having no regions.
2025-03-20coverage: Add LLVM plumbing for expansion regionsZalathar-7/+41
This is currently unused, but paves the way for future work on expansion regions without having to worry about the FFI parts.
2025-03-19Rollup merge of #138674 - oli-obk:llvm-cleanups, r=compiler-errorsMatthias Krüger-205/+206
Various codegen_llvm cleanups Mostly just adding safe wrappers and deduplicating code
2025-03-18Create a safe wrapper around `LLVMRustDIBuilderCreateMemberType`Oli Scherer-42/+60
2025-03-18Avoid splitting up a layoutOli Scherer-22/+24
2025-03-18coverage: Don't store a body span in `FunctionCoverageInfo`Zalathar-2/+7
2025-03-18coverage: Don't refer to the body span when enlarging empty spansZalathar-26/+9
Given that we now only enlarge empty spans to "{" or "}", there shouldn't be any danger of enlarging beyond a function body.
2025-03-17[NFC] simplify matchingManuel Drehwald-12/+3
2025-03-17[NFC] extract autodiff call lowering in cg_llvm into own functionManuel Drehwald-93/+108
2025-03-17Auto merge of #127173 - bjorn3:mangle_rustc_std_internal_symbol, ↵bors-10/+16
r=wesleywiser,jieyouxu Mangle rustc_std_internal_symbols functions This reduces the risk of issues when using a staticlib or rust dylib compiled with a different rustc version in a rust program. Currently this will either (in the case of staticlib) cause a linker error due to duplicate symbol definitions, or (in the case of rust dylibs) cause rustc_std_internal_symbols functions to be silently overridden. As rust gets more commonly used inside the implementation of libraries consumed with a C interface (like Spidermonkey, Ruby YJIT (curently has to do partial linking of all rust code to hide all symbols not part of the C api), the Rusticl OpenCL implementation in mesa) this is becoming much more of an issue. With this PR the only symbols remaining with an unmangled name are rust_eh_personality (LLVM doesn't allow renaming it) and `__rust_no_alloc_shim_is_unstable`. Helps mitigate https://github.com/rust-lang/rust/issues/104707 try-job: aarch64-gnu-debug try-job: aarch64-apple try-job: x86_64-apple-1 try-job: x86_64-mingw-1 try-job: i686-mingw-1 try-job: x86_64-msvc-1 try-job: i686-msvc-1 try-job: test-various try-job: armhf-gnu
2025-03-17Create a safe wrapper around `LLVMRustDIBuilderCreateBasicType`Oli Scherer-31/+36
2025-03-17Create a safe wrapper function around `LLVMRustDIBuilderCreateFile`Oli Scherer-33/+26
2025-03-17Create a safe wrapper around `LLVMRustDIBuilderCreateSubroutineType`Oli Scherer-12/+13
2025-03-17Deduplicate template parameter creationOli Scherer-45/+26
2025-03-17Immediately create an `Option` instead of reallocating for it laterOli Scherer-6/+6
2025-03-17Create a safe wrapper around LLVMRustDIBuilderCreateTemplateTypeParameterOli Scherer-23/+24
2025-03-17Rollup merge of #138349 - 1c3t3a:external-weak-cfi, r=rcvalleMatthias Krüger-4/+24
Emit function declarations for functions with `#[linkage="extern_weak"]` Currently, when declaring an extern weak function in Rust, we use the following syntax: ```rust unsafe extern "C" { #[linkage = "extern_weak"] static FOO: Option<unsafe extern "C" fn() -> ()>; } ``` This allows runtime-checking the extern weak symbol through the Option. When emitting LLVM-IR, the Rust compiler currently emits this static as an i8, and a pointer that is initialized with the value of the global i8 and represents the nullabilty e.g. ``` `@FOO` = extern_weak global i8 `@_rust_extern_with_linkage_FOO` = internal global ptr `@FOO` ``` This approach does not work well with CFI, where we need to attach CFI metadata to a concrete function declaration, which was pointed out in https://github.com/rust-lang/rust/issues/115199. This change switches to emitting a proper function declaration instead of a global i8. This allows CFI to work for extern_weak functions. Example: ``` `@_rust_extern_with_linkage_FOO` = internal global ptr `@FOO` ... declare !type !61 !type !62 !type !63 !type !64 extern_weak void `@FOO(double)` unnamed_addr #6 ``` We keep initializing the Rust internal symbol with the function declaration, which preserves the correct behavior for runtime checking the Option. r? `@rcvalle` cc `@jakos-sec` try-job: test-various
2025-03-17Remove implicit #[no_mangle] for #[rustc_std_internal_symbol]bjorn3-10/+16
2025-03-17fix(debuginfo): avoid overflow when handling expanding recursive typeAdwin White-0/+50
2025-03-17Emit function declarations for functions with #[linkage="extern_weak"]Bastian Kersting-4/+24
Currently, when declaring an extern weak function in Rust, we use the following syntax: ```rust unsafe extern "C" { #[linkage = "extern_weak"] static FOO: Option<unsafe extern "C" fn() -> ()>; } ``` This allows runtime-checking the extern weak symbol through the Option. When emitting LLVM-IR, the Rust compiler currently emits this static as an i8, and a pointer that is initialized with the value of the global i8 and represents the nullabilty e.g. ``` @FOO = extern_weak global i8 @_rust_extern_with_linkage_FOO = internal global ptr @FOO ``` This approach does not work well with CFI, where we need to attach CFI metadata to a concrete function declaration, which was pointed out in https://github.com/rust-lang/rust/issues/115199. This change switches to emitting a proper function declaration instead of a global i8. This allows CFI to work for extern_weak functions. We keep initializing the Rust internal symbol with the function declaration, which preserves the correct behavior for runtime checking the Option. Co-authored-by: Jakob Koschel <jakobkoschel@google.com>
2025-03-16Auto merge of #137011 - LuuuXXX:promote-ohos-with-host-tools, r=Amanieubors-1/+1
Promote ohos targets to tier2 with host tools. ### What does this PR try to resolve? Try to promote the following [[Tier 2 without Host Tools](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-without-host-tools)](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-without-host-tools) targets to [[Tier 2 with Host Tools](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-with-host-tools)](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-with-host-tools): - `aarch64-unknown-linux-ohos` - `armv7-unknown-linux-ohos` - `x86_64-unknown-linux-ohos` ### More Information? see MCP: https://github.com/rust-lang/compiler-team/issues/811 ### Blockage to be solved? - [x] Submit an MCP - [x] Submit code of promote ohos targets - [x] Resolve related dependencies (`measureme`) The modified code of the measureme has been merged (see https://github.com/rust-lang/measureme/pull/238). [done] The new version will was released (https://github.com/rust-lang/measureme/pull/240). [done]
2025-03-12Rollup merge of #138331 - nnethercote:use-RUSTC_LINT_FLAGS-more, ↵Matthias Krüger-1/+0
r=onur-ozkan,jieyouxu Use `RUSTC_LINT_FLAGS` more An alternative to the failed #138084. Fixes #138106. r? ````@jieyouxu````
2025-03-11Auto merge of #137586 - nnethercote:SetImpliedBits, r=bjorn3bors-59/+60
Speed up target feature computation The LLVM backend calls `LLVMRustHasFeature` twice for every feature. In short-running rustc invocations, this accounts for a surprising amount of work. r? `@bjorn3`
2025-03-11Remove `#![warn(unreachable_pub)]` from all `compiler/` crates.Nicholas Nethercote-1/+0
It's no longer necessary now that `-Wunreachable_pub` is being passed.
2025-03-10Revert "Use workspace lints for crates in `compiler/` #138084"许杰友 Jieyou Xu (Joe)-3/+1
Revert <https://github.com/rust-lang/rust/pull/138084> to buy time to consider options that avoids breaking downstream usages of cargo on distributed `rustc-src` artifacts, where such cargo invocations fail due to inability to inherit `lints` from workspace root manifest's `workspace.lints` (this is only valid for the source rust-lang/rust workspace, but not really the distributed `rustc-src` artifacts). This breakage was reported in <https://github.com/rust-lang/rust/issues/138304>. This reverts commit 48caf81484b50dca5a5cebb614899a3df81ca898, reversing changes made to c6662879b27f5161e95f39395e3c9513a7b97028.
2025-03-09Rollup merge of #122790 - Zoxc:dllimp-rev, r=ChrisDentonMatthias Krüger-6/+5
Apply dllimport in ThinLTO This partially reverts https://github.com/rust-lang/rust/pull/103353 by properly applying `dllimport` if `-Z dylib-lto` is passed. That PR should probably fully be reverted as it looks quite sketchy. We don't know locally if the entire crate graph would be statically linked. This should hopefully be sufficient to make ThinLTO work for rustc on Windows. r? ``@wesleywiser`` --- Edit: This PR is changed to just generally revert https://github.com/rust-lang/rust/pull/103353.
2025-03-09Rollup merge of #138084 - nnethercote:workspace-lints, r=jieyouxuMatthias Krüger-1/+3
Use workspace lints for crates in `compiler/` This is nicer and hopefully less error prone than specifying lints via bootstrap. r? ``@jieyouxu``
2025-03-08Remove `#![warn(unreachable_pub)]` from all `compiler/` crates.Nicholas Nethercote-1/+0
(Except for `rustc_codegen_cranelift`.) It's no longer necessary now that `unreachable_pub` is in the workspace lints.
2025-03-08Specify rust lints for `compiler/` crates via Cargo.Nicholas Nethercote-0/+3
By naming them in `[workspace.lints.rust]` in the top-level `Cargo.toml`, and then making all `compiler/` crates inherit them with `[lints] workspace = true`. (I omitted `rustc_codegen_{cranelift,gcc}`, because they're a bit different.) The advantages of this over the current approach: - It uses a standard Cargo feature, rather than special handling in bootstrap. So, easier to understand, and less likely to get accidentally broken in the future. - It works for proc macro crates. It's a shame it doesn't work for rustc-specific lints, as the comments explain.
2025-03-07Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwcoMatthias Krüger-348/+316
Clean up various LLVM FFI things in codegen_llvm cc ```@ZuseZ4``` I touched some autodiff parts The major change of this PR is [bfd88ce](https://github.com/rust-lang/rust/pull/137549/commits/bfd88cead0dd79717f123ad7e9a26ecad88653cb) which makes `CodegenCx` generic just like `GenericBuilder` The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks. best reviewed commit-by-commit
2025-03-06Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsicsDaniPopes-0/+42
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.