about summary refs log tree commit diff
path: root/compiler/rustc_codegen_llvm/src/intrinsic.rs
AgeCommit message (Collapse)AuthorLines
2025-09-29Rollup merge of #147134 - workingjubilee:remove-explicit-abialign-deref, ↵Stuart Cook-1/+1
r=Zalathar remove explicit deref of AbiAlign for most methods Much of the compiler calls functions on Align projected from AbiAlign. AbiAlign impls Deref to its inner Align, so we can simplify these away. Also, it will minimize disruption when AbiAlign is removed. For now, preserve usages that might resolve to PartialOrd or PartialEq, as those have odd inference.
2025-09-29Rollup merge of #147116 - workingjubilee:remove-tdl-abialign, r=ZalatharStuart Cook-1/+1
compiler: remove AbiAlign inside TargetDataLayout AbiAlign is a thin wrapper around Align, extant mostly because we used to track a separate quasi-notion of alignment that was never a real notion of alignment and removing all of it at once was too churny. This PR maintains AbiAlign usage in public API and most of the compiler, but direct access of these fields for TargetDataLayout is now in terms of Align only.
2025-09-28remove explicit deref of AbiAlign for most methodsJubilee Young-1/+1
Much of the compiler calls functions on Align projected from AbiAlign. AbiAlign impls Deref to its inner Align, so we can simplify these away. Also, it will minimize disruption when AbiAlign is removed. For now, preserve usages that might resolve to PartialOrd or PartialEq, as those have odd inference.
2025-09-28Rollup merge of #144197 - KMJ-007:type-tree, r=ZuseZ4Matthias Krüger-0/+4
TypeTree support in autodiff # TypeTrees for Autodiff ## What are TypeTrees? Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently. ## Structure ```rust TypeTree(Vec<Type>) Type { offset: isize, // byte offset (-1 = everywhere) size: usize, // size in bytes kind: Kind, // Float, Integer, Pointer, etc. child: TypeTree // nested structure } ``` ## Example: `fn compute(x: &f32, data: &[f32]) -> f32` **Input 0: `x: &f32`** ```rust TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float, child: TypeTree::new() }]) }]) ``` **Input 1: `data: &[f32]`** ```rust TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float, // -1 = all elements child: TypeTree::new() }]) }]) ``` **Output: `f32`** ```rust TypeTree(vec![Type { offset: -1, size: 4, kind: Float, child: TypeTree::new() }]) ``` ## Why Needed? - Enzyme can't deduce complex type layouts from LLVM IR - Prevents slow memory pattern analysis - Enables correct derivative computation for nested structures - Tells Enzyme which bytes are differentiable vs metadata ## What Enzyme Does With This Information: Without TypeTrees (current state): ```llvm ; Enzyme sees generic LLVM IR: define float ``@distance(ptr*`` %p1, ptr* %p2) { ; Has to guess what these pointers point to ; Slow analysis of all memory operations ; May miss optimization opportunities } ``` With TypeTrees (our implementation): ```llvm define "enzyme_type"="{[]:Float@float}" float ``@distance(`` ptr "enzyme_type"="{[]:Pointer}" %p1, ptr "enzyme_type"="{[]:Pointer}" %p2 ) { ; Enzyme knows exact type layout ; Can generate efficient derivative code directly } ``` # TypeTrees - Offset and -1 Explained ## Type Structure ```rust Type { offset: isize, // WHERE this type starts size: usize, // HOW BIG this type is kind: Kind, // WHAT KIND of data (Float, Int, Pointer) child: TypeTree // WHAT'S INSIDE (for pointers/containers) } ``` ## Offset Values ### Regular Offset (0, 4, 8, etc.) **Specific byte position within a structure** ```rust struct Point { x: f32, // offset 0, size 4 y: f32, // offset 4, size 4 id: i32, // offset 8, size 4 } ``` TypeTree for `&Point` (internal representation): ```rust TypeTree(vec![ Type { offset: 0, size: 4, kind: Float }, // x at byte 0 Type { offset: 4, size: 4, kind: Float }, // y at byte 4 Type { offset: 8, size: 4, kind: Integer } // id at byte 8 ]) ``` Generates LLVM: ```llvm "enzyme_type"="{[]:Float@float}" ``` ### Offset -1 (Special: "Everywhere") **Means "this pattern repeats for ALL elements"** #### Example 1: Array `[f32; 100]` ```rust TypeTree(vec![Type { offset: -1, // ALL positions size: 4, // each f32 is 4 bytes kind: Float, // every element is float }]) ``` Instead of listing 100 separate Types with offsets `0,4,8,12...396` #### Example 2: Slice `&[i32]` ```rust // Pointer to slice data TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, // ALL slice elements size: 4, // each i32 is 4 bytes kind: Integer }]) }]) ``` #### Example 3: Mixed Structure ```rust struct Container { header: i64, // offset 0 data: [f32; 1000], // offset 8, but elements use -1 } ``` ```rust TypeTree(vec![ Type { offset: 0, size: 8, kind: Integer }, // header Type { offset: 8, size: 4000, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float // ALL array elements }]) } ]) ```
2025-09-27compiler: remove AbiAlign inside TargetDataLayoutJubilee Young-1/+1
This maintains AbiAlign usage in public API and most of the compiler, but direct access of these fields is now in terms of Align only.
2025-09-21Add panic=immediate-abortBen Kimock-2/+1
2025-09-19Add TypeTree metadata attachment for autodiffKaran Janthe-0/+4
- Add F128 support to TypeTree Kind enum - Implement TypeTree FFI bindings and conversion functions - Add typetree.rs module for metadata attachment to LLVM functions - Integrate TypeTree generation with autodiff intrinsic pipeline - Support scalar types: f32, f64, integers, f16, f128 - Attach enzyme_type attributes as LLVM string metadata for Enzyme Signed-off-by: Karan Janthe <karanjanthe@gmail.com>
2025-09-18Auto merge of #142544 - Sa4dUs:prevent-abi-changes, r=ZuseZ4bors-1/+2
Prevent ABI changes affect EnzymeAD This PR handles ABI changes for autodiff input arguments to improve Enzyme compatibility. Fundamentally this adjusts activities when a function argument is lowered as an `ScalarPair`, so there's no mismatch between diff activities and args. Also removes activities corresponding to ZSTs. fixes: https://github.com/rust-lang/rust/issues/144025 r? `@ZuseZ4`
2025-09-17Check ZST via `PassMode`Marcelo Domínguez-1/+2
2025-09-03Add `funnel_sh{l,r}` functions and intrinsicssayantn-8/+18
- Add a fallback implementation for the intrinsics - Add LLVM backend support for funnel shifts Co-Authored-By: folkertdev <folkert@folkertdev.nl>
2025-08-21remove an `as` cast in prefetch codegenFolkert de Vries-1/+1
2025-08-20make `prefetch` intrinsics safeFolkert de Vries-1/+7
2025-08-14Complete functionality and general cleanupMarcelo Domínguez-68/+144
2025-08-14Basic implementation of `autodiff` intrinsicMarcelo Domínguez-3/+73
2025-07-25Unify LLVM ctlz/cttz intrinsic generationTobias Decking-19/+9
2025-07-07compiler: Parse `p-` specs in datalayout string, allow definition of custom ↵Edoardo Marangoni-9/+11
default data address space
2025-07-03Always use the pure Rust fallback instead of `llvm.{maximum,minimum}`Urgau-12/+14
2025-07-03setup CI and tidy to use typos for spellchecking and fix few typosklensy-2/+2
2025-06-29Rollup merge of #142078 - sayantn:more-intrinsics, r=workingjubileeGuillaume Gomez-1/+16
Add SIMD funnel shift and round-to-even intrinsics This PR adds 3 new SIMD intrinsics - `simd_funnel_shl` - funnel shift left - `simd_funnel_shr` - funnel shift right - `simd_round_ties_even` (vector version of `round_ties_even_fN`) TODO (future PR): implement `simd_fsh{l,r}` in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the `-n&31` trick used by cg_gcc to implement rotate doesn't work with this because then `fshl(a, b, 0)` will be `a | b`) [#t-compiler > More SIMD intrinsics](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/More.20SIMD.20intrinsics/with/522130286) `@rustbot` label T-compiler T-libs A-intrinsics F-core_intrinsics r? `@workingjubilee`
2025-06-15Correctly account for different address spaces in LLVM intrinsic invocationssayantn-17/+22
2025-06-15Use `LLVMIntrinsicGetDeclaration` to completely remove the hardcoded ↵sayantn-4/+4
intrinsics list
2025-06-15Add `simd_funnel_sh{l,r}` and `simd_round_ties_even`sayantn-1/+16
2025-06-12Simplify implementation of Rust intrinsics by using type parameters in the cachesayantn-359/+232
2025-06-03Remove type_test from IntrinsicCallBuilderMethodsbjorn3-7/+0
It is only used within cg_llvm.
2025-05-30Directly use from_immediate for handling boolbjorn3-3/+2
2025-05-30Avoid computing function type for intrinsic instancesbjorn3-6/+2
2025-05-30Use layout field of OperandRef in generic_simd_intrinsicbjorn3-32/+26
2025-05-30Use layout field of OperandRef and PlaceRef in codegen_intrinsic_callbjorn3-23/+17
This avoids having to get the function signature.
2025-05-26Remove usage of FnAbi in codegen_intrinsic_callbjorn3-20/+10
2025-05-26Pass PlaceRef rather than Bx::Value to codegen_intrinsic_callbjorn3-28/+22
2025-05-10Use the fallback body for `{minimum,maximum}f128` on LLVM as well.Urgau-4/+6
2025-05-09Use intrinsics for `{f16,f32,f64,f128}::{minimum,maximum}` operationsUrgau-0/+10
2025-05-05Rename Instance::new to Instance::new_raw and add a note that it is rawMichael Goulet-1/+1
2025-04-20Rollup merge of #137953 - RalfJung:simd-intrinsic-masks, r=WaffleLapkinChris Denton-44/+12
simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first... Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`. I have extremely low confidence in the GCC part of this PR. See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)
2025-04-20simd intrinsics with mask: accept unsigned integer masksRalf Jung-44/+12
2025-04-11Rollup merge of #137447 - folkertdev:simd-extract-insert-dyn, r=scottmcmStuart Cook-30/+39
add `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}` fixes https://github.com/rust-lang/rust/issues/137372 adds `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`, which contrary to their non-dyn counterparts allow a non-const index. Many platforms (but notably not x86_64 or aarch64) have dedicated instructions for this operation, which stdarch can emit with this change. Future work is to also make the `Index` operation on the `Simd` type emit this operation, but the intrinsic can't be used directly. We'll need some MIR shenanigans for that. r? `@ghost`
2025-04-10add `simd_insert_dyn` and `simd_extract_dyn`Folkert de Vries-30/+39
2025-03-17Remove implicit #[no_mangle] for #[rustc_std_internal_symbol]bjorn3-1/+5
2025-03-07Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwcoMatthias Krüger-2/+2
Clean up various LLVM FFI things in codegen_llvm cc ```@ZuseZ4``` I touched some autodiff parts The major change of this PR is [bfd88ce](https://github.com/rust-lang/rust/pull/137549/commits/bfd88cead0dd79717f123ad7e9a26ecad88653cb) which makes `CodegenCx` generic just like `GenericBuilder` The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks. best reviewed commit-by-commit
2025-02-28rename BackendRepr::Vector → SimdVectorRalf Jung-1/+1
2025-02-25Rollup merge of #137595 - folkertdev:remove-simd-pow-powi, r=RalfJungLeón Orell Valerian Liehr-4/+0
remove `simd_fpow` and `simd_fpowi` Discussed in https://github.com/rust-lang/rust/issues/137555 These functions are not exposed from `std::intrinsics::simd`, and not used anywhere outside of the compiler. They also don't lower to particularly good code at least on the major ISAs (I checked x86_64, aarch64, s390x, powerpc), where the vector is just spilled to the stack and scalar functions are used for the actual logic. r? `@RalfJung`
2025-02-25remove `simd_fpow` and `simd_fpowi`Folkert de Vries-4/+0
2025-02-24rename simd_shuffle_generic → simd_shuffle_const_genericRalf Jung-1/+1
2025-02-24Use a safe wrapper around an LLVM FFI functionOli Scherer-2/+2
2025-02-23Rollup merge of #136543 - RalfJung:round-ties-even, r=tgross35Trevor Gross-14/+8
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that. Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35` try-job: test-various
2025-02-20Remove `BackendRepr::Uninhabited`, replaced with an `uninhabited: bool` ↵Zachary S-1/+1
field in `LayoutData`. Also update comments that refered to BackendRepr::Uninhabited.
2025-02-18compiler: Stop reexporting stuff in cg_llvm::abiJubilee Young-6/+6
The reexports confuse tooling like rustdoc into thinking cg_llvm is the source of key types that originate in rustc_target.
2025-02-11Document some safety constraints and use more safe wrappersOli Scherer-1/+1
2025-02-08Rustfmtbjorn3-231/+240
2025-02-04intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even ↵Ralf Jung-14/+8
intrinsic