about summary refs log tree commit diff
path: root/compiler/rustc_codegen_llvm/src/abi.rs
AgeCommit message (Collapse)AuthorLines
2025-09-28Rollup merge of #144197 - KMJ-007:type-tree, r=ZuseZ4Matthias Krüger-0/+1
TypeTree support in autodiff # TypeTrees for Autodiff ## What are TypeTrees? Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently. ## Structure ```rust TypeTree(Vec<Type>) Type { offset: isize, // byte offset (-1 = everywhere) size: usize, // size in bytes kind: Kind, // Float, Integer, Pointer, etc. child: TypeTree // nested structure } ``` ## Example: `fn compute(x: &f32, data: &[f32]) -> f32` **Input 0: `x: &f32`** ```rust TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float, child: TypeTree::new() }]) }]) ``` **Input 1: `data: &[f32]`** ```rust TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float, // -1 = all elements child: TypeTree::new() }]) }]) ``` **Output: `f32`** ```rust TypeTree(vec![Type { offset: -1, size: 4, kind: Float, child: TypeTree::new() }]) ``` ## Why Needed? - Enzyme can't deduce complex type layouts from LLVM IR - Prevents slow memory pattern analysis - Enables correct derivative computation for nested structures - Tells Enzyme which bytes are differentiable vs metadata ## What Enzyme Does With This Information: Without TypeTrees (current state): ```llvm ; Enzyme sees generic LLVM IR: define float ``@distance(ptr*`` %p1, ptr* %p2) { ; Has to guess what these pointers point to ; Slow analysis of all memory operations ; May miss optimization opportunities } ``` With TypeTrees (our implementation): ```llvm define "enzyme_type"="{[]:Float@float}" float ``@distance(`` ptr "enzyme_type"="{[]:Pointer}" %p1, ptr "enzyme_type"="{[]:Pointer}" %p2 ) { ; Enzyme knows exact type layout ; Can generate efficient derivative code directly } ``` # TypeTrees - Offset and -1 Explained ## Type Structure ```rust Type { offset: isize, // WHERE this type starts size: usize, // HOW BIG this type is kind: Kind, // WHAT KIND of data (Float, Int, Pointer) child: TypeTree // WHAT'S INSIDE (for pointers/containers) } ``` ## Offset Values ### Regular Offset (0, 4, 8, etc.) **Specific byte position within a structure** ```rust struct Point { x: f32, // offset 0, size 4 y: f32, // offset 4, size 4 id: i32, // offset 8, size 4 } ``` TypeTree for `&Point` (internal representation): ```rust TypeTree(vec![ Type { offset: 0, size: 4, kind: Float }, // x at byte 0 Type { offset: 4, size: 4, kind: Float }, // y at byte 4 Type { offset: 8, size: 4, kind: Integer } // id at byte 8 ]) ``` Generates LLVM: ```llvm "enzyme_type"="{[]:Float@float}" ``` ### Offset -1 (Special: "Everywhere") **Means "this pattern repeats for ALL elements"** #### Example 1: Array `[f32; 100]` ```rust TypeTree(vec![Type { offset: -1, // ALL positions size: 4, // each f32 is 4 bytes kind: Float, // every element is float }]) ``` Instead of listing 100 separate Types with offsets `0,4,8,12...396` #### Example 2: Slice `&[i32]` ```rust // Pointer to slice data TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, // ALL slice elements size: 4, // each i32 is 4 bytes kind: Integer }]) }]) ``` #### Example 3: Mixed Structure ```rust struct Container { header: i64, // offset 0 data: [f32; 1000], // offset 8, but elements use -1 } ``` ```rust TypeTree(vec![ Type { offset: 0, size: 8, kind: Integer }, // header Type { offset: 8, size: 4000, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float // ALL array elements }]) } ]) ```
2025-09-25Use standard attribute logic for allocator shimNikita Popov-1/+7
Use llfn_attrs_from_instance() to generate the attributes for the allocator shim. This ensures that we generate all the usual attributes (and don't get to find out one-by-one that a certain attribute is important for a certain target). Additionally this will enable emitting the allocator-specific attributes (not included here). This change is quite awkward because the allocator shim uses SimpleCx, while llfn_attrs_from_instance uses CodegenCx. I've switched it to use SimpleCx plus tcx/sess arguments where necessary. If there's a simpler way to do this, I'd love to know about it...
2025-09-19added typetree support for memcpyKaran Janthe-0/+1
2025-09-12Remove unreachable unsized arg handling in `store_fn_arg/store_arg` in codegenZachary S-8/+3
2025-08-26Use captures(address) instead of captures(none) for indirect argsNikita Popov-3/+5
While provenance cannot be captured through these arguments, the address / object identity can.
2025-08-20Tell LLVM about read-only capturesNikita Popov-1/+6
`&Freeze` parameters are not only `readonly` within the function, but any captures of the pointer can also only be used for reads. This can now be encoded using the `captures(address, read_provenance)` attribute.
2025-08-11Set dead_on_return attribute for indirect argumentsNikita Popov-1/+11
Set the dead_on_return attribute (added in LLVM 21) for arguments that are passed indirectly, but not byval. This indicates that the value of the argument on return does not matter, enabling additional dead store elimination.
2025-07-05use `is_multiple_of` instead of manual moduloFolkert de Vries-1/+1
2025-06-16Fix RISC-V C function ABI when passing/returning structs containing floatsbeetrees-1/+1
2025-06-12add `extern "custom"` functionsFolkert de Vries-0/+4
2025-06-03Rollup merge of #141569 - workingjubilee:canonicalize-abi, r=bjorn3Matthias Krüger-29/+40
Replace ad-hoc ABI "adjustments" with an `AbiMap` to `CanonAbi` Our `conv_from_spec_abi`, `adjust_abi`, and `is_abi_supported` combine to give us a very confusing way of reasoning about what _actual_ calling convention we want to lower our code to and whether we want to compile the resulting code at all. Instead of leaving this code as a miniature adventure game in which someone tries to combine stateful mutations into a Rube Goldberg machine that will let them escape the maze and arrive at the promised land of codegen, we let `AbiMap` devour this complexity. Once you have an `AbiMap`, you can answer which `ExternAbi`s will lower to what `CanonAbi`s (and whether they will lower at all). Removed: - `conv_from_spec_abi` replaced by `AbiMap::canonize_abi` - `adjust_abi` replaced by same - `Conv::PreserveAll` as unused - `Conv::Cold` as unused - `enum Conv` replaced by `enum CanonAbi` target-spec.json changes: - If you have a target-spec.json then now your "entry-abi" key will be specified in terms of one of the `"{abi}"` strings Rust recognizes, e.g. ```json "entry-abi": "C", "entry-abi": "win64", "entry-abi": "aapcs", ```
2025-06-03cg_llvm: convert to CanonAbiJubilee Young-29/+40
2025-05-28Remove unused arg_memory_ty methodbjorn3-10/+0
2025-04-05Update the minimum external LLVM to 19Josh Stone-16/+1
2025-02-24Remove an unused lifetime paramOli Scherer-1/+1
2025-02-18compiler: Stop reexporting stuff in cg_llvm::abiJubilee Young-11/+8
The reexports confuse tooling like rustdoc into thinking cg_llvm is the source of key types that originate in rustc_target.
2025-02-12Rollup merge of #136807 - ↵Jacob Pratt-1/+0
workingjubilee:merge-gpus-to-get-the-arcradeongeforce, r=bjorn3 compiler: internally merge `PtxKernel` into `GpuKernel` r? ``@bjorn3`` for review
2025-02-11Rollup merge of #136721 - dpaoliello:cleanllvm2, r=ZalatharJacob Pratt-1/+1
cg_llvm: Reduce visibility of some items outside the `llvm` module Next piece of #135502 This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
2025-02-10rustc_codegen_llvm: Mark items as pub(crate) outside of the llvm moduleDaniel Paoliello-1/+1
2025-02-09compiler: internally merge `Conv::PtxKernel` into `GpuKernel`Jubilee Young-1/+0
It is speculated that these two can be conceptually merged, and it can start by ripping out rustc's notion of the PtxKernel call convention. Leave the ExternAbi for now, but the nvptx target now should see it as just a different way to spell Conv::GpuKernel.
2025-02-09Auto merge of #136751 - bjorn3:update_rustfmt, r=Mark-Simulacrumbors-15/+23
Update bootstrap compiler and rustfmt The rustfmt version we previously used formats things differently from what the latest nightly rustfmt does. This causes issues for subtrees that get formatted both in-tree and in their own repo. Updating the rustfmt used in-tree solves those issues. Also bumped the bootstrap compiler as the stage0 update command always updates both at the same time.
2025-02-08Rustfmtbjorn3-15/+23
2025-02-07compiler: remove reexports from rustc_target::callconvJubilee Young-4/+3
2025-01-16Add gpu-kernel calling conventionFlakebi-6/+16
The amdgpu-kernel calling convention was reverted in commit f6b21e90d1ec01081bc2619efb68af6788a63d65 due to inactivity in the amdgpu target. Introduce a `gpu-kernel` calling convention that translates to `ptx_kernel` or `amdgpu_kernel`, depending on the target that rust compiles for.
2024-11-03compiler: Directly use rustc_abi in codegenJubilee Young-3/+3
2024-10-29compiler: `rustc_abi::Abi` => `BackendRepr`Jubilee Young-4/+6
The initial naming of "Abi" was an awful mistake, conveying wrong ideas about how psABIs worked and even more about what the enum meant. It was only meant to represent the way the value would be described to a codegen backend as it was lowered to that intermediate representation. It was never meant to mean anything about the actual psABI handling! The conflation is because LLVM typically will associate a certain form with a certain ABI, but even that does not hold when the special cases that actually exist arise, plus the IR annotations that modify the ABI. Reframe `rustc_abi::Abi` as the `BackendRepr` of the type, and rename `BackendRepr::Aggregate` as `BackendRepr::Memory`. Unfortunately, due to the persistent misunderstandings, this too is now incorrect: - Scattered ABI-relevant code is entangled with BackendRepr - We do not always pre-compute a correct BackendRepr that reflects how we "actually" want this value to be handled, so we leave the backend interface to also inject various special-cases here - In some cases `BackendRepr::Memory` is a "real" aggregate, but in others it is in fact using memory, and in some cases it is a scalar! Our rustc-to-backend lowering code handles this sort of thing right now. That will eventually be addressed by lifting duplicated lowering code to either rustc_codegen_ssa or rustc_target as appropriate.
2024-10-28compiler: Add `is_uninhabited` and use LayoutS accessorsJubilee Young-2/+2
This reduces the need of the compiler to peek on the fields of LayoutS.
2024-10-08compiler: Factor rustc_target::abi out of cg_llvmJubilee Young-1/+3
2024-10-04Use wide pointers consistenly across the compilerUrgau-1/+1
2024-09-22Reformat using the new identifier sorting from rustfmtMichael Goulet-26/+18
2024-09-21add `C-cmse-nonsecure-entry` ABIFolkert de Vries-3/+8
2024-09-18Update the minimum external LLVM to 18Josh Stone-3/+1
2024-09-17Rename `{ArgAbi,IntrinsicCall}Methods`.Nicholas Nethercote-1/+1
They both are part of `BuilderMethods`, and so should have `Builder` in their name like all the other traits in `BuilderMethods`.
2024-08-16Add `warn(unreachable_pub)` to `rustc_codegen_llvm`.Nicholas Nethercote-7/+7
2024-08-12Auto merge of #128371 - andjo403:rangeAttribute, r=nikicbors-13/+48
Add range attribute to scalar function results and arguments as LLVM 19 adds the range attribute this starts to use it for better optimization. hade been interesting to see a perf run with the https://github.com/rust-lang/rust/pull/127513 closes https://github.com/rust-lang/rust/issues/50156 cc https://github.com/rust-lang/rust/issues/49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
2024-08-11Add range attribute to scalar function results and argumentsAndreas Jonson-13/+48
2024-08-07codegen: better centralize function attribute computationRalf Jung-3/+23
2024-07-29Auto merge of #125016 - nicholasbishop:bishop-cb-112, r=tgross35bors-0/+2
Update compiler_builtins to 0.1.114 The `weak-intrinsics` feature was removed from compiler_builtins in https://github.com/rust-lang/compiler-builtins/pull/598, so dropped the `compiler-builtins-weak-intrinsics` feature from alloc/std/sysroot. In https://github.com/rust-lang/compiler-builtins/pull/593, some builtins for f16/f128 were added. These don't work for all compiler backends, so add a `compiler-builtins-no-f16-f128` feature and disable it for cranelift and gcc.
2024-07-29Reformat `use` declarations.Nicholas Nethercote-11/+9
The previous commit updated `rustfmt.toml` appropriately. This commit is the outcome of running `x fmt --all` with the new formatting options.
2024-07-02Use the aligned size for alloca at args when the pass mode is cast.DianQK-1/+2
The `load` and `store` instructions in LLVM access the aligned size.
2024-05-30Add f16/f128 handling in a couple placesNicholas Bishop-0/+2
2024-05-09Make builtin_deref just return a TyMichael Goulet-1/+1
2024-04-25Auto merge of #121298 - nikic:writable, r=cuviperbors-0/+13
Set writable and dead_on_unwind attributes for sret arguments Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's. See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that: * The return slot is writable. * The return slot will not be read if the function unwinds. Fixes https://github.com/rust-lang/rust/issues/90595.
2024-04-25Set writable and dead_on_unwind attributes for sret argumentsNikita Popov-0/+13
2024-04-24Auto merge of #122053 - erikdesjardins:alloca, r=nikicbors-1/+1
Stop using LLVM struct types for alloca The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation. Split out from #121577. r? `@ghost`
2024-04-11use [N x i8] for alloca typesErik Desjardins-1/+1
2024-04-11Put `PlaceValue` into `OperandValue::Ref`, rather than 3 tuple fieldsScott McMurray-3/+8
2024-04-11Make `PlaceRef` hold a `PlaceValue` for the non-layout fields (like ↵Scott McMurray-1/+1
`OperandRef` does)
2024-04-08force_array -> is_consecutiveNikita Popov-1/+4
The actual ABI implication here is that in some cases the values are required to be "consecutive", i.e. must either all be passed in registers or all on stack (without padding). Adjust the code to either use Uniform::new() or Uniform::consecutive() depending on which behavior is needed. Then, when lowering this in LLVM, skip the [1 x i128] to i128 simplification if is_consecutive is set. i128 is the only case I'm aware of where this is problematic right now. If we find other cases, we can extend this (either based on target information or possibly just by not simplifying for is_consecutive entirely).
2024-04-08Fix argument ABI for overaligned structs on ppc64leNikita Popov-1/+1
When passing a 16 (or higher) aligned struct by value on ppc64le, it needs to be passed as an array of `i128` rather than an array of `i64`. This will force the use of an even starting register. For the case of a 16 byte struct with alignment 16 it is important that `[1 x i128]` is used instead of `i128` -- apparently, the latter will get treated similarly to `[2 x i64]`, not exhibiting the correct ABI. Add a `force_array` flag to `Uniform` to support this. The relevant clang code can be found here: https://github.com/llvm/llvm-project/blob/fe2119a7b08b6e468b2a67768904ea85b1bf0a45/clang/lib/CodeGen/Targets/PPC.cpp#L878-L884 https://github.com/llvm/llvm-project/blob/fe2119a7b08b6e468b2a67768904ea85b1bf0a45/clang/lib/CodeGen/Targets/PPC.cpp#L780-L784 I think the corresponding psABI wording is this: > Fixed size aggregates and unions passed by value are mapped to as > many doublewords of the parameter save area as the value uses in > memory. Aggregrates and unions are aligned according to their > alignment requirements. This may result in doublewords being > skipped for alignment. In particular the last sentence. Fixes https://github.com/rust-lang/rust/issues/122767.