|
TypeTree support in autodiff
# TypeTrees for Autodiff
## What are TypeTrees?
TypeTrees are memory layout descriptors for Enzyme. They tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.
## Structure
```rust
struct TypeTree(Vec<Type>);

struct Type {
    offset: isize,  // byte offset (-1 = everywhere)
    size: usize,    // size in bytes
    kind: Kind,     // Float, Integer, Pointer, etc.
    child: TypeTree // nested structure
}
```
## Example: `fn compute(x: &f32, data: &[f32]) -> f32`
**Input 0: `x: &f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,
        child: TypeTree::new()
    }])
}])
```
**Input 1: `data: &[f32]`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float, // -1 = all elements
        child: TypeTree::new()
    }])
}])
```
**Output: `f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 4, kind: Float,
    child: TypeTree::new()
}])
```
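To make these trees concrete, here is a minimal, self-contained sketch that builds all three of them using the simplified `TypeTree`/`Type` shapes from this write-up; the `Kind` enum and the `f32_leaf`/`pointer_to` helpers are illustrative stand-ins, not rustc's actual API.
```rust
// Simplified stand-ins for the types sketched above (not the real rustc types).
#[derive(Clone, Debug, Default)]
struct TypeTree(Vec<Type>);

#[derive(Clone, Debug)]
struct Type {
    offset: isize,
    size: usize,
    kind: Kind,
    child: TypeTree,
}

#[derive(Clone, Debug)]
#[allow(dead_code)]
enum Kind {
    Float,
    Integer,
    Pointer,
}

impl TypeTree {
    fn new() -> Self {
        TypeTree(Vec::new())
    }
}

// A leaf f32 that applies "everywhere" (offset -1).
fn f32_leaf() -> TypeTree {
    TypeTree(vec![Type { offset: -1, size: 4, kind: Kind::Float, child: TypeTree::new() }])
}

// An 8-byte pointer whose pointee layout is described by `child`.
fn pointer_to(child: TypeTree) -> TypeTree {
    TypeTree(vec![Type { offset: -1, size: 8, kind: Kind::Pointer, child }])
}

fn main() {
    let x = pointer_to(f32_leaf());    // x: &f32
    let data = pointer_to(f32_leaf()); // data: &[f32] (all elements are f32)
    let ret = f32_leaf();              // return value: f32
    println!("{x:?}\n{data:?}\n{ret:?}");
}
```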
## Why Needed?
- Enzyme can't deduce complex type layouts from LLVM IR
- Prevents slow memory pattern analysis
- Enables correct derivative computation for nested structures
- Tells Enzyme which bytes are differentiable vs metadata
## What Enzyme Does With This Information:
Without TypeTrees (current state):
```llvm
; Enzyme sees generic LLVM IR:
define float @distance(ptr %p1, ptr %p2) {
; Has to guess what these pointers point to
; Slow analysis of all memory operations
; May miss optimization opportunities
}
```
With TypeTrees (our implementation):
```llvm
define "enzyme_type"="{[]:Float@float}" float ``@distance(``
ptr "enzyme_type"="{[]:Pointer}" %p1,
ptr "enzyme_type"="{[]:Pointer}" %p2
) {
; Enzyme knows exact type layout
; Can generate efficient derivative code directly
}
```
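Continuing the sketch above, and only as a toy (this is not Enzyme's real encoder), the two attribute strings used in this example can be read off the simplified trees; real `enzyme_type` strings carry more detail, such as pointee layouts and per-offset entries.
```rust
// Toy mapping from the simplified TypeTree in the sketch above to the two
// `enzyme_type` strings shown in this example.
fn enzyme_type_string(tree: &TypeTree) -> &'static str {
    match tree.0.first().map(|t| &t.kind) {
        Some(Kind::Float) => "{[]:Float@float}", // scalar float return value
        Some(Kind::Pointer) => "{[]:Pointer}",   // pointer arguments %p1, %p2
        _ => unimplemented!("only the cases used in this example"),
    }
}
```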
# TypeTrees - Offset and -1 Explained
## Type Structure
```rust
struct Type {
    offset: isize,  // WHERE this type starts
    size: usize,    // HOW BIG this type is
    kind: Kind,     // WHAT KIND of data (Float, Int, Pointer)
    child: TypeTree // WHAT'S INSIDE (for pointers/containers)
}
```
## Offset Values
### Regular Offset (0, 4, 8, etc.)
**Specific byte position within a structure**
```rust
struct Point {
    x: f32,  // offset 0, size 4
    y: f32,  // offset 4, size 4
    id: i32, // offset 8, size 4
}
```
TypeTree for `&Point` (internal representation):
```rust
TypeTree(vec![
    Type { offset: 0, size: 4, kind: Float },   // x at byte 0
    Type { offset: 4, size: 4, kind: Float },   // y at byte 4
    Type { offset: 8, size: 4, kind: Integer }  // id at byte 8
])
```
Generates LLVM:
```llvm
"enzyme_type"="{[]:Float@float}"
```
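As a sanity check on where the offsets 0/4/8 come from, `offset_of!` reports the byte positions directly; `repr(C)` is assumed here to pin down the layout, since the default Rust layout is unspecified (even though it typically matches for this struct).
```rust
use std::mem::offset_of;

#[repr(C)] // assumed here so the offsets are guaranteed
struct Point {
    x: f32,
    y: f32,
    id: i32,
}

fn main() {
    assert_eq!(offset_of!(Point, x), 0);  // Float at byte 0
    assert_eq!(offset_of!(Point, y), 4);  // Float at byte 4
    assert_eq!(offset_of!(Point, id), 8); // Integer at byte 8
    assert_eq!(std::mem::size_of::<Point>(), 12);
}
```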
### Offset -1 (Special: "Everywhere")
**Means "this pattern repeats for ALL elements"**
#### Example 1: Array `[f32; 100]`
```rust
TypeTree(vec![Type {
    offset: -1,  // ALL positions
    size: 4,     // each f32 is 4 bytes
    kind: Float, // every element is float
}])
```
Instead of listing 100 separate `Type` entries with offsets `0, 4, 8, ..., 396`, a single entry with `offset: -1` covers them all.
#### Example 2: Slice `&[i32]`
```rust
// Pointer to slice data
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, // ALL slice elements
        size: 4,    // each i32 is 4 bytes
        kind: Integer
    }])
}])
```
#### Example 3: Mixed Structure
```rust
struct Container {
    header: i64,       // offset 0
    data: [f32; 1000], // offset 8, but elements use -1
}
```
```rust
TypeTree(vec![
    Type { offset: 0, size: 8, kind: Integer }, // header
    Type {
        offset: 8, size: 4000, kind: Pointer,
        child: TypeTree(vec![Type {
            offset: -1, size: 4, kind: Float // ALL array elements
        }])
    }
])
```
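The same check for `Container` (again assuming `repr(C)` for a guaranteed layout) shows the header at offset 0 and the 4000-byte array starting at offset 8, which is what the tree above encodes.
```rust
use std::mem::{offset_of, size_of};

#[repr(C)] // assumed here so the offsets are guaranteed
struct Container {
    header: i64,
    data: [f32; 1000],
}

fn main() {
    assert_eq!(offset_of!(Container, header), 0); // Integer, 8 bytes
    assert_eq!(offset_of!(Container, data), 8);   // 1000 elements, 4 bytes each
    assert_eq!(size_of::<[f32; 1000]>(), 4000);
    assert_eq!(size_of::<Container>(), 4008);
}
```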
|
|
Use llfn_attrs_from_instance() to generate the attributes for the
allocator shim. This ensures that we generate all the usual
attributes (and don't get to find out one-by-one that a certain
attribute is important for a certain target). Additionally this
will enable emitting the allocator-specific attributes (not
included here).
This change is quite awkward because the allocator shim uses
SimpleCx, while llfn_attrs_from_instance uses CodegenCx. I've
switched it to use SimpleCx plus tcx/sess arguments where necessary.
If there's a simpler way to do this, I'd love to know about it...
|
|
|
|
|
|
While provenance cannot be captured through these arguments, the
address / object identity can.
|
|
`&Freeze` parameters are not only `readonly` within the function,
but any captures of the pointer can also only be used for reads.
This can now be encoded using the `captures(address, read_provenance)`
attribute.
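As a small, made-up Rust example of the kind of capture this covers: the callee below escapes the address of `x`, but the captured pointer can still only ever be used for reads.
```rust
// `x: &u32` is a shared reference to a Freeze type: readonly inside the
// function, and any captured copy of the pointer may also only be read from.
fn observe(x: &u32, seen: &mut Vec<*const u32>) -> u32 {
    seen.push(x as *const u32); // captures the address / object identity
    *x                          // reads through `x`; no writes are possible
}
```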
|
|
Set the dead_on_return attribute (added in LLVM 21) for arguments
that are passed indirectly, but not byval.
This indicates that the value of the argument on return does not
matter, enabling additional dead store elimination.
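A hypothetical example of an argument this applies to: a large by-value parameter that is lowered to a hidden pointer (indirect, not byval).
```rust
// Large enough to be passed indirectly rather than in registers.
pub struct Big(pub [u64; 32]);

pub fn consume(b: Big) -> u64 {
    // The caller never reads the memory backing `b` after the call returns,
    // so dead_on_return lets LLVM remove stores into it that would only be
    // observable on return.
    b.0.iter().sum()
}
```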
|
|
|
|
|
|
|
|
Replace ad-hoc ABI "adjustments" with an `AbiMap` to `CanonAbi`
Our `conv_from_spec_abi`, `adjust_abi`, and `is_abi_supported` combine to give us a very confusing way of reasoning about what _actual_ calling convention we want to lower our code to and whether we want to compile the resulting code at all. Instead of leaving this code as a miniature adventure game in which someone tries to combine stateful mutations into a Rube Goldberg machine that will let them escape the maze and arrive at the promised land of codegen, we let `AbiMap` devour this complexity. Once you have an `AbiMap`, you can answer which `ExternAbi`s will lower to what `CanonAbi`s (and whether they will lower at all).
Removed:
- `conv_from_spec_abi` replaced by `AbiMap::canonize_abi`
- `adjust_abi` replaced by same
- `Conv::PreserveAll` as unused
- `Conv::Cold` as unused
- `enum Conv` replaced by `enum CanonAbi`
target-spec.json changes:
- If you have a target-spec.json then now your "entry-abi" key will be specified in terms of one of the `"{abi}"` strings Rust recognizes, e.g.
```json
"entry-abi": "C",
"entry-abi": "win64",
"entry-abi": "aapcs",
```
|
|
|
|
|
|
|
|
|
|
The reexports confuse tooling like rustdoc into thinking cg_llvm is
the source of key types that originate in rustc_target.
|
|
workingjubilee:merge-gpus-to-get-the-arcradeongeforce, r=bjorn3
compiler: internally merge `PtxKernel` into `GpuKernel`
r? `@bjorn3` for review
|
|
cg_llvm: Reduce visibility of some items outside the `llvm` module
Next piece of #135502
This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
|
|
|
|
It is speculated that these two can be conceptually merged, and it can
start by ripping out rustc's notion of the PtxKernel call convention.
Leave the ExternAbi for now, but the nvptx target now should see it as
just a different way to spell Conv::GpuKernel.
|
|
Update bootstrap compiler and rustfmt
The rustfmt version we previously used formats things differently from what the latest nightly rustfmt does. This causes issues for subtrees that get formatted both in-tree and in their own repo. Updating the rustfmt used in-tree solves those issues. Also bumped the bootstrap compiler as the stage0 update command always updates both at the same time.
|
|
|
|
|
|
The amdgpu-kernel calling convention was reverted in commit
f6b21e90d1ec01081bc2619efb68af6788a63d65 due to inactivity in the amdgpu
target.
Introduce a `gpu-kernel` calling convention that translates to
`ptx_kernel` or `amdgpu_kernel`, depending on the target that rust
compiles for.
|
|
|
|
The initial naming of "Abi" was an awful mistake, conveying wrong ideas
about how psABIs worked and even more about what the enum meant.
It was only meant to represent the way the value would be described to
a codegen backend as it was lowered to that intermediate representation.
It was never meant to mean anything about the actual psABI handling!
The conflation is because LLVM typically will associate a certain form
with a certain ABI, but even that does not hold when the special cases
that actually exist arise, plus the IR annotations that modify the ABI.
Reframe `rustc_abi::Abi` as the `BackendRepr` of the type, and rename
`BackendRepr::Aggregate` as `BackendRepr::Memory`. Unfortunately, due to
the persistent misunderstandings, this too is now incorrect:
- Scattered ABI-relevant code is entangled with BackendRepr
- We do not always pre-compute a correct BackendRepr that reflects how
we "actually" want this value to be handled, so we leave the backend
interface to also inject various special-cases here
- In some cases `BackendRepr::Memory` is a "real" aggregate, but in
others it is in fact using memory, and in some cases it is a scalar!
Our rustc-to-backend lowering code handles this sort of thing right now.
That will eventually be addressed by lifting duplicated lowering code
to either rustc_codegen_ssa or rustc_target as appropriate.
|
|
This reduces the compiler's need to peek at the fields of LayoutS.
|
|
|
|
|
|
|
|
|
|
|
|
They both are part of `BuilderMethods`, and so should have `Builder` in
their name like all the other traits in `BuilderMethods`.
|
|
|
|
Add range attribute to scalar function results and arguments
As LLVM 19 adds the `range` attribute, this starts using it for better optimization.
It would have been interesting to see a perf run together with https://github.com/rust-lang/rust/pull/127513.
closes https://github.com/rust-lang/rust/issues/50156
cc https://github.com/rust-lang/rust/issues/49572; it should be fixed as well, but this is hard to observe because existing asserts already trigger the optimization.
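A made-up example of code this can help: the niche of `NonZeroU32` can be described as a `range` attribute on the argument, so LLVM can fold the comparison below on its own.
```rust
use std::num::NonZeroU32;

// The scalar carrying `x` is known to exclude 0; with a `range` attribute on
// the argument, LLVM can fold this function to `true`.
pub fn is_nonzero(x: NonZeroU32) -> bool {
    x.get() != 0
}
```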
|
|
|
|
|
|
Update compiler_builtins to 0.1.114
The `weak-intrinsics` feature was removed from compiler_builtins in https://github.com/rust-lang/compiler-builtins/pull/598, so the `compiler-builtins-weak-intrinsics` feature was dropped from alloc/std/sysroot.
In https://github.com/rust-lang/compiler-builtins/pull/593, some builtins for f16/f128 were added. These don't work for all compiler backends, so add a `compiler-builtins-no-f16-f128` feature and use it to disable them for cranelift and gcc.
|
|
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
|
|
The `load` and `store` instructions in LLVM access the aligned size.
|
|
|
|
|
|
Set writable and dead_on_unwind attributes for sret arguments
Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's.
See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that:
* The return slot is writable.
* The return slot will not be read if the function unwinds.
Fixes https://github.com/rust-lang/rust/issues/90595.
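A small, hypothetical example of the memcpy this can remove: a function returning a large value into a caller-provided slot.
```rust
pub struct Big(pub [u64; 32]);

pub fn make() -> Big {
    Big([0; 32]) // returned through an `sret` pointer supplied by the caller
}

pub fn install(slot: &mut Big) {
    // With `sret` marked `writable` and `dead_on_unwind`, LLVM may have
    // `make()` write directly into `*slot` instead of filling a temporary
    // that is then memcpy'd into place.
    *slot = make();
}
```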
|
|
|
|
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
Split out from #121577.
r? `@ghost`
|
|
|
|
|
|
`OperandRef` does)
|
|
The actual ABI implication here is that in some cases the values
are required to be "consecutive", i.e. must either all be passed
in registers or all on stack (without padding).
Adjust the code to either use Uniform::new() or Uniform::consecutive()
depending on which behavior is needed.
Then, when lowering this in LLVM, skip the [1 x i128] to i128
simplification if is_consecutive is set. i128 is the only case
I'm aware of where this is problematic right now. If we find
other cases, we can extend this (either based on target information
or possibly just by not simplifying for is_consecutive entirely).
|
|
When passing a 16 (or higher) aligned struct by value on ppc64le,
it needs to be passed as an array of `i128` rather than an array
of `i64`. This will force the use of an even starting register.
For the case of a 16 byte struct with alignment 16 it is important
that `[1 x i128]` is used instead of `i128` -- apparently, the
latter will get treated similarly to `[2 x i64]`, not exhibiting
the correct ABI. Add a `force_array` flag to `Uniform` to support
this.
The relevant clang code can be found here:
https://github.com/llvm/llvm-project/blob/fe2119a7b08b6e468b2a67768904ea85b1bf0a45/clang/lib/CodeGen/Targets/PPC.cpp#L878-L884
https://github.com/llvm/llvm-project/blob/fe2119a7b08b6e468b2a67768904ea85b1bf0a45/clang/lib/CodeGen/Targets/PPC.cpp#L780-L784
I think the corresponding psABI wording is this:
> Fixed size aggregates and unions passed by value are mapped to as
> many doublewords of the parameter save area as the value uses in
> memory. Aggregates and unions are aligned according to their
> alignment requirements. This may result in doublewords being
> skipped for alignment.
In particular the last sentence.
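For illustration (hypothetical type, not taken from the patch), this is the shape of argument the fix is about:
```rust
// 16 bytes with 16-byte alignment: when passed by value on powerpc64le, this
// must be lowered as `[1 x i128]` (not `i128`, and not `[2 x i64]`) so that
// it starts at an even / aligned register pair.
#[repr(C, align(16))]
pub struct Align16 {
    pub a: u64,
    pub b: u64,
}

pub extern "C" fn take(v: Align16) -> u64 {
    v.a + v.b
}
```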
Fixes https://github.com/rust-lang/rust/issues/122767.
|