rust - https://github.com/rust-lang/rust

Age	Commit message	Author	Lines
2025-09-28	Rollup merge of #144197 - KMJ-007:type-tree, r=ZuseZ4	Matthias Krüger	-0/+2
	TypeTree support in autodiff

	# TypeTrees for Autodiff

	## What are TypeTrees?

	Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.

	## Structure

	```rust
	TypeTree(Vec<Type>)

	Type {
	    offset: isize,  // byte offset (-1 = everywhere)
	    size: usize,    // size in bytes
	    kind: Kind,     // Float, Integer, Pointer, etc.
	    child: TypeTree // nested structure
	}
	```

	## Example: `fn compute(x: &f32, data: &[f32]) -> f32`

	Input 0: `x: &f32`

	```rust
	TypeTree(vec![Type {
	    offset: -1, size: 8, kind: Pointer,
	    child: TypeTree(vec![Type {
	        offset: -1, size: 4, kind: Float,
	        child: TypeTree::new()
	    }])
	}])
	```

	Input 1: `data: &[f32]`

	```rust
	TypeTree(vec![Type {
	    offset: -1, size: 8, kind: Pointer,
	    child: TypeTree(vec![Type {
	        offset: -1, size: 4, kind: Float, // -1 = all elements
	        child: TypeTree::new()
	    }])
	}])
	```

	Output: `f32`

	```rust
	TypeTree(vec![Type {
	    offset: -1, size: 4, kind: Float,
	    child: TypeTree::new()
	}])
	```

	## Why Needed?

	- Enzyme can't deduce complex type layouts from LLVM IR
	- Prevents slow memory pattern analysis
	- Enables correct derivative computation for nested structures
	- Tells Enzyme which bytes are differentiable vs metadata

	## What Enzyme Does With This Information:

	Without TypeTrees (current state):

	```llvm
	; Enzyme sees generic LLVM IR:
	define float @distance(ptr %p1, ptr %p2) {
	    ; Has to guess what these pointers point to
	    ; Slow analysis of all memory operations
	    ; May miss optimization opportunities
	}
	```

	With TypeTrees (our implementation):

	```llvm
	define "enzyme_type"="{[]:Float@float}" float @distance(
	    ptr "enzyme_type"="{[]:Pointer}" %p1,
	    ptr "enzyme_type"="{[]:Pointer}" %p2
	) {
	    ; Enzyme knows exact type layout
	    ; Can generate efficient derivative code directly
	}
	```

	# TypeTrees - Offset and -1 Explained

	## Type Structure

	```rust
	Type {
	    offset: isize,  // WHERE this type starts
	    size: usize,    // HOW BIG this type is
	    kind: Kind,     // WHAT KIND of data (Float, Int, Pointer)
	    child: TypeTree // WHAT'S INSIDE (for pointers/containers)
	}
	```

	## Offset Values

	### Regular Offset (0, 4, 8, etc.)

	Specific byte position within a structure

	```rust
	struct Point {
	    x: f32,  // offset 0, size 4
	    y: f32,  // offset 4, size 4
	    id: i32, // offset 8, size 4
	}
	```

	TypeTree for `&Point` (internal representation):

	```rust
	TypeTree(vec![
	    Type { offset: 0, size: 4, kind: Float },  // x at byte 0
	    Type { offset: 4, size: 4, kind: Float },  // y at byte 4
	    Type { offset: 8, size: 4, kind: Integer } // id at byte 8
	])
	```

	Generates LLVM:

	```llvm
	"enzyme_type"="{[]:Float@float}"
	```

	### Offset -1 (Special: "Everywhere")

	Means "this pattern repeats for ALL elements"

	#### Example 1: Array `[f32; 100]`

	```rust
	TypeTree(vec![Type {
	    offset: -1, // ALL positions
	    size: 4,    // each f32 is 4 bytes
	    kind: Float, // every element is float
	}])
	```

	Instead of listing 100 separate Types with offsets `0,4,8,12...396`

	#### Example 2: Slice `&[i32]`

	```rust
	// Pointer to slice data
	TypeTree(vec![Type {
	    offset: -1, size: 8, kind: Pointer,
	    child: TypeTree(vec![Type {
	        offset: -1, // ALL slice elements
	        size: 4,    // each i32 is 4 bytes
	        kind: Integer
	    }])
	}])
	```

	#### Example 3: Mixed Structure

	```rust
	struct Container {
	    header: i64,       // offset 0
	    data: [f32; 1000], // offset 8, but elements use -1
	}
	```

	```rust
	TypeTree(vec![
	    Type { offset: 0, size: 8, kind: Integer }, // header
	    Type {
	        offset: 8, size: 4000, kind: Pointer,
	        child: TypeTree(vec![Type {
	            offset: -1, size: 4, kind: Float // ALL array elements
	        }])
	    }
	])
	```
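The difference between explicit offsets and the `-1` "everywhere" offset described above can be tried out with a small standalone sketch. The `Kind`, `Type`, and `TypeTree` definitions below are simplified, hypothetical mirrors of the structure shown in the PR description, not the actual rustc or Enzyme types, and the helper names (`array_everywhere`, `array_explicit`, `pointer_to`) are invented for illustration. The 8-byte pointer size assumes a 64-bit target.

```rust
// Hypothetical, simplified mirror of the TypeTree structure described in the
// PR text; these are NOT the real rustc or Enzyme definitions.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Float,
    Integer,
    Pointer,
}

#[derive(Debug, Clone, PartialEq)]
struct Type {
    offset: isize, // byte offset; -1 means "everywhere"
    size: usize,   // size in bytes
    kind: Kind,
    child: TypeTree,
}

#[derive(Debug, Clone, PartialEq, Default)]
struct TypeTree(Vec<Type>);

impl TypeTree {
    fn new() -> Self {
        TypeTree(Vec::new())
    }
}

/// Describe a homogeneous array with a single entry at offset -1.
fn array_everywhere(elem_size: usize, kind: Kind) -> TypeTree {
    TypeTree(vec![Type { offset: -1, size: elem_size, kind, child: TypeTree::new() }])
}

/// Describe the same array by listing every element's byte offset.
fn array_explicit(len: usize, elem_size: usize, kind: Kind) -> TypeTree {
    let entries = (0..len)
        .map(|i| Type {
            offset: (i * elem_size) as isize,
            size: elem_size,
            kind: kind.clone(),
            child: TypeTree::new(),
        })
        .collect();
    TypeTree(entries)
}

/// Wrap a pointee description in a pointer entry (8 bytes assumed).
fn pointer_to(pointee: TypeTree) -> TypeTree {
    TypeTree(vec![Type { offset: -1, size: 8, kind: Kind::Pointer, child: pointee }])
}

fn main() {
    let compact = array_everywhere(4, Kind::Float);
    let explicit = array_explicit(100, 4, Kind::Float);
    println!("compact entries:  {}", compact.0.len());
    println!("explicit entries: {}", explicit.0.len());
    println!("last explicit offset: {}", explicit.0.last().unwrap().offset);

    // Shape of the `&[i32]` example: pointer whose child covers all elements.
    let slice_ref = pointer_to(array_everywhere(4, Kind::Integer));
    println!("{:?}", slice_ref);
}
```

For `[f32; 100]` the explicit form needs 100 entries ending at offset 396, while the `-1` form needs one, which is the saving the description points at.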
2025-09-25	Use `LLVMDisposeTargetMachine`	Zalathar	-3/+1

2025-09-21	Add self-profile events for target-machine creation	Zalathar	-0/+10
	These code paths are surprisingly hot in the `large-workspace` benchmark; it would be handy to see some detailed timings and execution counts.
2025-09-19	autodiff: Add basic TypeTree with NoTT flag	Karan Janthe	-0/+2
	Signed-off-by: Karan Janthe <karanjanthe@gmail.com>
2025-09-18	Move target machine command-line quoting from C++ to Rust	Zalathar	-26/+81

2025-09-15	Make llvm_enzyme a regular cargo feature	bjorn3	-3/+5
	This makes it clearer that it is set by the build system rather than by the rustc that compiles the current rustc. It also avoids bootstrap needing to pass --check-cfg llvm_enzyme to rustc.
2025-09-06	Remove want_summary argument from prepare_thin	bjorn3	-8/+5
	It is always false nowadays. ThinLTO summary writing is instead done by llvm_optimize.
2025-09-06	Remove thin_link_data method from ThinBufferMethods	bjorn3	-8/+8
	It is only used within cg_llvm.
2025-09-06	Ensure fat LTO doesn't merge everything into the allocator module	bjorn3	-1/+7

2025-09-04	Special case allocator module submission to avoid special casing it elsewhere	bjorn3	-15/+7
	A lot of places had special handling just in case they would get an allocator module even though most of these places could never get one or would have a trivial implementation for the allocator module. Moving all handling of the allocator module to a single place simplifies things a fair bit.
2025-08-29	Update to ar_archive_writer 0.5.1	Daniel Paoliello	-0/+5

2025-08-28	Move ___asan_globals_registered export	bjorn3	-0/+4
	All other sanitizer symbols are handled in prepare_lto already.
2025-08-28	Only export the sanitizer symbols for LTO and move export code to cg_llvm	bjorn3	-0/+28
	Don't export them from cdylibs. There is no need to do so and it complicates exported_non_generic_symbols. In addition the GCC backend likely uses different symbols and may potentially not even need us to explicitly tell it to export the symbols it needs.
2025-08-26	Rollup merge of #145814 - bjorn3:codegen_worker_fatal_error, r=petrochenkov	Stuart Cook	-51/+50
	Handle unwinding fatal errors in codegen workers. Also directly unwind on fatal errors at the point they are emitted inside the codegen backends. Fixes the coordinator ICE of https://github.com/rust-lang/rust/issues/132240, https://github.com/rust-lang/rust/issues/135075 and https://github.com/rust-lang/rust/issues/145800.
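As a rough illustration of the pattern this commit message describes (a worker unwinds at the point of a fatal error, and the coordinator recognizes that unwind instead of crashing), here is a minimal sketch using only the standard library. `FatalMarker`, `raise_fatal`, `codegen_unit`, `Outcome`, and `run_worker` are hypothetical names made up for this example; rustc's own error types and coordinator logic are more involved and are not reproduced here.

```rust
use std::panic;
use std::thread;

// Hypothetical marker payload used to distinguish "fatal error" unwinds from
// ordinary panics. rustc has its own type for this; this one is illustrative.
struct FatalMarker;

/// Unwind immediately at the point of the error instead of returning a Result.
fn raise_fatal() -> ! {
    // `resume_unwind` starts unwinding without invoking the panic hook.
    panic::resume_unwind(Box::new(FatalMarker))
}

/// Stand-in for per-unit codegen work: an empty name is treated as fatal.
fn codegen_unit(name: &str) -> usize {
    if name.is_empty() {
        raise_fatal();
    }
    name.len()
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Done(usize),
    Fatal,
    OtherPanic,
}

/// Coordinator side: run the work on a thread and classify how it ended.
fn run_worker(name: &'static str) -> Outcome {
    let handle = thread::spawn(move || codegen_unit(name));
    match handle.join() {
        Ok(n) => Outcome::Done(n),
        // The unwind payload tells a fatal error apart from a real bug.
        Err(payload) if payload.is::<FatalMarker>() => Outcome::Fatal,
        Err(_) => Outcome::OtherPanic,
    }
}

fn main() {
    println!("{:?}", run_worker("cgu0"));
    println!("{:?}", run_worker(""));
}
```

The design point is that the intermediate functions need no `Result` plumbing; only the place that joins the worker has to know about the marker.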
2025-08-24	Rename `llvm::Bool` aliases to standard const case	Zalathar	-1/+1
	This avoids the need for `#![allow(non_upper_case_globals)]`.
2025-08-24	Directly raise fatal errors inside the codegen backends	bjorn3	-51/+50
	As opposed to passing it around through Result.
2025-08-19	Rollup merge of #145484 - Zalathar:archive-builder, r=bjorn3	Stuart Cook	-177/+6
	Remove `LlvmArchiveBuilder` and supporting code/bindings. Switching over to the newer Rust-based `ArArchiveBuilder` happened in rust-lang/rust#128936, a year ago. Per the comment in `new_archive_builder`, that seems like enough time to justify removing the older, unused `LlvmArchiveBuilder` implementation and its associated bindings. Fixes rust-lang/rust#128955.
2025-08-19	Rollup merge of #145432 - Zalathar:target-machine, r=wesleywiser	Stuart Cook	-8/+12
	cg_llvm: Small cleanups to `owned_target_machine`. This PR contains a few tiny cleanups to the `owned_target_machine` code. Each individual commit should be fairly straightforward.
2025-08-16	Remove `LlvmArchiveBuilder` and supporting code/bindings	Zalathar	-177/+6

2025-08-15	Simplify the `args_cstr_buff` assertion	Zalathar	-5/+4

2025-08-15	Avoid an unnecessary intermediate `&mut` reference	Zalathar	-1/+1
	The `NonNull::as_mut` method returns a mut reference, rather than the mut pointer that is intended here.
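The distinction this commit relies on can be seen with the standard `std::ptr::NonNull` API: `as_ptr` returns a `*mut T` directly, while `as_mut` is `unsafe` and produces a `&mut T` that then has to be converted back to a pointer. The sketch below is illustrative only; the function names are hypothetical and the code is unrelated to the actual `owned_target_machine` implementation.

```rust
use std::ptr::NonNull;

/// Preferred: `as_ptr` yields the raw pointer without creating a reference.
fn raw_from_nonnull(nn: NonNull<u32>) -> *mut u32 {
    nn.as_ptr()
}

/// Roundabout: `as_mut` creates an intermediate `&mut u32` first.
///
/// # Safety
/// The pointee must be valid and not aliased by other live references.
unsafe fn raw_via_reference(mut nn: NonNull<u32>) -> *mut u32 {
    // SAFETY: guaranteed by this function's caller (see above).
    let r: &mut u32 = unsafe { nn.as_mut() };
    r as *mut u32
}

fn main() {
    let mut value = 7u32;
    let nn = NonNull::from(&mut value);

    let direct = raw_from_nonnull(nn);
    // SAFETY: `value` is live and no other reference to it is in use here.
    let via_ref = unsafe { raw_via_reference(nn) };

    // Both routes give the same address; only the first avoids the `&mut`.
    println!("same address: {}", direct == via_ref);
}
```

Besides being shorter, the `as_ptr` route needs no `unsafe` and does not assert the uniqueness guarantees that come with materializing a `&mut` reference.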
2025-08-15	Avoid an explicit cast from `*const c_uchar` to `*const c_char`	Zalathar	-2/+2
	As noted in the `ffi` module docs, passing pointer/length byte strings from Rust to C++ is easier if we declare them as `*const c_uchar` on the Rust side, but `const char *` (possibly signed) on the C++ side. This is allowed because both pointer types are ABI-compatible, regardless of char signedness.
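To show why declaring the parameter as a `c_uchar` pointer removes the cast, here is a small self-contained sketch. `count_spaces` is a hypothetical stand-in for a C++ function; it is defined in Rust here only so the example runs without a C++ toolchain, and it is not one of the real LLVM bindings.

```rust
use std::ffi::{c_char, c_uchar};

// Hypothetical stand-in for a foreign function that C++ would declare with a
// `const char *` parameter. On the Rust side it is declared with `c_uchar`.
extern "C" fn count_spaces(ptr: *const c_uchar, len: usize) -> usize {
    // SAFETY: callers in this example always pass a pointer/length pair
    // taken from a live `&str`.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    bytes.iter().filter(|&&b| b == b' ').count()
}

/// With a `c_uchar` declaration, `str::as_ptr` (a `*const u8`) fits as-is.
fn count_spaces_in(s: &str) -> usize {
    count_spaces(s.as_ptr(), s.len())
}

/// With a `c_char` declaration, every call site would need a cast like this.
fn as_c_char_ptr(s: &str) -> *const c_char {
    s.as_ptr().cast::<c_char>()
}

fn main() {
    let text = "target machine args";
    println!("spaces: {}", count_spaces_in(text));
    // The cast changes only the pointer's type, not its address.
    println!("same address: {}", as_c_char_ptr(text) as usize == text.as_ptr() as usize);
}
```

Because `c_uchar` is `u8`, byte slices and strings can be passed through unchanged, which is the convenience the commit message refers to.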
2025-08-15	Declare module `rustc_codegen_llvm::back` in the normal way	Zalathar	-0/+5
	Declaring these submodules directly in `lib.rs` was needlessly confusing.
2025-08-15	Rollup merge of #145004 - bjorn3:remove_unused_fields, r=WaffleLapkin	Stuart Cook	-5/+6
	Couple of minor cleanups
2025-08-14	Remove lto inline logic	Marcelo Domínguez	-27/+1

2025-08-08	Remove bitcode_llvm_cmdline	bjorn3	-5/+6
	It used to be necessary on Apple platforms to ship with the App Store, but Xcode 15 has stopped embedding LLVM bitcode and the App Store no longer accepts apps with bitcode embedded.
2025-07-28	Auto merge of #144562 - matthiaskrgr:rollup-mlvn7qo, r=matthiaskrgr	bors	-46/+8
	Rollup of 7 pull requests
	Successful merges:
	- rust-lang/rust#144072 (update `Atomic::from_ptr` and `Atomic::as_ptr` docs)
	- rust-lang/rust#144151 (`tests/ui/issues/`: The Issues Strike Back [1/N])
	- rust-lang/rust#144300 (Clippy fixes for miropt-test-tools)
	- rust-lang/rust#144399 (Add a ratchet for moving all standard library tests to separate packages)
	- rust-lang/rust#144472 (str: Mark unstable `round_char_boundary` feature functions as const)
	- rust-lang/rust#144503 (Various refactors to the codegen coordinator code (part 3))
	- rust-lang/rust#144530 (coverage: Infer `instances_used` from `pgo_func_name_var_map`)
	r? `@ghost`
	`@rustbot` modify labels: rollup
2025-07-28	use let chains in ast, borrowck, codegen, const_eval	Kivooeo	-4/+4

2025-07-26	Remove support for -Zcombine-cgu	bjorn3	-23/+0
	Nobody seems to actually use this, while still adding some extra complexity to the already rather complex codegen coordinator code. It is also not supported by any backend other than the LLVM backend.
2025-07-25	Use the object crate rather than LLVM for extracting bitcode sections	bjorn3	-23/+8

2025-07-24	Auto merge of #144062 - bjorn3:lto_refactors2, r=davidtwco	bors	-91/+23
	Various refactors to the LTO handling code (part 2). Continuing from https://github.com/rust-lang/rust/pull/143388, this removes a bit of dead code and moves the LTO symbol export calculation from individual backends to cg_ssa.
2025-07-22	Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk	许杰友 Jieyou Xu (Joe)	-0/+7
	gpu offload host code generation
	r? ghost
	This will generate most of the host-side code to use LLVM's offload feature. The first PR will only handle automatic mem-transfers to and from the device. So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch. Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what OpenMP offload generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening.
	A follow-up PR will generate the actual device-side kernel, which will then do computations on the GPU. A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to override our default handling due to performance issues.
	I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, as I'm already doing for the autodiff work. This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.
	Tracking: - https://github.com/rust-lang/rust/issues/131513
2025-07-21	Remove each_linked_rlib_for_lto from CodegenContext	bjorn3	-4/+7

2025-07-21	Move exported_symbols_for_lto out of CodegenContext	bjorn3	-4/+8

2025-07-21	Merge exported_symbols computation into exported_symbols_for_lto	bjorn3	-6/+5
	And move exported_symbols_for_lto call from backends to cg_ssa.
2025-07-21	Move LTO symbol export calculation from backends to cg_ssa	bjorn3	-77/+14

2025-07-21	Merge modules and cached_modules for fat LTO	bjorn3	-12/+1
	The modules vec can already contain serialized modules and there is no need to distinguish between cached and non-cached cgus at LTO time.
2025-07-18	gpu host code generation	Manuel Drehwald	-0/+1

2025-07-18	add -Zoffload=Enable flag behind -Zunstable-options, to enable gpu (host) code generation	Manuel Drehwald	-0/+6

2025-07-18	Pass wasm exception model to TargetOptions	Nikita Popov	-0/+6
	This is no longer implied by -wasm-enable-eh.
2025-07-17	Rollup merge of #143388 - bjorn3:lto_refactors, r=compiler-errors	León Orell Valerian Liehr	-11/+10
	Various refactors to the LTO handling code. In particular, reducing the sharing of code paths between fat and thin LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm, given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
2025-07-14	Avoid a bunch of unnecessary `unsafe` blocks in cg_llvm	Oli Scherer	-41/+36

2025-07-11	Rollup merge of #143633 - dillona:noinline-assert, r=fee1-dead	Matthias Krüger	-1/+1
	fix: correct assertion to check for 'noinline' attribute presence before removal
2025-07-10	Make some "safe" llvm ops actually sound	Oli Scherer	-1/+1

2025-07-08	fix: correct assertion to check for 'noinline' attribute presence before removal	Dillon Amburgey	-1/+1

2025-07-03	Move dcx creation into WriteBackendMethods::codegen	bjorn3	-1/+3

2025-07-03	Remove LtoModuleCodegen	bjorn3	-10/+7
	Most uses of it either contain a fat or thin lto module. Only WorkItem::LTO could contain both, but splitting that enum variant doesn't complicate things much.
2025-06-25	added PrintTAFn flag for autodiff	Karan Janthe	-1/+5
	Signed-off-by: Karan Janthe <karanjanthe@gmail.com>
2025-05-28	Mark all optimize methods and the codegen method as safe	bjorn3	-3/+3
	There is no safety contract and I don't think any of them can actually cause UB in more ways than passing malicious source code to rustc can. While LtoModuleCodegen::optimize says that the returned ModuleCodegen points into the LTO module, the LTO module has already been dropped by the time this function returns, so if the returned ModuleCodegen indeed points into the LTO module, we would have seen crashes on every LTO compilation, which we don't. As such the comment is outdated.
2025-05-11	Add a safe wrapper for `LLVMAppendModuleInlineAsm`	Zalathar	-2/+2
	This patch also changes the Rust-side declaration to take `*const c_uchar` instead of `*const c_char`, to avoid the need for `AsCCharPtr`.