rust - https://github.com/rust-lang/rust

Age	Commit message	Author	Lines
2025-08-19	Rollup merge of #140740 - ojeda:indirect-branch-cs-prefix, r=davidtwco	许杰友 Jieyou Xu (Joe)	-0/+9
	Add `-Zindirect-branch-cs-prefix` Cc: `@azhogin` `@Darksonn` This goes on top of https://github.com/rust-lang/rust/pull/135927, i.e. please skip the first commit here. Please feel free to inherit it there. In fact, I am not sure if there is any use case for the flag without `-Zretpoline*`. GCC and Clang allow it, though. There is a `FIXME` for two `ignore`s in the test that I took from another test I did in the past -- they may be needed or not here since I didn't run the full CI. Either way, it is not critical. Tracking issue: https://github.com/rust-lang/rust/issues/116852. MCP: https://github.com/rust-lang/compiler-team/issues/868.
2025-08-18	Rollup merge of #145309 - winstonallo:issue-145271-fix, r=tgross35	Stuart Cook	-0/+9
	Fix `-Zregparm` for LLVM builtins
	This fixes the issue where `-Zregparm=N` was not working correctly when calling LLVM intrinsics.
	By default on `x86-32`, arguments are passed on the stack. The `-Zregparm=N` flag allows the first `N` arguments to be passed in registers instead. When calling intrinsics like `memset`, LLVM still passes parameters on the stack, which prevents optimizations like tail calls.
	As proposed by `@tgross35`, I fixed this by setting the `NumRegisterParameters` LLVM module flag to `N` when `-Zregparm=N` is set.
	```rust
	// compiler/rustc_codegen_llvm/src/context.rs#375-382
	if let Some(regparm_count) = sess.opts.unstable_opts.regparm {
	    llvm::add_module_flag_u32(
	        llmod,
	        llvm::ModuleFlagMergeBehavior::Error,
	        "NumRegisterParameters",
	        regparm_count,
	    );
	}
	```
	[Here](https://rust.godbolt.org/z/YMezreo48) is a before/after compiler explorer.
	Here is the final result for the code snippet in the original issue:
	```asm
	entrypoint:
	    push esi
	    mov esi, eax
	    mov eax, ecx
	    mov ecx, esi
	    pop esi
	    jmp memset ; Tail call parameters in registers
	```
	Fixes: https://github.com/rust-lang/rust/issues/145271
2025-08-17	Add -Zindirect-branch-cs-prefix (from draft PR)	Alice Ryhl	-0/+9

2025-08-14	Remove lto inline logic	Marcelo Domínguez	-10/+0

2025-08-14	Complete functionality and general cleanup	Marcelo Domínguez	-5/+0

2025-08-14	Basic implementation of `autodiff` intrinsic	Marcelo Domínguez	-1/+1

2025-08-13	Set NumRegisterParameters LLVM module flag to `N` when `-Zregparm=N` is set	winstonallo	-0/+9
	* Enforce the `-Zregparm=N` flag by setting the NumRegisterParameters LLVM module flag
	* Add assembly tests verifying that the parameters are passed in registers for regparm values 1, 2, and 3, for both LLVM intrinsics and non-builtin functions
	* Add c_void type to minicore
2025-08-12	[AVR] Changed data_layout	Tom Vijlbrief	-0/+6

2025-07-22	Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk	许杰友 Jieyou Xu (Joe)	-1/+17
	gpu offload host code generation r? ghost This will generate most of the host side code to use LLVM's offload feature. The first PR will only handle automatic mem-transfers to and from the device. So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch. Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what OpenMP offload generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening. A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU. A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues. I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing it for the autodiff work. This work will also be compatible with std::autodiff, so one can differentiate GPU kernels. Tracking: - https://github.com/rust-lang/rust/issues/131513
2025-07-18	add various wrappers for gpu code generation	Manuel Drehwald	-1/+17

2025-07-18	Update AMDGPU data layout	Nikita Popov	-0/+5

2025-07-14	Eliminate all direct uses of LLVMMDStringInContext2	Oli Scherer	-2/+2

2025-07-14	Use context methods instead of directly calling FFI	Oli Scherer	-9/+5

2025-07-14	Merge `typeid_metadata` and `create_metadata`	Oli Scherer	-3/+4

2025-07-07	compiler: Parse `p-` specs in datalayout string, allow definition of custom default data address space	Edoardo Marangoni	-1/+1
2025-06-17	Rollup merge of #142588 - ZuseZ4:generic-ctx-imprv, r=oli-obk	Jacob Pratt	-5/+2
	Generic ctx imprv Cleanup work for my gpu pr r? `@oli-obk`
2025-06-16	add and use generic get_const_int function	Manuel Drehwald	-5/+2

2025-06-15	Use `LLVMIntrinsicGetDeclaration` to completely remove the hardcoded intrinsics list	sayantn	-177/+15
2025-06-12	Simplify implementation of Rust intrinsics by using type parameters in the cache	sayantn	-334/+128

2025-05-30	Auto merge of #139385 - joboet:threadlocal_address, r=nikic	bors	-0/+1
	rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS Fixes #136044 r? `@nikic`
2025-05-29	rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS	joboet	-0/+1

2025-05-28	Remove a couple of uses of interior mutability around statics	bjorn3	-5/+12

2025-05-28	Remove codegen_unit from MiscCodegenMethods	bjorn3	-4/+0

2025-05-10	Use the fallback body for `{minimum,maximum}f128` on LLVM as well.	Urgau	-2/+8

2025-05-09	Use intrinsics for `{f16,f32,f64,f128}::{minimum,maximum}` operations	Urgau	-0/+10

2025-04-28	remove noinline attribute and add alwaysinline after AD pass	bit-aloo	-0/+10

2025-04-05	Update the minimum external LLVM to 19	Josh Stone	-21/+2

2025-04-05	Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco	Matthias Krüger	-0/+16
	KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05	KCFI: Add KCFI arity indicator support	Ramon de C Valle	-0/+16
	Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05	Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk	Stuart Cook	-2/+21
	Autodiff batching
	Enzyme supports batching, which is especially known from the ML side when training neural networks. There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights. That's quite inefficient, so what you normally do is pass in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.
	Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size, and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of
	```rs
	for i in 0..100 {
	    df(x[i], y[i], 1234);
	}
	```
	You can now do
	```rs
	for i in (0..100).step_by(4) {
	    df(x[i+0], x[i+1], x[i+2], x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
	}
	```
	which will give the same results, but allows better compiler optimizations. See the testcase for details.
	There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days. I will also add more tests for both modes.
	For anyone preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
	I'll also add some other docs to the dev guide and user docs in another PR.
	r? ghost
	Tracking:
	- https://github.com/rust-lang/rust/issues/124509
	- https://github.com/rust-lang/rust/issues/135283
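	The "N shadow arguments" idea can be sketched by hand, without Enzyme or `std::autodiff`. The function `f` and the names `df` and `df_batch4` below are hypothetical and written manually for illustration; they are not what the compiler generates. The sketch only shows the calling shape: one primal input paired with either one shadow input or a batch of four.

```rust
/// Primal function used for illustration only: f(x) = x * x.
fn f(x: f64) -> f64 {
    x * x
}

/// Hand-written forward-mode derivative with a single shadow (tangent) input.
/// Returns (f(x), f'(x) * dx).
fn df(x: f64, dx: f64) -> (f64, f64) {
    (f(x), 2.0 * x * dx)
}

/// Hand-written batched variant: one primal input, N = 4 shadow inputs.
/// The primal value and the slope are computed once and reused for every shadow.
fn df_batch4(x: f64, dx: [f64; 4]) -> (f64, [f64; 4]) {
    let slope = 2.0 * x;
    (f(x), dx.map(|d| slope * d))
}
```

	Calling `df_batch4(x, [a, b, c, d])` yields the same tangents as four separate `df` calls, while the shared work on `x` is done once; that sharing is the kind of optimization opportunity the description refers to.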
2025-04-04	add autodiff batching backend	Manuel Drehwald	-2/+21

2025-03-24	Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm	bors	-0/+12
	Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics. These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19. I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something. r? `@scottmcm`
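	As a rough reference for what a three-way comparison computes, the sketch below models the result in plain Rust: -1 for less, 0 for equal, 1 for greater, with signed and unsigned variants. The helper names `scmp_i32`, `ucmp_u32`, and `ordering_from_i8` are hypothetical; this is not the compiler's lowering code, only the value semantics it targets.

```rust
use std::cmp::Ordering;

/// Signed three-way compare: -1 if a < b, 0 if a == b, 1 if a > b.
fn scmp_i32(a: i32, b: i32) -> i8 {
    (a > b) as i8 - (a < b) as i8
}

/// Unsigned three-way compare with the same result encoding.
fn ucmp_u32(a: u32, b: u32) -> i8 {
    (a > b) as i8 - (a < b) as i8
}

/// Maps the -1/0/1 encoding back to `Ordering`.
/// Any positive input is treated as `Greater` to keep the function total.
fn ordering_from_i8(v: i8) -> Ordering {
    match v {
        -1 => Ordering::Less,
        0 => Ordering::Equal,
        _ => Ordering::Greater,
    }
}
```

	The same bit pattern compares differently depending on signedness: `scmp_i32(-1, 1)` is -1, while `ucmp_u32(u32::MAX, 1)` is 1, which is why separate signed and unsigned intrinsics exist.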
2025-03-17	Remove implicit #[no_mangle] for #[rustc_std_internal_symbol]	bjorn3	-1/+2

2025-03-07	Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco	Matthias Krüger	-67/+76
	Clean up various LLVM FFI things in codegen_llvm cc `@ZuseZ4`: I touched some autodiff parts. The major change of this PR is [bfd88ce](https://github.com/rust-lang/rust/pull/137549/commits/bfd88cead0dd79717f123ad7e9a26ecad88653cb), which makes `CodegenCx` generic just like `GenericBuilder`. The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks. Best reviewed commit-by-commit.
2025-03-06	Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics	DaniPopes	-0/+12
	Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.
2025-02-24	Mark more LLVM FFI as safe	Oli Scherer	-6/+4

2025-02-24	Use a safe wrapper around an LLVM FFI function	Oli Scherer	-3/+8

2025-02-24	Remove inherent function that has a trait method duplicate of a commonly imported trait	Oli Scherer	-9/+7
2025-02-24	Deduplicate more functions between `SimpleCx` and `CodegenCx`	Oli Scherer	-5/+0

2025-02-24	Generalize BaseTypeCodegenMethods	Oli Scherer	-9/+10

2025-02-24	Avoid some duplication between SimpleCx and CodegenCx	Oli Scherer	-44/+56

2025-02-24	codegen_llvm: avoid `Deref` impls w/ extern type	David Wood	-1/+1
	`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was or contained an extern type - in my experimental implementation of rust-lang/rfcs#3729, this isn't possible as the `Target` associated type's `?Sized` bound cannot be relaxed backwards compatibly (unless we come up with some way of doing this). In later pull requests with the rust-lang/rfcs#3729 implementation, breakage like this could only occur for nightly users relying on the `extern_types` feature. Upstreaming this to avoid needing to keep carrying this patch locally, and I think it'll necessarily need to change eventually.
2025-02-16	Rollup merge of #136545 - durin42:nvptx64-align, r=nikic	Jacob Pratt	-0/+6
	nvptx64: update default alignment to match LLVM 21 This changed in llvm/llvm-project@91cb8f5d3202870602c6bef807bc4c7ae8a32790. The commit itself is mostly about some intrinsic instructions, but as an aside it also mentions something about addrspace for tensor memory, which I believe is what this string is telling us. `@rustbot` label: +llvm-main
2025-02-11	Document some safety constraints and use more safe wrappers	Oli Scherer	-6/+4

2025-02-04	nvptx64: update default alignment to match LLVM 21	Augie Fackler	-0/+6
	This changed in llvm/llvm-project@91cb8f5d3202870602c6bef807bc4c7ae8a32790. The commit itself is mostly about some intrinsic instructions, but as an aside it also mentions something about addrspace for tensor memory, which I believe is what this string is telling us. @rustbot label: +llvm-main
2025-01-30	Use ExistentialTraitRef throughout codegen	Michael Goulet	-5/+3

2025-01-24	Rollup merge of #135581 - EnzymeAD:refactor-codgencx, r=oli-obk	Matthias Krüger	-5/+52
	Separate Builder methods from tcx As part of the autodiff upstreaming we noticed that it would be nice to have various builder methods available without the TypeContext, which prevents the normal CodegenCx from being passed around between threads. We introduce a SimpleCx which just owns the llvm module and llvm context, to encapsulate them. The previous CodegenCx now implements deref and forwards access to the llvm module or context to its SimpleCx sub-struct. This gives us a bit more flexibility, because now we can pass (or construct) the SimpleCx in locations where we don't have enough information to construct a CodegenCx, or are not able to pass it around due to the tcx lifetimes (and it not implementing send/sync). This also introduces an SBuilder, similar to the SimpleCx. The SBuilder uses a SimpleCx, whereas the existing Builder uses the larger CodegenCx. I will push updates to make implementations generic (where possible) to be implemented once and work for either of the two. I'll also clean up the leftover code. `call` is a bit tricky, because it requires a tcx, I probably need to duplicate it after all. Tracking: - https://github.com/rust-lang/rust/issues/124509
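	The deref-forwarding layout described above can be sketched with stand-in types. Everything below is hypothetical and simplified: `SimpleCx` here owns only a module name rather than an LLVM module and context, `TypeInfo` stands in for the type context, and `FullCx` plays the role of the larger context. It is an illustration of the pattern, not the compiler's actual definitions.

```rust
use std::ops::Deref;

/// Stand-in for the small context: owns only self-contained data.
struct SimpleCx {
    module_name: String,
}

impl SimpleCx {
    fn describe(&self) -> String {
        format!("module {}", self.module_name)
    }
}

/// Stand-in for the borrowed type context (the `tcx` in the description).
struct TypeInfo {
    pointer_bits: u32,
}

/// Stand-in for the full context: a SimpleCx plus borrowed type information.
struct FullCx<'tcx> {
    scx: SimpleCx,
    tcx: &'tcx TypeInfo,
}

/// Forward access to the inner SimpleCx, so code written against
/// `&SimpleCx` also accepts `&FullCx` through deref coercion.
impl<'tcx> Deref for FullCx<'tcx> {
    type Target = SimpleCx;
    fn deref(&self) -> &SimpleCx {
        &self.scx
    }
}

/// Written once against the small context; usable with either type.
fn emit_header(cx: &SimpleCx) -> String {
    cx.describe()
}

/// The small context has no borrowed data, so it can move to another thread.
fn emit_on_worker(scx: SimpleCx) -> String {
    std::thread::spawn(move || emit_header(&scx)).join().unwrap()
}
```

	In this sketch, `emit_header(&full_cx)` works through deref coercion, while `emit_on_worker` accepts only the owned `SimpleCx`; a `FullCx` that borrows a non-`'static` `TypeInfo` could not be moved into `std::thread::spawn`.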
2025-01-24	Make CodegenCx and Builder generic	Manuel Drehwald	-5/+52
	Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
2025-01-21	Auto merge of #134299 - RalfJung:remove-start, r=compiler-errors	bors	-1/+0
	remove support for the (unstable) #[start] attribute
	As explained by `@Noratrieb`:
	`#[start]` should be deleted. It's nothing but an accidentally leaked implementation detail that's a not very useful mix between "portable" entrypoint logic and bad abstraction.
	I think the way the stable user-facing entrypoint should work (and works today on stable) is pretty simple:
	- `std`-using cross-platform programs should use `fn main()`. the compiler, together with `std`, will then ensure that code ends up at `main` (by having a platform-specific entrypoint that gets directed through `lang_start` in `std` to `main` - but that's just an implementation detail)
	- `no_std` platform-specific programs should use `#![no_main]` and define their own platform-specific entrypoint symbol with `#[no_mangle]`, like `main`, `_start`, `WinMain` or `my_embedded_platform_wants_to_start_here`. most of them only support a single platform anyways, and need cfg for the different platform's ways of passing arguments or other things anyways
	`#[start]` is in a super weird position of being neither of those two. It tries to pretend that it's cross-platform, but its signature is a total lie. Those arguments are just stubbed out to zero on ~~Windows~~ wasm, for example. It also only handles the platform-specific entrypoints for a few platforms that are supported by `std`, like Windows or Unix-likes. `my_embedded_platform_wants_to_start_here` can't use it, and neither could a libc-less Linux program.
	So we have an attribute that only works in some cases anyways, that has a signature that's a total lie (and a signature that, as I might want to add, has changed recently, and that I definitely would not be comfortable giving any stability guarantees on), and where there's a pretty easy way to get things working without it in the first place.
	Note that this feature has not been RFCed in the first place.
	This comment was posted [in May](https://github.com/rust-lang/rust/issues/29633#issuecomment-2088596042) and so far nobody spoke up in that issue with a usecase that would require keeping the attribute.
	Closes https://github.com/rust-lang/rust/issues/29633
	try-job: x86_64-gnu-nopt
	try-job: x86_64-msvc-1
	try-job: x86_64-msvc-2
	try-job: test-various
2025-01-21	remove support for the #[start] attribute	Ralf Jung	-1/+0