path: root/compiler/rustc_codegen_llvm/src/builder.rs
Age | Commit message | Author | Lines
2025-07-31 | Rollup merge of #144232 - xacrimon:explicit-tail-call, r=WaffleLapkin | Stuart Cook | -1/+24
Implement support for `become` and explicit tail call codegen for the LLVM backend
This PR implements codegen of explicit tail calls via `become` in `rustc_codegen_ssa` and support within the LLVM backend. Completes a task on https://github.com/rust-lang/rust/issues/112788. This PR implements all the necessary bits to make explicit tail calls usable; other backends have received stubs for now and will ICE if you use `become` on them. I suspect there is some bikeshedding to be done on how we should go about implementing this for other backends, but it should be relatively straightforward for GCC after this is merged. During development I also put together a POC bytecode VM based on tail call dispatch to test these changes out and to analyze the codegen to make sure it generates the expected assembly. That is available [here](https://github.com/xacrimon/tcvm).
2025-07-28 | use let chains in ast, borrowck, codegen, const_eval | Kivooeo | -4/+4
2025-07-26 | Implement support for explicit tail calls in the MIR block builders and the LLVM codegen backend. | Joel Wejdenstål | -1/+24
2025-07-18 | add various wrappers for gpu code generation | Manuel Drehwald | -0/+69
2025-07-14 | Eliminate all direct uses of LLVMMDStringInContext2 | Oli Scherer | -3/+2
2025-07-14 | Use context methods instead of directly calling FFI | Oli Scherer | -3/+1
2025-07-14 | Merge `typeid_metadata` and `create_metadata` | Oli Scherer | -1/+1
2025-07-14 | Shrink some `unsafe` blocks in cg_llvm | Oli Scherer | -7/+6
2025-07-07 | Remove support for dynamic allocas | mejrs | -10/+0
2025-06-22 | Remove dead instructions in terminate blocks | Mark Rousskov | -2/+1
2025-06-15 | Correctly account for different address spaces in LLVM intrinsic invocations | sayantn | -2/+2
2025-06-15 | Use `LLVMIntrinsicGetDeclaration` to completely remove the hardcoded intrinsics list | sayantn | -6/+7
2025-06-12 | Simplify implementation of Rust intrinsics by using type parameters in the cache | sayantn | -132/+51
2025-06-03 | Remove type_test from IntrinsicCallBuilderMethods | bjorn3 | -2/+5
It is only used within cg_llvm.
2025-05-30 | Auto merge of #139385 - joboet:threadlocal_address, r=nikic | bors | -3/+9
rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS
Fixes #136044. r? `@nikic`
2025-05-29 | rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS | joboet | -3/+9
2025-05-28 | get rid of rustc_codegen_ssa::common::AtomicOrdering | Ralf Jung | -6/+6
2025-05-11 | Rename `OperandBundleOwned` to `OperandBundleBox` | Zalathar | -5/+5
As with `DIBuilderBox`, the "Box" suffix does a better job of communicating that this is an owning pointer to some borrowable resource. This also renames the `raw` method to `as_ref`, which is what it would have been named originally if the `Deref` problem had been known at the time.
2025-04-24 | Rollup merge of #139261 - RalfJung:msvc-align-mitigation, r=oli-obk | Matthias Krüger | -0/+2
mitigate MSVC alignment issue on x86-32
This implements a mitigation for https://github.com/rust-lang/rust/issues/112480 by no longer emitting `align` attributes on loads and function arguments when building for a win32 MSVC target. MSVC is known to not properly align `u64` and similar types, and claiming to LLVM that everything is properly aligned increases the chance that this will cause problems. Of course, the misalignment is still a bug, but we can't fix that bug, only MSVC can. Also add an errata note to the platform support page warning users about this known problem. try-job: `i686-msvc*`
2025-04-16 | working dupv and dupvonly for fwd mode | Manuel Drehwald | -1/+1
2025-04-07 | mitigate MSVC unsoundness by not emitting alignment attributes on win32-msvc targets | Ralf Jung | -0/+2
also mention the MSVC alignment issue in platform-support.md
2025-04-05 | Update the minimum external LLVM to 19 | Josh Stone | -23/+7
2025-03-24 | Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm | bors | -0/+30
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics. These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19. I couldn't find any follow-up PRs/discussions about this; please let me know if I missed something. r? `@scottmcm`
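As a rough illustration of what a three-way comparison computes, the sketch below models the result of `llvm.scmp` and `llvm.ucmp` in plain Rust: -1, 0, or 1 depending on the ordering of the operands. The helper names `scmp` and `ucmp` are hypothetical and are not functions from the compiler; the sketch only shows the value being computed, not how the backend emits the intrinsic.

```rust
/// Hypothetical model of a signed three-way comparison:
/// -1 if `a < b`, 0 if `a == b`, 1 if `a > b`.
fn scmp(a: i32, b: i32) -> i8 {
    // `Ordering` has the discriminants Less = -1, Equal = 0, Greater = 1.
    a.cmp(&b) as i8
}

/// Hypothetical model of the unsigned counterpart.
fn ucmp(a: u32, b: u32) -> i8 {
    a.cmp(&b) as i8
}
```

The same bit pattern can compare differently under the two interpretations: `scmp(-1, 1)` is -1, while `ucmp(-1i32 as u32, 1)` is 1, which is why there are separate signed and unsigned intrinsics.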
2025-03-07 | Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco | Matthias Krüger | -101/+40
Clean up various LLVM FFI things in codegen_llvm
cc `@ZuseZ4`: I touched some autodiff parts. The major change of this PR is [bfd88ce](https://github.com/rust-lang/rust/pull/137549/commits/bfd88cead0dd79717f123ad7e9a26ecad88653cb), which makes `CodegenCx` generic just like `GenericBuilder`. The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks. Best reviewed commit-by-commit.
2025-03-06 | Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics | DaniPopes | -0/+30
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.
2025-02-24 | Mark more LLVM FFI as safe | Oli Scherer | -2/+2
2025-02-24 | Deduplicate more functions between `SimpleCx` and `CodegenCx` | Oli Scherer | -49/+4
2025-02-24 | Make allocator shim creation mostly use safe code | Oli Scherer | -4/+4
2025-02-24 | Generalize `BackendTypes` over `GenericCx` | Oli Scherer | -11/+11
2025-02-24 | Avoid some duplication between SimpleCx and CodegenCx | Oli Scherer | -33/+21
2025-02-24 | Use safe FFI for various functions in codegen_llvm | Oli Scherer | -6/+2
2025-02-24 | codegen_llvm: avoid `Deref` impls w/ extern type | David Wood | -3/+3
`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was or contained an extern type - in my experimental implementation of rust-lang/rfcs#3729, this isn't possible as the `Target` associated type's `?Sized` bound cannot be relaxed backwards compatibly (unless we come up with some way of doing this). In later pull requests with the rust-lang/rfcs#3729 implementation, breakage like this could only occur for nightly users relying on the `extern_types` feature. Upstreaming this to avoid needing to keep carrying this patch locally, and I think it'll necessarily need to change eventually.
2025-02-24 | Auto merge of #137271 - nikic:gep-nuw-2, r=scottmcm | bors | -3/+26
Emit getelementptr inbounds nuw for pointer::add()
Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative. Fixes https://github.com/rust-lang/rust/issues/137217.
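For context, this is the kind of source code the lowering applies to. A minimal sketch, assuming a hypothetical helper `nth`: `pointer::add` takes an unsigned element count, and its safety contract requires the resulting address to stay within the same allocation, so the offset addition cannot wrap. The IR actually emitted depends on the compiler and LLVM version and is not shown here.

```rust
/// Hypothetical helper: reads element `idx` through a raw pointer offset.
fn nth(slice: &[u32], idx: usize) -> Option<u32> {
    if idx < slice.len() {
        // SAFETY: `idx` is in bounds, so `add(idx)` stays inside the slice's
        // allocation and the offset computation cannot wrap.
        Some(unsafe { *slice.as_ptr().add(idx) })
    } else {
        None
    }
}
```

Calling `nth(&[10, 20, 30], 2)` returns `Some(30)`; an out-of-range index returns `None` without ever forming an out-of-bounds pointer.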
2025-02-19 | Rework `OperandRef::extract_field` to stop calling `to_immediate_scalar` on things which are already immediates | Scott McMurray | -2/+8
That means it stops trying to truncate things that are already `i1`s.
2025-02-19 | PR feedback | Scott McMurray | -2/+2
2025-02-19 | Emit `trunc nuw` for unchecked shifts and `to_immediate_scalar` | Scott McMurray | -2/+26
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
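To make the `nuw` flag on a truncation concrete: in LLVM, `trunc nuw` yields poison if any of the discarded high bits are non-zero. The sketch below models that rule for the `i8` to `i1` case in plain Rust, with `None` standing in for poison. The function name `trunc_nuw_to_bool` is hypothetical, and the sketch models the semantics only, not the backend code.

```rust
/// Hypothetical model of `trunc nuw i8 -> i1`.
/// `None` stands in for LLVM's poison value.
fn trunc_nuw_to_bool(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        // Any other value has non-zero bits above bit 0, which the
        // truncation would discard, so the result is not defined.
        _ => None,
    }
}
```

Because a Rust `bool` is always stored as 0 or 1, the defined cases are exactly the values a valid `bool` can take, which is the range information the flag conveys.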
2025-02-19 | Emit getelementptr inbounds nuw for pointer::add() | Nikita Popov | -1/+22
2025-02-19 | Switch to the LLVMBuildGEPWithNoWrapFlags API | Nikita Popov | -3/+5
This API allows us to set the nuw flag as well.
2025-02-18 | Auto merge of #133852 - x17jiri:cold_path, r=saethlin | bors | -2/+46
improve cold_path()
#120370 added a new intrinsic `cold_path()` and used it to fix `likely` and `unlikely`. However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits the usefulness of `cold_path` for idiomatic Rust. For example, code like this:
```
if let Some(x) = y { ... }
```
may generate a 3-target switch:
```
switch y.discriminator:
0 => true branch
1 => false branch
_ => unreachable
```
and therefore marking a branch as cold will have no effect. This PR improves `cold_path()` to work with arbitrary switch instructions. Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. Clang's weight calculation is more complex than this PR's, which I believe is mainly because `switch` in C/C++ can have multiple cases going to the same target.
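The `if let` shape discussed above can be sketched on stable Rust as follows. Note the substitution: this uses the stable `#[cold]` function attribute as a stand-in for marking the unlikely branch, not the `cold_path()` intrinsic that the PR changes, and the names `handle_missing` and `value_or_default` are hypothetical. Whether the optimizer lays out the branches differently depends on the compiler version and optimization level.

```rust
/// Hypothetical slow path; `#[cold]` hints that calls to it are unlikely.
#[cold]
#[inline(never)]
fn handle_missing() -> u32 {
    0
}

/// Hypothetical caller with the `if let Some(x) = y` shape from the PR text.
fn value_or_default(y: Option<u32>) -> u32 {
    if let Some(x) = y {
        x
    } else {
        // The `None` arm is the one we expect to be rare.
        handle_missing()
    }
}
```

The observable behaviour is unchanged by the hint: `value_or_default(Some(4))` returns 4 and `value_or_default(None)` returns 0; only code layout and branch weighting may differ.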
2025-02-17 | improve cold_path() | Jiri Bobek | -2/+46
2025-02-13 | Set both `nuw` and `nsw` in slice size calculation | Scott McMurray | -0/+31
There's an old note in the code to do this, and now that LLVM-C has an API for it, we might as well.
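Why both flags are sound can be seen from the source-level invariant: the byte size of an existing slice is `len * size_of::<T>()`, and Rust guarantees that the size of a value never exceeds `isize::MAX`, so the multiplication wraps in neither the unsigned nor the signed sense. A minimal sketch, assuming a hypothetical helper `slice_bytes`:

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: byte size of a slice computed by multiplication.
fn slice_bytes<T>(s: &[T]) -> usize {
    // No unsigned wrap (the `nuw` condition): for an existing slice the
    // product always fits in `usize`, so `checked_mul` never returns `None`.
    let bytes = s
        .len()
        .checked_mul(size_of::<T>())
        .expect("slice size fits in usize");
    // No signed wrap (the `nsw` condition): the size also fits in `isize`.
    assert!(bytes <= isize::MAX as usize);
    // The result agrees with what the standard library reports.
    debug_assert_eq!(bytes, size_of_val(s));
    bytes
}
```

For example, a slice of five `u32` values occupies 20 bytes.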
2025-02-12 | Rollup merge of #135025 - Flakebi:alloca-addrspace, r=nikic | Jacob Pratt | -2/+4
Cast allocas to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0. For example, the following code compiles incorrectly when not casting the address space to the default one:
```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```
results in
```llvm
    %local = alloca addrspace(5) i8
    %res = alloca addrspace(5) ptr

if:
    ; Store 64-bit flat pointer
    store ptr %p, ptr addrspace(5) %res

else:
    ; Store 32-bit scratch pointer
    store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
    ; Load and return 64-bit flat pointer
    %res.load = load ptr, ptr addrspace(5) %res
    ret ptr %res.load
```
For amdgpu, `addrspace(0)` pointers are 64-bit and `addrspace(5)` pointers are 32-bit. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type. Tracking issue: #135024
2025-02-10 | Cast allocas to default address space | Flakebi | -2/+4
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0. For example, the following code compiles incorrectly when not casting the address space to the default one:
```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```
results in
```llvm
    %local = alloca addrspace(5) i8
    %res = alloca addrspace(5) ptr

if:
    ; Store 64-bit flat pointer
    store ptr %p, ptr addrspace(5) %res

else:
    ; Store 32-bit scratch pointer
    store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
    ; Load and return 64-bit flat pointer
    %res.load = load ptr, ptr addrspace(5) %res
    ret ptr %res.load
```
For amdgpu, `addrspace(0)` pointers are 64-bit and `addrspace(5)` pointers are 32-bit. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type.
2025-02-02 | Handle the case where the `or disjoint` folds immediately to a constant | Scott McMurray | -1/+7
2025-01-31 | Override `disjoint_or` in the LLVM backend | Scott McMurray | -0/+8
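As background on the `disjoint` flag: it asserts that the two operands of an `or` share no set bits, in which case the bitwise OR equals the arithmetic sum. The sketch below models that precondition in plain Rust, returning `None` where LLVM's result would be poison. The name `checked_disjoint_or` is hypothetical and is not the backend method touched by these commits.

```rust
/// Hypothetical model of `or disjoint`: defined only when the operands
/// have no set bits in common. `None` stands in for LLVM's poison value.
fn checked_disjoint_or(a: u32, b: u32) -> Option<u32> {
    if a & b == 0 {
        // With disjoint bits no carries can occur, so `a | b == a + b`.
        debug_assert_eq!(a | b, a + b);
        Some(a | b)
    } else {
        None
    }
}
```

For instance, `checked_disjoint_or(0b1100, 0b0011)` gives `Some(0b1111)`, the same value as `0b1100 + 0b0011`, while `checked_disjoint_or(0b0110, 0b0011)` gives `None` because bit 1 is set in both operands.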
2025-01-30 | Rollup merge of #135026 - Flakebi:global-addrspace, r=saethlin | Matthias Krüger | -1/+3
Cast global variables to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for global variables is not 0. For example, `core` does not compile in debug mode when not casting the address space to the default one because it tries to emit the following (simplified) LLVM IR, containing a type mismatch:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```
For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```
As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an `addrspacecast` if there is one, to get the underlying global. Tracking issue: #135024
2025-01-24 | Make CodegenCx and Builder generic | Manuel Drehwald | -12/+138
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
2025-01-24 | Add comments about address spaces | Flakebi | -1/+1
2025-01-02 | Remove range-metadata amdgpu workaround | Flakebi | -8/+0
Range metadata was disabled for amdgpu due to a backend bug. I did not encounter any problems when removing the workaround to enable range metadata (tried compiling `core` and `alloc`), so I assume this has been fixed in LLVM in the last years. Remove the workaround to re-enable range metadata.
2025-01-02 | Cast global variables to default address space | Flakebi | -1/+3
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for global variables is not 0. For example, `core` does not compile in debug mode when not casting the address space to the default one because it tries to emit the following (simplified) LLVM IR, containing a type mismatch:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```
For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```
As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an `addrspacecast` if there is one, to get the underlying global.