path: root/compiler/rustc_codegen_llvm/src/builder.rs
Age | Commit message | Author | Lines
2025-02-19 | PR feedback | Scott McMurray | -2/+2
2025-02-19 | Emit `trunc nuw` for unchecked shifts and `to_immediate_scalar` | Scott McMurray | -2/+26
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information.
- Having this on the `i8` → `i1` truncations will hopefully help with some places that have to load `i8`s or pass them in LLVM structs without range information.
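A minimal sketch of the kind of source pattern affected (illustrative only; the function, and the IR shape in the comments, are assumptions rather than output taken from the commit, and `unchecked_shl` may require a recent or nightly toolchain):

```rust
// Illustrative sketch: an unchecked shift whose amount must be narrowed to the
// operand's width. The amount is guaranteed to be in range, so the truncation
// can carry `nuw` instead of needing a separate `llvm.assume`.
pub fn shl_unchecked(x: u8, amount: u32) -> u8 {
    // Expected IR shape (roughly):
    //   %amt = trunc nuw i32 %amount to i8
    //   %res = shl i8 %x, %amt
    unsafe { x.unchecked_shl(amount) }
}
```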
2025-02-19 | Emit getelementptr inbounds nuw for pointer::add() | Nikita Popov | -1/+22
2025-02-19 | Switch to the LLVMBuildGEPWithNoWrapFlags API | Nikita Popov | -3/+5
This API allows us to set the nuw flag as well.
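A minimal sketch of what this covers (illustrative only; the commented IR shape is an assumption, not taken from the commit):

```rust
// Illustrative sketch: `pointer::add` requires the offset to stay within the
// same allocation and not wrap the address space, so the emitted GEP can carry
// both `inbounds` and `nuw`.
pub unsafe fn nth_elem(p: *const u32, n: usize) -> *const u32 {
    // Expected IR shape (roughly):
    //   %r = getelementptr inbounds nuw i32, ptr %p, i64 %n
    unsafe { p.add(n) }
}
```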
2025-02-18 | Auto merge of #133852 - x17jiri:cold_path, r=saethlin | bors | -2/+46
improve cold_path()

#120370 added a new intrinsic `cold_path()` and used it to fix `likely` and `unlikely`. However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits the usefulness of `cold_path` for idiomatic Rust. For example, code like this:

```
if let Some(x) = y { ... }
```

may generate a 3-target switch:

```
switch y.discriminator:
  0 => true branch
  1 => false branch
  _ => unreachable
```

and therefore marking a branch as cold will have no effect. This PR improves `cold_path()` to work with arbitrary switch instructions.

Note that for 2-target switches we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. Clang's weight calculation is more complex than this PR's, which I believe is mainly because `switch` in C/C++ can have multiple cases going to the same target.
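A hedged usage sketch (nightly-only intrinsic; the exact lowering and the emitted weights are assumptions, not taken from the PR):

```rust
#![feature(core_intrinsics)]
use std::intrinsics::cold_path;

// Illustrative sketch: marking one arm of a match as cold. With this change the
// hint also takes effect when the discriminant check lowers to a multi-target
// `switch`, by emitting branch weights instead of relying on `llvm.expect`.
fn lookup(y: Option<u32>) -> u32 {
    match y {
        Some(x) => x + 1,
        None => {
            cold_path(); // this arm is expected to be rarely taken
            0
        }
    }
}
```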
2025-02-17 | improve cold_path() | Jiri Bobek | -2/+46
2025-02-13 | Set both `nuw` and `nsw` in slice size calculation | Scott McMurray | -0/+31
There's an old note in the code to do this, and now that LLVM-C has an API for it, we might as well.
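A minimal sketch of where this shows up (illustrative only; the commented IR is an assumption, not taken from the commit):

```rust
// Illustrative sketch: computing the byte size of a slice multiplies the length
// by the element size. Valid slice sizes never exceed isize::MAX, so that
// multiply can carry both `nuw` and `nsw`.
pub fn byte_len(s: &[u64]) -> usize {
    // Expected IR shape (roughly):
    //   %size = mul nuw nsw i64 %len, 8
    std::mem::size_of_val(s)
}
```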
2025-02-12 | Rollup merge of #135025 - Flakebi:alloca-addrspace, r=nikic | Jacob Pratt | -2/+4
Cast allocas to default address space

Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0.

For example, the following code compiles incorrectly when the address space is not cast to the default one:

```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```

results in

```llvm
%local = alloca addrspace(5) i8
%res = alloca addrspace(5) ptr

if:
  ; Store 64-bit flat pointer
  store ptr %p, ptr addrspace(5) %res

else:
  ; Store 32-bit scratch pointer
  store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
  ; Load and return 64-bit flat pointer
  %res.load = load ptr, ptr addrspace(5) %res
  ret ptr %res.load
```

For amdgpu, `addrspace(0)` pointers are 64-bit and `addrspace(5)` pointers are 32-bit. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`; then we store and load the correct type.

Tracking issue: #135024
2025-02-10 | Cast allocas to default address space | Flakebi | -2/+4
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0.

For example, the following code compiles incorrectly when the address space is not cast to the default one:

```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```

results in

```llvm
%local = alloca addrspace(5) i8
%res = alloca addrspace(5) ptr

if:
  ; Store 64-bit flat pointer
  store ptr %p, ptr addrspace(5) %res

else:
  ; Store 32-bit scratch pointer
  store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
  ; Load and return 64-bit flat pointer
  %res.load = load ptr, ptr addrspace(5) %res
  ret ptr %res.load
```

For amdgpu, `addrspace(0)` pointers are 64-bit and `addrspace(5)` pointers are 32-bit. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`; then we store and load the correct type.
2025-02-02 | Handle the case where the `or disjoint` folds immediately to a constant | Scott McMurray | -1/+7
2025-01-31 | Override `disjoint_or` in the LLVM backend | Scott McMurray | -0/+8
2025-01-30 | Rollup merge of #135026 - Flakebi:global-addrspace, r=saethlin | Matthias Krüger | -1/+3
Cast global variables to default address space

Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for global variables is not 0.

For example, `core` does not compile in debug mode when the address space is not cast to the default one, because it tries to emit the following (simplified) LLVM IR, which contains a type mismatch:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```

For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```

As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an addrspacecast if there is one, to get the underlying global.

Tracking issue: #135024
2025-01-24 | Make CodegenCx and Builder generic | Manuel Drehwald | -12/+138
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
2025-01-24 | Add comments about address spaces | Flakebi | -1/+1
2025-01-02 | Remove range-metadata amdgpu workaround | Flakebi | -8/+0
Range metadata was disabled for amdgpu due to a backend bug. I did not encounter any problems when removing the workaround to enable range metadata (tried compiling `core` and `alloc`), so I assume this has been fixed in LLVM in the intervening years. Remove the workaround to re-enable range metadata.
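A minimal sketch of the kind of load that gets range metadata again on amdgpu (illustrative only; the exact metadata form in the comment is an assumption):

```rust
// Illustrative sketch: loading a `bool` produces an `i8` load whose value is
// known to be 0 or 1. With the workaround removed, amdgpu gets the same range
// annotation as other targets instead of a plain load.
pub fn read_flag(flag: &bool) -> bool {
    // Expected IR shape (roughly):
    //   %v = load i8, ptr %flag, !range !{i8 0, i8 2}
    *flag
}
```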
2025-01-02 | Cast global variables to default address space | Flakebi | -1/+3
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for global variables is not 0.

For example, `core` does not compile in debug mode when the address space is not cast to the default one, because it tries to emit the following (simplified) LLVM IR, which contains a type mismatch:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```

For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```

As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an addrspacecast if there is one, to get the underlying global.
2025-01-01 | upstream rustc_codegen_llvm changes for enzyme/autodiff | Manuel Drehwald | -0/+2
2024-11-18 | use `TypingEnv` when no `infcx` is available | lcnr | -5/+5
The behavior of the type system depends not only on the current assumptions, but also on the current phase of the compiler. This is mostly necessary as we need to decide whether and how to reveal opaque types. We track this via the `TypingMode`.
2024-11-11 | CFI: Append debug location to CFI blocks | Bastian Kersting | -0/+7
2024-11-03 | compiler: Directly use rustc_abi in codegen | Jubilee Young | -1/+1
2024-10-30 | Rollup merge of #132246 - workingjubilee:campaign-on-irform, r=compiler-errors | Jubilee | -2/+2
Rename `rustc_abi::Abi` to `BackendRepr`

Remove the confabulation of `rustc_abi::Abi` with what "ABI" actually means by renaming it to `BackendRepr`, and rename `Abi::Aggregate` to `BackendRepr::Memory`. The type never actually represented how things are passed, as that has to take `PassMode` into account at minimum; rather, it is just how we represented some things to the backend.

This conflation arose because LLVM, the primary backend at the time, would lower certain IR forms using certain ABIs. Even that was only somewhat true, as it broke down when one ventured significantly afield of what is described by the System V AMD64 ABI, whether by using different architectures, ABI-modifying IR annotations, the same architecture **with different ISA extensions enabled**, or other... unexpected delights.

Unfortunately both names are still somewhat of a misnomer right now, as people have written code for years based on this misunderstanding. Still, their original names are even more so, and for better or worse, this backend code hasn't received as much maintenance as the rest of the compiler lately. Actually arriving at a correct end state will require us to disentangle a lot of code in order to fix it, much of it pointlessly repeated in several places. Thus this is not an "actual fix", just a way to deflect further misunderstandings.
2024-10-30 | Clean up FFI calls for operand bundles | Zalathar | -14/+9
2024-10-29 | compiler: `rustc_abi::Abi` => `BackendRepr` | Jubilee Young | -2/+2
The initial naming of "Abi" was an awful mistake, conveying wrong ideas about how psABIs worked and even more about what the enum meant. It was only meant to represent the way the value would be described to a codegen backend as it was lowered to that intermediate representation. It was never meant to mean anything about the actual psABI handling! The conflation is because LLVM typically will associate a certain form with a certain ABI, but even that does not hold when the special cases that actually exist arise, plus the IR annotations that modify the ABI.

Reframe `rustc_abi::Abi` as the `BackendRepr` of the type, and rename `BackendRepr::Aggregate` as `BackendRepr::Memory`. Unfortunately, due to the persistent misunderstandings, this too is now incorrect:

- Scattered ABI-relevant code is entangled with BackendRepr
- We do not always pre-compute a correct BackendRepr that reflects how we "actually" want this value to be handled, so we leave the backend interface to also inject various special-cases here
- In some cases `BackendRepr::Memory` is a "real" aggregate, but in others it is in fact using memory, and in some cases it is a scalar!

Our rustc-to-backend lowering code handles this sort of thing right now. That will eventually be addressed by lifting duplicated lowering code to either rustc_codegen_ssa or rustc_target as appropriate.
2024-10-25 | coverage: SSA doesn't need to know about `instrprof_increment` | Zalathar | -11/+12
2024-10-25 | coverage: Emit MC/DC intrinsics using the normal helper method | Zalathar | -46/+9
2024-10-25 | coverage: Emit `llvm.instrprof.increment` using the normal helper method | Zalathar | -24/+2
2024-10-08 | compiler: Factor rustc_target::abi out of cg_llvm | Jubilee Young | -4/+5
2024-10-08 | coverage. Adapt to mcdc mapping formats introduced by llvm 19 | zhuyunxing | -50/+13
2024-10-08 | coverage. Disable supporting mcdc on llvm-18 | zhuyunxing | -0/+13
2024-09-24 | Auto merge of #130389 - Luv-Ray:LLVMMDNodeInContext2, r=nikic | bors | -39/+20
llvm: replace some deprecated functions

`LLVMMDStringInContext` and `LLVMMDNodeInContext` are deprecated; replace them with `LLVMMDStringInContext2` and `LLVMMDNodeInContext2`. Also replace `Value` with `Metadata` in some function signatures for better consistency.
2024-09-22 | Reformat using the new identifier sorting from rustfmt | Michael Goulet | -1/+1
2024-09-19 | MetadataType type cast | Luv-Ray | -8/+7
2024-09-19 | wrap `LLVMSetMetadata` | Luv-Ray | -19/+12
2024-09-19 | Reformat some comments. | Nicholas Nethercote | -5/+5
So they are less than 100 chars.
2024-09-19 | Use a macro to factor out some repetitive code. | Nicholas Nethercote | -80/+27
Similar to the existing macro just above.
2024-09-19 | replace some deprecated functions | Luv-Ray | -36/+25
2024-09-18 | Update the minimum external LLVM to 18 | Josh Stone | -14/+4
2024-09-17 | Use associated type defaults in `{Layout,FnAbi}OfHelpers`. | Nicholas Nethercote | -4/+0
This avoids some repetitive boilerplate code.
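A hedged sketch of the pattern with a dummy trait (not the real rustc definitions; associated type defaults are a nightly-only feature):

```rust
#![feature(associated_type_defaults)]

// Illustrative sketch: the trait picks a default for the associated type once,
// so implementers that want the common choice no longer repeat it.
trait LayoutHelpers {
    type LayoutResult = u32;
    fn layout(&self) -> Self::LayoutResult;
}

struct Cx;

impl LayoutHelpers for Cx {
    // No `type LayoutResult = u32;` needed here; the default applies.
    fn layout(&self) -> Self::LayoutResult {
        0
    }
}

fn main() {
    let _ = Cx.layout();
}
```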
2024-09-17 | Merge `HasCodegen` into `BuilderMethods`. | Nicholas Nethercote | -4/+2
It has `Backend` and `Deref` bounds, plus an associated type `CodegenCx`, and it has a single use. This commit "inlines" it into `BuilderMethods`, which makes the complicated backend trait situation a little simpler.
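A hedged sketch of the shape of the change with dummy traits (simplified; not the actual rustc trait definitions):

```rust
use std::ops::Deref;

// Dummy stand-ins for the real backend traits.
trait Backend {}
trait CodegenMethods: Backend {}

// Previously a separate `HasCodegen` supertrait carried the `Deref` bound and
// the `CodegenCx` associated type; after this change they live directly on the
// builder trait.
trait BuilderMethods: Backend + Deref<Target = Self::CodegenCx> {
    type CodegenCx: CodegenMethods;
}
```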
2024-08-16 | Add `warn(unreachable_pub)` to `rustc_codegen_llvm`. | Nicholas Nethercote | -23/+35
2024-08-12 | Rollup merge of #128149 - RalfJung:nontemporal_store, r=jieyouxu,Amanieu,Jubilee | Guillaume Gomez | -7/+26
nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not: at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](https://github.com/llvm/llvm-project/issues/64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on https://github.com/rust-lang/stdarch/pull/1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes https://github.com/rust-lang/rust/issues/114582

Cc `@Amanieu` `@workingjubilee`
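A hedged usage sketch (nightly-only intrinsic; which targets keep the hint follows the description above):

```rust
#![feature(core_intrinsics)]
use std::intrinsics::nontemporal_store;

// Illustrative sketch: after this change the intrinsic is only ever a hint.
// On targets where `!nontemporal` cannot be treated as a pure hint (notably
// x86), this lowers to an ordinary store; on targets like ARM the hint is kept.
pub unsafe fn write_streaming(dst: *mut u32, value: u32) {
    unsafe { nontemporal_store(dst, value) }
}
```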
2024-08-12 | make the codegen test also cover an ill-behaved arch, and add links | Ralf Jung | -0/+2
2024-08-08 | Rename struct_tail_erasing_lifetimes to struct_tail_for_codegen | Michael Goulet | -1/+1
2024-08-08 | Do normalize when computing struct tails in codegen | Michael Goulet | -2/+3
2024-08-05 | RISC-V also has sane nontemporal stores | Ralf Jung | -1/+2
2024-08-05 | nontemporal_store: make sure that the intrinsic is truly just a hint | Ralf Jung | -7/+23
2024-07-30 | Auto merge of #128250 - Amanieu:select_unpredictable, r=nikic | bors | -0/+10
Add `select_unpredictable` to force LLVM to use CMOV

Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs into branches if they come from a `select` marked with an `unpredictable` metadata attribute. This PR introduces `core::intrinsics::select_unpredictable`, which emits such a `select` and uses it in the implementation of `binary_search_by`.
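A hedged usage sketch (nightly-only intrinsic at the time; the commented IR shape is an assumption, not taken from the PR):

```rust
#![feature(core_intrinsics)]
use std::intrinsics::select_unpredictable;

// Illustrative sketch: emits an LLVM `select` carrying `unpredictable`
// metadata, so the backend prefers a conditional move over a branch.
pub fn pick(cond: bool, a: u32, b: u32) -> u32 {
    // Expected IR shape (roughly):
    //   %r = select i1 %cond, i32 %a, i32 %b, !unpredictable !0
    select_unpredictable(cond, a, b)
}
```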
2024-07-29 | Reformat `use` declarations. | Nicholas Nethercote | -15/+17
The previous commit updated `rustfmt.toml` appropriately. This commit is the outcome of running `x fmt --all` with the new formatting options.
2024-07-28 | Force LLVM to use CMOV for binary search | Amanieu d'Antras | -0/+10
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs into branches if they come from a `select` marked with an `unpredictable` metadata attribute. This PR introduces `core::intrinsics::select_unpredictable`, which emits such a `select` and uses it in the implementation of `binary_search_by`.
2024-05-23 | Remove `#[macro_use] extern crate tracing` from `rustc_codegen_llvm`. | Nicholas Nethercote | -0/+1