path: root/compiler/rustc_codegen_llvm/src
Age | Commit message | Author | Lines (-/+)
2025-02-24 | Auto merge of #137271 - nikic:gep-nuw-2, r=scottmcm | bors | -12/+39
Emit getelementptr inbounds nuw for pointer::add()
Lower pointer::add (via intrinsic::offset with an unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the precondition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative.
Fixes https://github.com/rust-lang/rust/issues/137217.
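
For illustration, a minimal sketch of the kind of surface code this affects (the function and its name are hypothetical, not part of the change):

```rust
// With this change, the `p.add(n)` below lowers to
// `getelementptr inbounds nuw i32, ptr %p, i64 %n` on LLVM versions that
// support the `nuw` GEP flag, and to plain `inbounds` otherwise.
pub unsafe fn element_at(p: *const i32, n: usize) -> *const i32 {
    // Safety: the caller must keep the offset within one allocation and the
    // byte offset must not overflow -- exactly the precondition that
    // `inbounds nuw` communicates to LLVM.
    unsafe { p.add(n) }
}
```
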
2025-02-23 | Rollup merge of #136543 - RalfJung:round-ties-even, r=tgross35 | Trevor Gross | -14/+8
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic
LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that.
Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35`
try-job: test-various
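
A small illustration of the stable surface API that sits on top of this intrinsic (the specific values are just examples):

```rust
// `f32::round_ties_even` / `f64::round_ties_even` now lower through a single
// round_ties_even intrinsic rather than choosing between rint, roundeven and
// nearbyint in the frontend.
fn main() {
    assert_eq!(2.5_f64.round_ties_even(), 2.0); // a tie rounds to the even value
    assert_eq!(3.5_f64.round_ties_even(), 4.0);
    assert_eq!(2.4_f64.round_ties_even(), 2.0); // non-ties round as usual
}
```
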
2025-02-22 | Auto merge of #137420 - matthiaskrgr:rollup-rr0q37f, r=matthiaskrgr | bors | -2/+2
Rollup of 9 pull requests
Successful merges:
- #136910 (Implement feature `isolate_most_least_significant_one` for integer types)
- #137183 (Prune dead regionck code)
- #137333 (Use `edition = "2024"` in the compiler (redux))
- #137356 (Ferris 🦀 Identifier naming conventions)
- #137362 (Add build step log for `run-make-support`)
- #137377 (Always allow reusing cratenum in CrateLoader::load)
- #137388 (Fix(lib/fs/tests): Disable rename POSIX semantics FS tests under Windows 7)
- #137410 (Use StableHasher + Hash64 for dep_tracking_hash)
- #137413 (jubilee cleared out the review queue)
r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-21 | update autodiff flags | Manuel Drehwald | -28/+169
2025-02-21 | clean up autodiff code/comments | Manuel Drehwald | -10/+5
2025-02-22 | Fix overcapturing, unsafe extern blocks, and new unsafe ops | Michael Goulet | -2/+2
2025-02-21 | Rollup merge of #137313 - oli-obk:push-ywvuqkxuqyom, r=petrochenkov | Matthias Krüger | -85/+79
Some codegen_llvm cleanups
Using some more safe wrappers and thus being able to remove a large unsafe block. As a next step we should probably look into safe extern fns.
2025-02-20 | Remove `BackendRepr::Uninhabited`, replaced with an `uninhabited: bool` field in `LayoutData` | Zachary S | -9/+6
Also update comments that referred to BackendRepr::Uninhabited.
2025-02-20 | Merge two operations that were always performed together | Oli Scherer | -76/+70
2025-02-20 | Create safe helper for LLVMSetDLLStorageClass | Oli Scherer | -10/+10
2025-02-19 | Rework `OperandRef::extract_field` to stop calling `to_immediate_scalar` on things which are already immediates | Scott McMurray | -2/+8
That means it stops trying to truncate things that are already `i1`s.
2025-02-19 | PR feedback | Scott McMurray | -3/+2
2025-02-19 | Emit `trunc nuw` for unchecked shifts and `to_immediate_scalar` | Scott McMurray | -2/+27
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
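
A hedged sketch of the two situations described above (the function names are illustrative, and `unchecked_shl` may still require a nightly toolchain and the `unchecked_shifts` feature depending on the Rust version):

```rust
pub unsafe fn shl_in_range(x: u64, amt: u32) -> u64 {
    // Caller guarantees amt < 64, so the shift-amount truncation in the IR
    // can be emitted as `trunc nuw` instead of a separate `assume`.
    unsafe { x.unchecked_shl(amt) }
}

pub fn flag_at(flags: &[bool], i: usize) -> bool {
    // A `bool` is stored in memory as an `i8`; re-materializing it as an
    // immediate `i1` goes through `to_immediate_scalar`, whose truncation
    // can now carry `nuw`.
    flags[i]
}
```
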
2025-02-19 | Emit getelementptr inbounds nuw for pointer::add() | Nikita Popov | -1/+22
2025-02-19 | Switch to the LLVMBuildGEPWithNoWrapFlags API | Nikita Popov | -12/+18
This API allows us to set the nuw flag as well.
2025-02-19 | Rollup merge of #137210 - workingjubilee:fixup-passmode-import, r=RalfJung | Matthias Krüger | -27/+27
compiler: Stop reexporting stuff in cg_llvm::abi
The reexports confuse tooling like rustdoc into thinking cg_llvm is the source of key types that originate in rustc_target.
2025-02-18 | compiler: Stop reexporting stuff in cg_llvm::abi | Jubilee Young | -27/+27
The reexports confuse tooling like rustdoc into thinking cg_llvm is the source of key types that originate in rustc_target.
2025-02-18 | Auto merge of #133852 - x17jiri:cold_path, r=saethlin | bors | -2/+46
improve cold_path()
#120370 added a new intrinsic `cold_path()` and used it to fix `likely` and `unlikely`. However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits the usefulness of `cold_path` for idiomatic Rust. For example, code like this:
```
if let Some(x) = y { ... }
```
may generate a 3-target switch:
```
switch y.discriminator:
    0 => true branch
    1 => false branch
    _ => unreachable
```
and therefore marking a branch as cold will have no effect. This PR improves `cold_path()` to work with arbitrary switch instructions. Note that for 2-target switches we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. Clang's weight calculation is more complex than this PR's, which I believe is mainly because `switch` in C/C++ can have multiple cases going to the same target.
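
A hedged, nightly-only sketch of how the intrinsic is used from Rust (the exact path and feature gate of `cold_path` are assumptions here):

```rust
#![feature(core_intrinsics)]
#![allow(internal_features)]

use std::intrinsics::cold_path;

fn unwrap_or_slow_default(y: Option<u32>) -> u32 {
    if let Some(x) = y {
        x
    } else {
        // Rarely-taken branch: with this PR the hint is preserved as branch
        // weights even when the `if let` lowers to a multi-target switch.
        cold_path();
        0
    }
}

fn main() {
    assert_eq!(unwrap_or_slow_default(Some(7)), 7);
}
```
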
2025-02-18 | Move methods from `Map` to `TyCtxt`, part 2 | Nicholas Nethercote | -1/+1
Continuing the work started in #136466. Every method gains a `hir_` prefix, though for the ones that already have a `par_` or `try_par_` prefix I added the `hir_` after that.
2025-02-17 | improve cold_path() | Jiri Bobek | -2/+46
2025-02-17 | Rollup merge of #137095 - saethlin:use-hash64-for-hashes, r=workingjubilee | Matthias Krüger | -1/+2
Replace some u64 hashes with Hash64
I introduced the Hash64 and Hash128 types in https://github.com/rust-lang/rust/pull/110083, essentially as a mechanism to prevent hashes from landing in our leb128 encoding paths. If you have a u64 or u128 field in a struct and derive Encodable/Decodable, that number gets leb128-encoded. So if you need to store a hash, or some other value which behaves very much like a hash, don't store it as a u64.
This reverts part of https://github.com/rust-lang/rust/pull/117603, which turned an encoded Hash64 into a u64. Based on https://github.com/rust-lang/rust/pull/110083, I don't expect this to be perf-sensitive on its own, though I expect that it may help stabilize some of the small rmeta size fluctuations we currently see in perf reports.
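
A minimal sketch (not rustc's actual encoder) of why leb128 is a poor fit for hash-like values:

```rust
// leb128 stores 7 payload bits per byte, so small integers shrink, but a
// uniformly random 64-bit hash -- which almost always has its high bits
// set -- grows to 10 bytes instead of a fixed 8.
fn leb128_len(mut v: u64) -> usize {
    let mut len = 1;
    while v >= 0x80 {
        v >>= 7;
        len += 1;
    }
    len
}

fn main() {
    assert_eq!(leb128_len(0x3F), 1);                   // small value: 1 byte
    assert_eq!(leb128_len(0xDEAD_BEEF_DEAD_BEEF), 10); // hash-like value: 10 bytes
}
```
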
2025-02-16 | Move hashes from rustc_data_structure to rustc_hashes so they can be shared with rust-analyzer | Ben Kimock | -1/+2
2025-02-16 | Rollup merge of #136545 - durin42:nvptx64-align, r=nikic | Jacob Pratt | -0/+6
nvptx64: update default alignment to match LLVM 21
This changed in llvm/llvm-project@91cb8f5d3202870602c6bef807bc4c7ae8a32790. The commit itself is mostly about some intrinsic instructions, but as an aside it also mentions something about addrspace for tensor memory, which I believe is what this string is telling us.
`@rustbot` label: +llvm-main
2025-02-14 | Auto merge of #136575 - scottmcm:nsuw-math, r=nikic | bors | -0/+33
Set both `nuw` and `nsw` in slice size calculation
There's an old note in the code to do this, and now that [LLVM-C has an API for it](https://github.com/llvm/llvm-project/blob/f0b8ff12519270adcfef93410abff76ab073476a/llvm/include/llvm-c/Core.h#L4403-L4408), we might as well. And it's been there since what looks like LLVM 17 https://github.com/llvm/llvm-project/commit/de9b6aa341d8951625d62ae3dac8670ebb3eb006, so it doesn't even need to be conditional.
(There are other places, like `RawVecInner` or `Layout`, that might want to do things like this too, but I'll leave those for a future PR.)
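
For context, a minimal illustration (not the codegen code itself) of the computation being annotated:

```rust
// A slice's size in bytes is element size times length. A valid slice can
// never exceed isize::MAX bytes, so the multiply can carry both `nuw` and
// `nsw` in the emitted LLVM IR.
fn slice_bytes<T>(s: &[T]) -> usize {
    std::mem::size_of_val(s) // conceptually size_of::<T>() * s.len()
}

fn main() {
    assert_eq!(slice_bytes(&[1u32, 2, 3]), 12);
}
```
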
2025-02-14 | Auto merge of #137010 - workingjubilee:rollup-g00c07v, r=workingjubilee | bors | -7/+2
Rollup of 9 pull requests
Successful merges:
- #135439 (Make `-O` mean `OptLevel::Aggressive`)
- #136460 (Simplify `rustc_span` `analyze_source_file`)
- #136904 (add `IntoBounds` trait)
- #136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`)
- #136924 (Add profiling of bootstrap commands using Chrome events)
- #136951 (Use the right binder for rebinding `PolyTraitRef`)
- #136981 (ci: switch loongarch jobs to free runners)
- #136992 (Update backtrace)
- #136993 ([cg_llvm] Remove dead error message)
r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-13 | Rollup merge of #136993 - dpaoliello:cleanllvm4, r=workingjubilee | Jubilee | -5/+0
[cg_llvm] Remove dead error message
Part of #135502. Discovered a dead error message in rustc_codegen_llvm, so removing it.
r? `@Zalathar`
2025-02-13 | Set both `nuw` and `nsw` in slice size calculation | Scott McMurray | -0/+33
There's an old note in the code to do this, and now that LLVM-C has an API for it, we might as well.
2025-02-13 | Rollup merge of #136895 - maurer:fix-enum-discr, r=nikic | Jubilee | -1/+7
debuginfo: Set bitwidth appropriately in enum variant tags
Previously, we unconditionally set the bitwidth to 128 bits, the largest an enum discriminant could possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data.
LLVM added support for 128-bit enumerators in llvm/llvm-project#125578. That patchset trusts the constant to describe how wide the variant tag is, so the high 64 bits of zeros are considered potentially load-bearing. As a result, we went from emitting tags that looked like:
DW_AT_discr_value (0xfe)
(because `dwarf::BestForm` selected `data1`) to emitting tags that looked like:
DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )
This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support, which may not be supported by all debuggers / downstream tools.
3. Will result in smaller debug information.
2025-02-13 | [cg_llvm] Remove dead error message | Daniel Paoliello | -5/+0
2025-02-13 | Make `-O` mean `-C opt-level=3` | clubby789 | -2/+2
2025-02-13 | Rollup merge of #136881 - dpaoliello:cleanllvm3, r=Zalathar | Jacob Pratt | -421/+479
cg_llvm: Reduce visibility of all functions in the llvm module
Next part of #135502. This reduces the visibility of all functions in the `llvm` module to `pub(crate)` and marks the `enzyme_ffi` modules with `#![expect(dead_code)]` (as previously discussed: <https://github.com/rust-lang/rust/pull/135502#discussion_r1915608085>).
r? `@Zalathar`
2025-02-13 | Rollup merge of #136858 - safinaskar:parallel-cleanup-2025-02-11-07-54, r=SparrowLii | Jacob Pratt | -3/+0
Parallel-compiler-related cleanup
I carefully split changes into commits. Commit messages are self-explanatory. Squashing is not recommended.
cc "Parallel Rustc Front-end" https://github.com/rust-lang/rust/issues/113349
r? SparrowLii
`@rustbot` label: +WG-compiler-parallel
2025-02-13 | cg_llvm: Reduce visibility of all functions in the llvm module | Daniel Paoliello | -420/+479
2025-02-13 | Remove `LLVMGetModuleContext` | Zalathar | -1/+0
This was unused after the removal of `-Zprofile` in #131829.
2025-02-12 | Rollup merge of #136807 - workingjubilee:merge-gpus-to-get-the-arcradeongeforce, r=bjorn3 | Jacob Pratt | -1/+0
compiler: internally merge `PtxKernel` into `GpuKernel`
r? `@bjorn3` for review
2025-02-12 | Rollup merge of #136217 - taiki-e:csky-asm-flags, r=Amanieu | Jacob Pratt | -1/+3
Mark condition/carry bit as clobbered in C-SKY inline assembly
C-SKY's compare and some arithmetic/logical instructions modify the condition/carry bit (C) in PSR, but there is currently no way to mark it as clobbered in `asm!`. This PR marks it as clobbered except when [`options(preserves_flags)`](https://doc.rust-lang.org/reference/inline-assembly.html#r-asm.options.supported-options.preserves_flags) is used.
Refs:
- Section 1.3 "Programming model" and Section 1.3.5 "Condition/carry bit" in the CSKY Architecture user_guide: https://github.com/c-sky/csky-doc/blob/9f7121f7d40970ba5cc0f15716da033db2bb9d07/CSKY%20Architecture%20user_guide.pdf
  > Under user mode, condition/carry bit (C) is located in the lowest bit of PSR, and it can be accessed and changed by common user instructions. It is the only data bit that can be visited under user mode in PSR.
  > Condition or carry bit represents the result after one operation. Condition/carry bit can be clearly set according to the results of compare instructions or unclearly set as some high-precision arithmetic or logical instructions. In addition, special instructions such as DEC[GT,LT,NE] and XTRB[0-3] will influence the value of condition/carry bit.
- Register definition in LLVM: https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/CSKY/CSKYRegisterInfo.td#L88
cc `@Dirreke` ([target maintainer](https://github.com/rust-lang/rust/blob/aa6f5ab18e67cb815f73e0d53d217bc54b0da924/src/doc/rustc/src/platform-support/csky-unknown-linux-gnuabiv2.md#target-maintainers))
r? `@Amanieu`
`@rustbot` label +O-csky +A-inline-assembly
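
A hedged sketch of the user-visible effect (it assumes a C-SKY target where `asm!` is still gated behind the nightly `asm_experimental_arch` feature, and the asm templates are left empty purely for illustration):

```rust
#![feature(asm_experimental_arch)]

#[cfg(target_arch = "csky")]
unsafe fn example() {
    use core::arch::asm;

    // Default: the compiler now treats the condition/carry bit (C) in PSR
    // as clobbered by the asm block.
    asm!("", options(nomem, nostack));

    // With `preserves_flags`, the asm promises to leave the C bit untouched,
    // so no clobber is recorded.
    asm!("", options(nomem, nostack, preserves_flags));
}
```
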
2025-02-12 | Rollup merge of #135025 - Flakebi:alloca-addrspace, r=nikic | Jacob Pratt | -2/+4
Cast allocas to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is casted to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0.
For example, the following code compiles incorrectly when not casting the address space to the default one:
```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```
results in
```llvm
%local = alloca addrspace(5) i8
%res = alloca addrspace(5) ptr

if:
  ; Store 64-bit flat pointer
  store ptr %p, ptr addrspace(5) %res

else:
  ; Store 32-bit scratch pointer
  store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
  ; Load and return 64-bit flat pointer
  %res.load = load ptr, ptr addrspace(5) %res
  ret ptr %res.load
```
For amdgpu, `addrspace(0)` are 64-bit pointers, `addrspace(5)` are 32-bit pointers. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type.
Tracking issue: #135024
2025-02-12 | debuginfo: Set bitwidth appropriately in enum variant tags | Matthew Maurer | -1/+7
Previously, we unconditionally set the bitwidth to 128 bits, the largest a discriminant could possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit discriminators, so this would also have occasionally resulted in truncated data (or an assert) if more than 64 bits were used.
LLVM added support for 128-bit enumerators in llvm/llvm-project#125578. That patchset also trusts the constant to describe how wide the variant tag is. As a result, we went from emitting tags that looked like:
DW_AT_discr_value (0xfe)
(`form1`) to emitting tags that looked like:
DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )
This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support, which may not be supported by all debuggers / downstream tools.
3. Will result in smaller debug information.
2025-02-12 | Rollup merge of #135549 - oli-obk:push-tmxtpnrloyqu, r=compiler-errors | Matthias Krüger | -147/+110
Document some safety constraints and use more safe wrappers
Lots of unsafe codegen_llvm code has safe wrappers already, so I used some of them and added some where applicable. I stopped here because this diff is large enough and should probably be reviewed independently of other changes.
2025-02-11 | Document some safety constraints and use more safe wrappers | Oli Scherer | -59/+50
2025-02-11 | Add a safe wrapper for `WriteBitcodeToFile` | Oli Scherer | -8/+10
2025-02-11 | Remove an unsafe closure invariant by inlining the closure wrapper into the called function | Oli Scherer | -80/+50
2025-02-11 | compiler/rustc_codegen_llvm/src/lib.rs: remove "unsafe impl Send/Sync" | Askar Safin | -3/+0
2025-02-11 | Rollup merge of #136813 - mrkajetanp:aarch32-fp16-target-feature, r=davidtwco | Jacob Pratt | -0/+1
rustc_target: Add the fp16 target feature for AArch32
As in the commit description. The feature is already available in rustc for AArch64.
2025-02-11 | Rollup merge of #136721 - dpaoliello:cleanllvm2, r=Zalathar | Jacob Pratt | -8/+8
cg_llvm: Reduce visibility of some items outside the `llvm` module
Next piece of #135502. This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
2025-02-10 | Cast allocas to default address space | Flakebi | -2/+4
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is casted to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0.
For example, the following code compiles incorrectly when not casting the address space to the default one:
```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```
results in
```llvm
%local = alloca addrspace(5) i8
%res = alloca addrspace(5) ptr

if:
  ; Store 64-bit flat pointer
  store ptr %p, ptr addrspace(5) %res

else:
  ; Store 32-bit scratch pointer
  store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
  ; Load and return 64-bit flat pointer
  %res.load = load ptr, ptr addrspace(5) %res
  ret ptr %res.load
```
For amdgpu, `addrspace(0)` are 64-bit pointers, `addrspace(5)` are 32-bit pointers. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type.
2025-02-10 | rustc_codegen_llvm: Mark items as pub(crate) outside of the llvm module | Daniel Paoliello | -8/+8
2025-02-10 | Rollup merge of #136419 - EnzymeAD:autodiff-tests, r=onur-ozkan,jieyouxu | Matthias Krüger | -49/+64
adding autodiff tests
I'd like to get started with upstreaming some tests, even though I'm still waiting for an answer on how to best integrate the enzyme pass. Can we therefore temporarily support the -Z llvm-plugins here without too much effort? And in that case, how would that work? I saw you can do remapping, e.g. `rust-src-base`, but I don't think that will give me the path to libEnzyme.so. Do you have another suggestion?
Other than that, this test simply checks that the derivative of `x*x` is `2.0 * x`, which in this case is computed as `%0 = fadd fast double %x.0.val, %x.0.val` (I'll add a few more tests and move it to an autodiff folder if we can use the -Z flag).
r? `@jieyouxu`
Locally at least `-Zllvm-plugins=${PWD}/build/x86_64-unknown-linux-gnu/enzyme/build/Enzyme/libEnzyme-19.so` seems to work if I copy the command I get from x.py test and run it manually. However, running x.py test itself fails.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
Zulip discussion: https://rust-lang.zulipchat.com/#narrow/channel/326414-t-infra.2Fbootstrap/topic/Enzyme.20build.20changes
2025-02-10 | Rollup merge of #136053 - Zalathar:defer-counters, r=saethlin | Jubilee | -48/+31
coverage: Defer part of counter-creation until codegen
Follow-up to #135481 and #135873.
One of the pleasant properties of the new counter-assignment algorithm is that we can stop partway through the process, store the intermediate state in MIR, and then resume the rest of the algorithm during codegen. This lets it take into account which parts of the control-flow graph were eliminated by MIR opts, resulting in fewer physical counters and simpler counter expressions.
Those improvements end up completely obsoleting much larger chunks of code that were previously responsible for cleaning up the coverage metadata after MIR opts, while also doing a more thorough cleanup job. (That change also unlocks some further simplifications that I've kept out of this PR to limit its scope.)
2025-02-09 | compiler: internally merge `Conv::PtxKernel` into `GpuKernel` | Jubilee Young | -1/+0
It is speculated that these two can be conceptually merged, and it can start by ripping out rustc's notion of the PtxKernel call convention. Leave the ExternAbi for now, but the nvptx target should now see it as just a different way to spell Conv::GpuKernel.