path: root/compiler/rustc_codegen_llvm/src
Age / Commit message / Author / Lines
2025-03-07Revert "Auto merge of #135335 - oli-obk:push-zxwssomxxtnq, r=saethlin"Michael Goulet-5/+0
This reverts commit a7a6c64a657f68113301c2ffe0745b49a16442d1, reversing changes made to ebbe63891f1fae21734cb97f2f863b08b1d44bf8. (cherry picked from commit a59a8f9e7579b4346eb6b00c3809d04986dcfcee)
2025-02-14Auto merge of #136575 - scottmcm:nsuw-math, r=nikicbors-0/+33
Set both `nuw` and `nsw` in slice size calculation There's an old note in the code to do this, and now that [LLVM-C has an API for it](https://github.com/llvm/llvm-project/blob/f0b8ff12519270adcfef93410abff76ab073476a/llvm/include/llvm-c/Core.h#L4403-L4408), we might as well. And it's been there since what looks like LLVM 17 https://github.com/llvm/llvm-project/commit/de9b6aa341d8951625d62ae3dac8670ebb3eb006 so doesn't even need to be conditional. (There's other places, like `RawVecInner` or `Layout`, that might want to do things like this too, but I'll leave those for a future PR.)
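The reasoning behind setting both flags can be sketched in plain Rust. This is only an illustration of the invariant, not the code rustc emits: the helper name `slice_size_bytes` is hypothetical, and the sketch assumes the documented Rust guarantee that no value is larger than `isize::MAX` bytes, so the multiplication wraps neither as an unsigned nor as a signed operation.

```rust
use std::mem::size_of;

/// Hypothetical helper: byte size of a slice, computed as
/// element count times element size.
fn slice_size_bytes<T>(slice: &[T]) -> usize {
    let len = slice.len();
    let elem = size_of::<T>();
    // The unsigned multiplication must not wrap (the `nuw` claim).
    let size = len.checked_mul(elem).expect("slice size wrapped as unsigned");
    // The result also fits in `isize`, so a signed multiply
    // would not wrap either (the `nsw` claim).
    assert!(size <= isize::MAX as usize);
    size
}
```

Calling `slice_size_bytes(&[1u32, 2, 3])` multiplies 3 elements by 4 bytes each; both checks pass for any valid slice.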
2025-02-14Auto merge of #137010 - workingjubilee:rollup-g00c07v, r=workingjubileebors-7/+2
Rollup of 9 pull requests
Successful merges:
- #135439 (Make `-O` mean `OptLevel::Aggressive`)
- #136460 (Simplify `rustc_span` `analyze_source_file`)
- #136904 (add `IntoBounds` trait)
- #136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`)
- #136924 (Add profiling of bootstrap commands using Chrome events)
- #136951 (Use the right binder for rebinding `PolyTraitRef`)
- #136981 (ci: switch loongarch jobs to free runners)
- #136992 (Update backtrace)
- #136993 ([cg_llvm] Remove dead error message)
r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-13Rollup merge of #136993 - dpaoliello:cleanllvm4, r=workingjubileeJubilee-5/+0
[cg_llvm] Remove dead error message Part of #135502 Discovered a dead error message in rustc_codegen_llvm, so removing it. r? ``@Zalathar``
2025-02-13Set both `nuw` and `nsw` in slice size calculationScott McMurray-0/+33
There's an old note in the code to do this, and now that LLVM-C has an API for it, we might as well.
2025-02-13Rollup merge of #136895 - maurer:fix-enum-discr, r=nikicJubilee-1/+7
debuginfo: Set bitwidth appropriately in enum variant tags Previously, we unconditionally set the bitwidth to 128-bits, the largest an enum would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data. LLVM added support for 128-bit enumerators in llvm/llvm-project#125578 That patchset trusts the constant to describe how wide the variant tag is, so the high 64-bits of zeros are considered potentially load-bearing. As a result, we went from emitting tags that looked like: DW_AT_discr_value (0xfe) (because `dwarf::BestForm` selected `data1`) to emitting tags that looked like: DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ) This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which: 1. Is probably closer to our intentions in terms of describing the data. 2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools. 3. Will result in smaller debug information.
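The size difference described above can be illustrated with a small sketch. It does not reproduce how rustc or LLVM encode DWARF; the helper `discr_bytes` is a hypothetical name, and the only point is that a constant carried at the tag's own width needs far fewer bytes than one carried at 128 bits.

```rust
/// Hypothetical helper: little-endian bytes of a discriminant value
/// when it is stored in a tag that is `tag_bits` wide.
fn discr_bytes(value: u128, tag_bits: u32) -> Vec<u8> {
    // Only whole-byte widths up to 128 bits are modelled here.
    assert!(tag_bits % 8 == 0 && tag_bits >= 8 && tag_bits <= 128);
    let n = (tag_bits / 8) as usize;
    // Keep just the low `n` bytes of the 16-byte representation.
    value.to_le_bytes()[..n].to_vec()
}
```

With an 8-bit tag, `discr_bytes(0xfe, 8)` yields a single byte, while `discr_bytes(0xfe, 128)` yields sixteen bytes, fifteen of which are zero.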
2025-02-13[cg_llvm] Remove dead error messageDaniel Paoliello-5/+0
2025-02-13Make `-O` mean `-C opt-level=3`clubby789-2/+2
2025-02-13Rollup merge of #136881 - dpaoliello:cleanllvm3, r=ZalatharJacob Pratt-421/+479
cg_llvm: Reduce visibility of all functions in the llvm module Next part of #135502 This reduces the visibility of all functions in the `llvm` module to `pub(crate)` and marks the `enzyme_ffi` modules with `#![expect(dead_code)]` (as previously discussed: <https://github.com/rust-lang/rust/pull/135502#discussion_r1915608085>). r? ``@Zalathar``
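A minimal sketch of the pattern, assuming a toolchain where the `expect` lint attribute is stable. The function names are hypothetical, and the sketch uses an outer `#[expect(dead_code)]` on a single item rather than the module-level `#![expect(dead_code)]` mentioned above.

```rust
mod llvm {
    // Visible only inside this crate, so dead-code analysis can
    // tell whether anything actually uses it.
    pub(crate) fn used_wrapper() -> u32 {
        1
    }

    // `expect` suppresses the lint like `allow` does, but warns
    // once the item is no longer dead.
    #[expect(dead_code)]
    pub(crate) fn unused_wrapper() -> u32 {
        2
    }
}

/// Hypothetical caller that keeps `used_wrapper` alive.
fn call_wrapper() -> u32 {
    llvm::used_wrapper()
}
```

With plain `pub`, an unused function in a public module is treated as part of the crate's API and is never reported as dead, which is why narrowing visibility helps find unused items.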
2025-02-13Rollup merge of #136858 - safinaskar:parallel-cleanup-2025-02-11-07-54, r=SparrowLiiJacob Pratt-3/+0
Parallel-compiler-related cleanup I carefully split changes into commits. Commit messages are self-explanatory. Squashing is not recommended. cc "Parallel Rustc Front-end" https://github.com/rust-lang/rust/issues/113349 r? SparrowLii ``@rustbot`` label: +WG-compiler-parallel
2025-02-13cg_llvm: Reduce visibility of all functions in the llvm moduleDaniel Paoliello-420/+479
2025-02-13Remove `LLVMGetModuleContext`Zalathar-1/+0
This was unused after the removal of `-Zprofile` in #131829.
2025-02-12Rollup merge of #136807 - workingjubilee:merge-gpus-to-get-the-arcradeongeforce, r=bjorn3Jacob Pratt-1/+0
compiler: internally merge `PtxKernel` into `GpuKernel` r? ``@bjorn3`` for review
2025-02-12Rollup merge of #136217 - taiki-e:csky-asm-flags, r=AmanieuJacob Pratt-1/+3
Mark condition/carry bit as clobbered in C-SKY inline assembly C-SKY's compare and some arithmetic/logical instructions modify condition/carry bit (C) in PSR, but there is currently no way to mark it as clobbered in `asm!`. This PR marks it as clobbered except when [`options(preserves_flags)`](https://doc.rust-lang.org/reference/inline-assembly.html#r-asm.options.supported-options.preserves_flags) is used. Refs: - Section 1.3 "Programming model" and Section 1.3.5 "Condition/carry bit" in CSKY Architecture user_guide: https://github.com/c-sky/csky-doc/blob/9f7121f7d40970ba5cc0f15716da033db2bb9d07/CSKY%20Architecture%20user_guide.pdf > Under user mode, condition/carry bit (C) is located in the lowest bit of PSR, and it can be accessed and changed by common user instructions. It is the only data bit that can be visited under user mode in PSR. > Condition or carry bit represents the result after one operation. Condition/carry bit can be clearly set according to the results of compare instructions or unclearly set as some high-precision arithmetic or logical instructions. In addition, special instructions such as DEC[GT,LT,NE] and XTRB[0-3] will influence the value of condition/carry bit. - Register definition in LLVM: https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/CSKY/CSKYRegisterInfo.td#L88 cc ```@Dirreke``` ([target maintainer](https://github.com/rust-lang/rust/blob/aa6f5ab18e67cb815f73e0d53d217bc54b0da924/src/doc/rustc/src/platform-support/csky-unknown-linux-gnuabiv2.md#target-maintainers)) r? ```@Amanieu``` ```@rustbot``` label +O-csky +A-inline-assembly
2025-02-12Rollup merge of #135025 - Flakebi:alloca-addrspace, r=nikicJacob Pratt-2/+4
Cast allocas to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0. For example the following code compiles incorrectly when not casting the address space to the default one:
```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```
results in
```llvm
    %local = alloca addrspace(5) i8
    %res = alloca addrspace(5) ptr
if:
    ; Store 64-bit flat pointer
    store ptr %p, ptr addrspace(5) %res
else:
    ; Store 32-bit scratch pointer
    store ptr addrspace(5) %local, ptr addrspace(5) %res
ret:
    ; Load and return 64-bit flat pointer
    %res.load = load ptr, ptr addrspace(5) %res
    ret ptr %res.load
```
For amdgpu, `addrspace(0)` are 64-bit pointers, `addrspace(5)` are 32-bit pointers. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type. Tracking issue: #135024
2025-02-12debuginfo: Set bitwidth appropriately in enum variant tagsMatthew Maurer-1/+7
Previously, we unconditionally set the bitwidth to 128-bits, the largest a discriminator would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit discriminators, so this would also have occasionally resulted in truncated data (or an assert) if more than 64-bits were used. LLVM added support for 128-bit enumerators in llvm/llvm-project#125578. That patchset also trusts the constant to describe how wide the variant tag is. As a result, we went from emitting tags that looked like: DW_AT_discr_value (0xfe) (`form1`) to emitting tags that looked like: DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ) This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which: 1. Is probably closer to our intentions in terms of describing the data. 2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools. 3. Will result in smaller debug information.
2025-02-12Rollup merge of #135549 - oli-obk:push-tmxtpnrloyqu, r=compiler-errorsMatthias Krüger-147/+110
Document some safety constraints and use more safe wrappers Lots of unsafe codegen_llvm code has safe wrappers already, so I used some of them and added some where applicable. I stopped here because this diff is large enough and should probably be reviewed independently of other changes.
2025-02-11Document some safety constraints and use more safe wrappersOli Scherer-59/+50
2025-02-11Add a safe wrapper for `WriteBitcodeToFile`Oli Scherer-8/+10
2025-02-11Remove an unsafe closure invariant by inlining the closure wrapper into the called functionOli Scherer-80/+50
2025-02-11compiler/rustc_codegen_llvm/src/lib.rs: remove "unsafe impl Send/Sync"Askar Safin-3/+0
2025-02-11Rollup merge of #136813 - mrkajetanp:aarch32-fp16-target-feature, r=davidtwcoJacob Pratt-0/+1
rustc_target: Add the fp16 target feature for AArch32 As in the commit description. The feature is already available in rustc for AArch64.
2025-02-11Rollup merge of #136721 - dpaoliello:cleanllvm2, r=ZalatharJacob Pratt-8/+8
cg_llvm: Reduce visibility of some items outside the `llvm` module Next piece of #135502 This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
2025-02-10Cast allocas to default address spaceFlakebi-2/+4
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is cast to the default address space before its value is used. This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0. For example the following code compiles incorrectly when not casting the address space to the default one:
```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```
results in
```llvm
    %local = alloca addrspace(5) i8
    %res = alloca addrspace(5) ptr
if:
    ; Store 64-bit flat pointer
    store ptr %p, ptr addrspace(5) %res
else:
    ; Store 32-bit scratch pointer
    store ptr addrspace(5) %local, ptr addrspace(5) %res
ret:
    ; Load and return 64-bit flat pointer
    %res.load = load ptr, ptr addrspace(5) %res
    ret ptr %res.load
```
For amdgpu, `addrspace(0)` are 64-bit pointers, `addrspace(5)` are 32-bit pointers. The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type.
2025-02-10rustc_codegen_llvm: Mark items as pub(crate) outside of the llvm moduleDaniel Paoliello-8/+8
2025-02-10Rollup merge of #136419 - EnzymeAD:autodiff-tests, r=onur-ozkan,jieyouxuMatthias Krüger-49/+64
adding autodiff tests I'd like to get started with upstreaming some tests, even though I'm still waiting for an answer on how to best integrate the enzyme pass. Can we therefore temporarily support the -Z llvm-plugins here without too much effort? And in that case, how would that work? I saw you can do remapping, e.g. `rust-src-base`, but I don't think that will give me the path to libEnzyme.so. Do you have another suggestion? Other than that this test simply checks that the derivative of `x*x` is `2.0 * x`, which in this case is computed as `%0 = fadd fast double %x.0.val, %x.0.val` (I'll add a few more tests and move it to an autodiff folder if we can use the -Z flag) r? ``@jieyouxu`` Locally at least `-Zllvm-plugins=${PWD}/build/x86_64-unknown-linux-gnu/enzyme/build/Enzyme/libEnzyme-19.so` seems to work if I copy the command I get from x.py test and run it manually. However, running x.py test itself fails. Tracking: - https://github.com/rust-lang/rust/issues/124509 Zulip discussion: https://rust-lang.zulipchat.com/#narrow/channel/326414-t-infra.2Fbootstrap/topic/Enzyme.20build.20changes
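The property the test checks can be mirrored in plain Rust without Enzyme. This sketch does not use autodiff at all: the function names are hypothetical, the derivative is written by hand as `x + x` (matching the `fadd` mentioned above), and a central finite difference serves as an independent numerical cross-check.

```rust
/// The function being differentiated.
fn square(x: f64) -> f64 {
    x * x
}

/// Hand-written derivative of `square`, in the `x + x` form
/// that the description says appears in the optimized IR.
fn d_square(x: f64) -> f64 {
    x + x
}

/// Central finite difference, used only as a numerical cross-check.
fn central_diff(f: fn(f64) -> f64, x: f64, h: f64) -> f64 {
    (f(x + h) - f(x - h)) / (2.0 * h)
}
```

For a quadratic the central difference is exact up to rounding, so `central_diff(square, 3.0, 1e-3)` should agree closely with `d_square(3.0)`.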
2025-02-10Rollup merge of #136053 - Zalathar:defer-counters, r=saethlinJubilee-48/+31
coverage: Defer part of counter-creation until codegen Follow-up to #135481 and #135873. One of the pleasant properties of the new counter-assignment algorithm is that we can stop partway through the process, store the intermediate state in MIR, and then resume the rest of the algorithm during codegen. This lets it take into account which parts of the control-flow graph were eliminated by MIR opts, resulting in fewer physical counters and simpler counter expressions. Those improvements end up completely obsoleting much larger chunks of code that were previously responsible for cleaning up the coverage metadata after MIR opts, while also doing a more thorough cleanup job. (That change also unlocks some further simplifications that I've kept out of this PR to limit its scope.)
2025-02-09compiler: internally merge `Conv::PtxKernel` into `GpuKernel`Jubilee Young-1/+0
It is speculated that these two can be conceptually merged, and it can start by ripping out rustc's notion of the PtxKernel call convention. Leave the ExternAbi for now, but the nvptx target now should see it as just a different way to spell Conv::GpuKernel.
2025-02-10remove outdated *First autodiff variants for higher-order adManuel Drehwald-2/+0
2025-02-10move second opt run to lto phase and cleanup codeManuel Drehwald-46/+55
2025-02-09Auto merge of #136751 - bjorn3:update_rustfmt, r=Mark-Simulacrumbors-264/+290
Update bootstrap compiler and rustfmt The rustfmt version we previously used formats things differently from what the latest nightly rustfmt does. This causes issues for subtrees that get formatted both in-tree and in their own repo. Updating the rustfmt used in-tree solves those issues. Also bumped the bootstrap compiler as the stage0 update command always updates both at the same time.
2025-02-09Auto merge of #136754 - Urgau:rollup-qlkhjqr, r=Urgaubors-1/+5
Rollup of 5 pull requests
Successful merges:
- #134679 (Windows: remove readonly files)
- #136213 (Allow Rust to use a number of libc filesystem calls)
- #136530 (Implement `x perf` directly in bootstrap)
- #136601 (Detect (non-raw) borrows of null ZST pointers in CheckNull)
- #136659 (Pick the max DWARF version when LTO'ing modules with different versions)
r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-08Rollup merge of #136706 - workingjubilee:finish-up-rustc-abi-updates, r=compiler-errorsJubilee-6/+5
compiler: mostly-finish `rustc_abi` updates This almost-finishes all the updates in the compiler to use `rustc_abi` and removes some of the reexports of `rustc_abi` items in `rustc_target` that were previously available. r? ```@compiler-errors```
2025-02-09Rollup merge of #136659 - wesleywiser:dwarf_version_lto_merge_behavior, r=jieyouxuUrgau-1/+5
Pick the max DWARF version when LTO'ing modules with different versions
Currently, when rustc compiles code with `-Clto` enabled that was built with different choices for `-Zdwarf-version`, a warning will be reported. It's very easy to observe this by compiling most anything (eg, "hello world") and specifying `-Clto -Zdwarf-version=5` since the standard library is distributed with `-Zdwarf-version=4`. This behavior isn't actually useful for a few reasons:
- From observation, LLVM chooses to pick the highest DWARF version anyway after issuing the warning.
- Clang specifies that in this case, the max version should be picked without a warning and as a general principle, we want to support x-lang LTO with Clang which implies using the same module flag merge behaviors.
- Debuggers need to be able to handle a variety of versions within the same debugging session as you can easily have some parts of a binary (or some dynamic libraries within an application) all compiled with different DWARF versions.
This commit changes the module flag merge behavior to match Clang and use the highest version of DWARF. It also adds a test to ensure this behavior is respected in the case of two crates being LTO'd together and adds a test to ensure no warning is printed. Fixes #130041 which fails due to these warnings being printed cc #103057
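The merge rule described above amounts to taking a maximum. The following is a toy model only, not LLVM's module-flag machinery: `merge_dwarf_versions` is a hypothetical name, and the per-module versions are passed in as plain integers.

```rust
/// Hypothetical model of a "max" merge: when linked modules carry
/// different DWARF versions, the highest one wins and nothing is
/// reported. Returns `None` when there are no modules at all.
fn merge_dwarf_versions(versions: &[u32]) -> Option<u32> {
    versions.iter().copied().max()
}
```

For the scenario in the description (a standard library built with version 4 linked into a crate built with version 5), `merge_dwarf_versions(&[4, 5])` selects 5.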
2025-02-08Rustfmtbjorn3-264/+290
2025-02-08Pick the max DWARF version when LTO'ing modules with different versionsWesley Wiser-1/+5
Currently, when rustc compiles code with `-Clto` enabled that was built with different choices for `-Zdwarf-version`, a warning will be reported. It's very easy to observe this by compiling most anything (eg, "hello world") and specifying `-Clto -Zdwarf-version=5` since the standard library is distributed with `-Zdwarf-version=4`. This behavior isn't actually useful for a few reasons:
- from observation, LLVM chooses to pick the highest DWARF version anyway after issuing the warning
- Clang specifies that in this case, the max version should be picked without a warning and as a general principle, we want to support x-lang LTO with Clang which implies using the same module flag merge behaviors
- Debuggers need to be able to handle a variety of versions within the same debugging session as you can easily have some parts of a binary (or some dynamic libraries within an application) all compiled with different DWARF versions
This commit changes the module flag merge behavior to match Clang and use the highest version of DWARF. It also adds a test to ensure this behavior is respected in the case of two crates being LTO'd together and updates the test added in the previous commit to ensure no warning is printed.
2025-02-07fix non-enzyme buildsManuel Drehwald-1/+4
2025-02-08Rollup merge of #136691 - bjorn3:linkage_cleanup, r=jieyouxuMatthias Krüger-6/+1
Remove Linkage::Private and Linkage::Appending Neither of them has any use case. Neither known nor theoretical.
2025-02-08Rollup merge of #136640 - Zalathar:debuginfo-align-bits, r=compiler-errorsMatthias Krüger-8/+5
Debuginfo for function ZSTs should have alignment of 8 bits, not 1 bit In #116096, function ZSTs were made to have debuginfo that gives them an alignment of “1”. But because alignment in LLVM debuginfo is denoted in *bits*, not bytes, this resulted in an alignment specification of 1 bit instead of 1 byte. I don't know whether this has any practical consequences, but I noticed that a test started failing when I accidentally fixed the mistake while working on #136632, so I extracted the fix (and the test adjustment) to this PR.
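The unit mix-up is easy to see in plain Rust. This is a sketch, not the rustc debuginfo code: `example` and `fn_item_align_bits` are hypothetical names, and the conversion relies only on the statement above that LLVM debuginfo expresses alignment in bits.

```rust
use std::mem::{align_of_val, size_of_val};

/// Hypothetical function whose function-item type we inspect.
fn example() {}

/// Alignment of a function item's zero-sized type, converted
/// from bytes (what Rust reports) to bits.
fn fn_item_align_bits() -> u64 {
    let f = example; // a function item, not a function pointer
    // Function items carry no data, so their type is zero-sized.
    debug_assert_eq!(size_of_val(&f), 0);
    // `align_of_val` reports bytes; multiply by 8 to get bits.
    (align_of_val(&f) as u64) * 8
}
```

Passing the byte value through unchanged would describe an alignment of 1 bit, which is the mistake the PR corrects.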
2025-02-07compiler: remove reexports from rustc_target::callconvJubilee Young-6/+5
2025-02-07rustc_target: Add the fp16 target feature for AArch32Kajetan Puchalski-0/+1
2025-02-07Remove Linkage::Appendingbjorn3-1/+0
It can only be used for certain LLVM internal variables like llvm.global_ctors which users are not allowed to define.
2025-02-07Remove Linkage::Privatebjorn3-5/+1
This is the same as Linkage::Internal except that it doesn't emit any symbol. Some backends may not support it and it isn't all that useful anyway.
2025-02-06Remove dead code from rustc_codegen_llvm and the LLVM wrapperDaniel Paoliello-29/+0
2025-02-06Debuginfo for function ZSTs should have alignment of 8 bits, not 1 bitZalathar-8/+5
2025-02-06Auto merge of #136471 - safinaskar:parallel, r=SparrowLiibors-2/+2
tree-wide: parallel: Fully removed all `Lrc`, replaced with `Arc` This is a continuation of https://github.com/rust-lang/rust/pull/132282 . I'm pretty sure I did everything right. In particular, I searched all occurrences of `Lrc` in submodules and made sure that they don't need replacement. There are other possibilities, though. We can define `enum Lrc<T> { Rc(Rc<T>), Arc(Arc<T>) }`. Or we can make `Lrc` a union and on every clone we can read from a special thread-local variable. Or we can add a generic parameter to `Lrc` and, yes, this parameter will be everywhere across the whole codebase. So, if you think we should take some alternative approach, then don't merge this PR. But if it is decided to stick with `Arc`, then, please, merge. cc "Parallel Rustc Front-end" ( https://github.com/rust-lang/rust/issues/113349 ) r? SparrowLii `@rustbot` label WG-compiler-parallel
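As a generic illustration of why `Arc` fits a parallel front-end, the sketch below shares one allocation across several threads. It is not taken from the compiler: `sum_in_threads` is a hypothetical function, and the example shows only the standard-library behaviour of `Arc`.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical example: each worker thread sums the same shared
/// vector, and the per-thread sums are added together.
fn sum_in_threads(data: Vec<u64>, workers: usize) -> u64 {
    let shared = Arc::new(data);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the `Arc` bumps an atomic reference count;
            // the vector itself is not copied.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.iter().sum::<u64>())
        })
        .collect();
    // Wait for every worker and combine the results.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Replacing `Arc` with `Rc` here would fail to compile, because `Rc` is not `Send` and cannot be moved into `thread::spawn`.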
2025-02-06coverage: Remove the old code for simplifying counters after MIR optsZalathar-19/+10
2025-02-06coverage: Defer part of counter-creation until codegenZalathar-23/+10
2025-02-06coverage: Store BCB node IDs in mappings, and resolve them in codegenZalathar-12/+17
Even though the coverage graph itself is no longer available during codegen, its nodes can still be used as opaque IDs.
2025-02-06Remove some unused glob re-exportsZalathar-4/+0
These were detected by temporarily making `mod llvm` non-public.