about summary refs log tree commit diff
path: root/compiler/rustc_codegen_ssa/src/traits
AgeCommit message (Collapse)AuthorLines
2025-03-11Auto merge of #137586 - nnethercote:SetImpliedBits, r=bjorn3bors-3/+6
Speed up target feature computation The LLVM backend calls `LLVMRustHasFeature` twice for every feature. In short-running rustc invocations, this accounts for a surprising amount of work. r? `@bjorn3`
2025-03-07Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwcoMatthias Krüger-11/+8
Clean up various LLVM FFI things in codegen_llvm cc ```@ZuseZ4``` I touched some autodiff parts The major change of this PR is [bfd88ce](https://github.com/rust-lang/rust/pull/137549/commits/bfd88cead0dd79717f123ad7e9a26ecad88653cb) which makes `CodegenCx` generic just like `GenericBuilder` The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks. best reviewed commit-by-commit
2025-03-05Change signature of `target_features_cfg`.Nicholas Nethercote-3/+6
Currently it is called twice, once with `allow_unstable` set to true and once with it set to false. This results in some duplicated work. Most notably, for the LLVM backend, `LLVMRustHasFeature` is called twice for every feature, and it's moderately slow. For very short running compilations on platforms with many features (e.g. a `check` build of hello-world on x86) this is a significant fraction of runtime. This commit changes `target_features_cfg` so it is only called once, and it now returns a pair of feature sets. This halves the number of `LLVMRustHasFeature` calls.
2025-03-02Revert "Auto merge of #135335 - oli-obk:push-zxwssomxxtnq, r=saethlin"Michael Goulet-1/+0
This reverts commit a7a6c64a657f68113301c2ffe0745b49a16442d1, reversing changes made to ebbe63891f1fae21734cb97f2f863b08b1d44bf8.
2025-03-01Auto merge of #133250 - DianQK:embed-bitcode-pgo, r=nikicbors-1/+1
The embedded bitcode should always be prepared for LTO/ThinLTO Fixes #115344. Fixes #117220. There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`. When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module. This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`. r? nikic
2025-02-24Remove an unused lifetime paramOli Scherer-2/+2
2025-02-24Generalize BaseTypeCodegenMethodsOli Scherer-6/+3
2025-02-24Remove an unnecessary lifetimeOli Scherer-3/+3
2025-02-24ssa/mono: deduplicate `type_has_metadata`David Wood-14/+1
The implementation of the `type_has_metadata` function is duplicated in `rustc_codegen_ssa` and `rustc_monomorphize`, so move this to `rustc_middle`.
2025-02-24Auto merge of #137271 - nikic:gep-nuw-2, r=scottmcmbors-0/+8
Emit getelementptr inbounds nuw for pointer::add() Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative. Fixes https://github.com/rust-lang/rust/issues/137217.
2025-02-23Save pre-link bitcode to `ModuleCodegen`DianQK-1/+1
2025-02-19Rework `OperandRef::extract_field` to stop calling `to_immediate_scalar` on ↵Scott McMurray-8/+1
things which are already immediates That means it stops trying to truncate things that are already `i1`s.
2025-02-19Emit `trunc nuw` for unchecked shifts and `to_immediate_scalar`Scott McMurray-0/+11
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information - Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
2025-02-19Emit getelementptr inbounds nuw for pointer::add()Nikita Popov-0/+8
2025-02-18Auto merge of #133852 - x17jiri:cold_path, r=saethlinbors-0/+14
improve cold_path() #120370 added a new instrinsic `cold_path()` and used it to fix `likely` and `unlikely` However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits usefulness of `cold_path` for idiomatic rust. For example, code like this: ``` if let Some(x) = y { ... } ``` may generate 3-target switch: ``` switch y.discriminator: 0 => true branch 1 = > false branch _ => unreachable ``` and therefore marking a branch as cold will have no effect. This PR improves `cold_path()` to work with arbitrary switch instructions. Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. The Clang's weight calculation is more complex that this PR, which I believe is mainly because `switch` in `C/C++` can have multiple cases going to the same target.
2025-02-17improve cold_path()Jiri Bobek-0/+14
2025-02-14Auto merge of #136575 - scottmcm:nsuw-math, r=nikicbors-6/+29
Set both `nuw` and `nsw` in slice size calculation There's an old note in the code to do this, and now that [LLVM-C has an API for it](https://github.com/llvm/llvm-project/blob/f0b8ff12519270adcfef93410abff76ab073476a/llvm/include/llvm-c/Core.h#L4403-L4408), we might as well. And it's been there since what looks like LLVM 17 https://github.com/llvm/llvm-project/commit/de9b6aa341d8951625d62ae3dac8670ebb3eb006 so doesn't even need to be conditional. (There's other places, like `RawVecInner` or `Layout`, that might want to do things like this too, but I'll leave those for a future PR.)
2025-02-13Set both `nuw` and `nsw` in slice size calculationScott McMurray-6/+29
There's an old note in the code to do this, and now that LLVM-C has an API for it, we might as well.
2025-02-12`transmute` should also assume non-null pointersScott McMurray-0/+13
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
2025-02-07compiler: remove reexports from rustc_target::callconvJubilee Young-2/+2
2025-02-07compiler: remove rustc_target::abi entirelyJubilee Young-1/+1
2025-01-31Override `disjoint_or` in the LLVM backendScott McMurray-0/+5
2025-01-30Use ExistentialTraitRef throughout codegenMichael Goulet-4/+4
2025-01-24Rollup merge of #135581 - EnzymeAD:refactor-codgencx, r=oli-obkMatthias Krüger-2/+0
Separate Builder methods from tcx As part of the autodiff upstreaming we noticed, that it would be nice to have various builder methods available without the TypeContext, which prevents the normal CodegenCx to be passed around between threads. We introduce a SimpleCx which just owns the llvm module and llvm context, to encapsulate them. The previous CodegenCx now implements deref and forwards access to the llvm module or context to it's SimpleCx sub-struct. This gives us a bit more flexibility, because now we can pass (or construct) the SimpleCx in locations where we don't have enough information to construct a CodegenCx, or are not able to pass it around due to the tcx lifetimes (and it not implementing send/sync). This also introduces an SBuilder, similar to the SimpleCx. The SBuilder uses a SimpleCx, whereas the existing Builder uses the larger CodegenCx. I will push updates to make implementations generic (where possible) to be implemented once and work for either of the two. I'll also clean up the leftover code. `call` is a bit tricky, because it requires a tcx, I probably need to duplicate it after all. Tracking: - https://github.com/rust-lang/rust/issues/124509
2025-01-24Make CodegenCx and Builder genericManuel Drehwald-2/+0
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
2025-01-22Auto merge of #135674 - scottmcm:assume-better, r=estebankbors-0/+21
Update our range `assume`s to the format that LLVM prefers I found out in https://github.com/llvm/llvm-project/issues/123278#issuecomment-2597440158 that the way I started emitting the `assume`s in #109993 was suboptimal, and as seen in that LLVM issue the way we're doing it -- with two `assume`s sometimes -- can at times lead to CVP/SCCP not realize what's happening because one of them turns into a `ne` instead of conveying a range. So this updates how it's emitted from ``` assume( x >= LOW ); assume( x <= HIGH ); ``` or ``` // (for ranges that wrap the range) assume( (x <= LOW) | (x >= HIGH) ); ``` to ``` assume( (x - LOW) <= (HIGH - LOW) ); ``` so that we don't need multiple `icmp`s nor multiple `assume`s for a single value, and both wrappping and non-wrapping ranges emit the same shape. (And we don't bother emitting the subtraction if `LOW` is zero, since that's trivial for us to check too.)
2025-01-21Treat undef bytes as equal to any other byteOli Scherer-0/+1
2025-01-17Update our range `assume`s to the format that LLVM prefersScott McMurray-0/+21
2025-01-01upstream rustc_codegen_llvm changes for enzyme/autodiffManuel Drehwald-0/+9
2024-12-25rename typed_swap → typed_swap_nonoverlappingRalf Jung-1/+1
2024-12-18Re-export more `rustc_span::symbol` things from `rustc_span`.Nicholas Nethercote-1/+1
`rustc_span::symbol` defines some things that are re-exported from `rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some closely related things such as `Ident` and `kw`. So you can do `use rustc_span::{Symbol, sym}` but you have to do `use rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good reason. This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`, and changes many `rustc_span::symbol::` qualifiers in `compiler/` to `rustc_span::`. This is a 200+ net line of code reduction, mostly because many files with two `use rustc_span` items can be reduced to one.
2024-12-13Auto merge of #133099 - RalfJung:forbidden-hardfloat-features, r=workingjubileebors-1/+3
forbid toggling x87 and fpregs on hard-float targets Part of https://github.com/rust-lang/rust/issues/116344, follow-up to https://github.com/rust-lang/rust/pull/129884: The `x87` target feature on x86 and the `fpregs` target feature on ARM must not be disabled on a hardfloat target, as that would change the float ABI. However, *enabling* `fpregs` on ARM is [explicitly requested](https://github.com/rust-lang/rust/issues/130988) as it seems to be useful. Therefore, we need to refine the distinction of "forbidden" target features and "allowed" target features: all (un)stable target features can determine on a per-target basis whether they should be allowed to be toggled or not. `fpregs` then checks whether the current target has the `soft-float` feature, and if yes, `fpregs` is permitted -- otherwise, it is not. (Same for `x87` on x86). Also fixes https://github.com/rust-lang/rust/issues/132351. Since `fpregs` and `x87` can be enabled on some builds and disabled on others, it would make sense that one can query it via `cfg`. Therefore, I made them behave in `cfg` like any other unstable target feature. The first commit prepares the infrastructure, but does not change behavior. The second commit then wires up `fpregs` and `x87` with that new infrastructure. r? `@workingjubilee`
2024-12-11Auto merge of #128004 - folkertdev:naked-fn-asm, r=Amanieubors-0/+7
codegen `#[naked]` functions using global asm tracking issue: https://github.com/rust-lang/rust/issues/90957 Fixes #124375 This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc). I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about. Combined with https://github.com/rust-lang/rust/pull/127853, if both accepted, I think that resolves all steps from the tracking issue. r? `@Amanieu`
2024-12-11generalize 'forbidden feature' concept so that even (un)stable feature can ↵Ralf Jung-1/+3
be invalid to toggle Also rename some things for extra clarity
2024-12-10codegen `#[naked]` functions using `global_asm!`Folkert-0/+7
2024-12-06Remove all threading through of ErrorGuaranteed from the driverbjorn3-8/+2
It was inconsistently done (sometimes even within a single function) and most of the rest of the compiler uses fatal errors instead, which need to be caught using catch_with_exit_code anyway. Using fatal errors instead of ErrorGuaranteed everywhere in the driver simplifies things a bit.
2024-11-27use intra-doc links for CodegenBackend::linkMonadic Cat-1/+1
2024-11-27update comment (codegen_backend -> codegen_crate)Monadic Cat-2/+2
use intra-doc links so there'll be a doc gen fail next time this becomes wrong
2024-11-19move `fn is_item_raw` to `TypingEnv`lcnr-3/+3
2024-11-18use `TypingEnv` when no `infcx` is availablelcnr-7/+9
the behavior of the type system not only depends on the current assumptions, but also the currentnphase of the compiler. This is mostly necessary as we need to decide whether and how to reveal opaque types. We track this via the `TypingMode`.
2024-11-17Likely unlikely fixJiri Bobek-0/+20
2024-11-12Rollup merge of #132702 - 1c3t3a:issue-132615, r=rcvalleMatthias Krüger-0/+1
CFI: Append debug location to CFI blocks Currently we're not appending debug locations to the inserted CFI blocks. This shows up in #132615 and #100783. This change fixes that by passing down the debug location to the CFI type-test generation and appending it to the blocks. Credits also belong to `@jakos-sec` who worked with me on this.
2024-11-11CFI: Append debug location to CFI blocksBastian Kersting-0/+1
2024-11-09Pass owned CodegenResults to link_binarybjorn3-1/+1
After link_binary the temporary files referenced by CodegenResults are deleted, so calling link_binary again with the same CodegenResults should not be allowed.
2024-11-09Add a default implementation for CodegenBackend::linkbjorn3-1/+5
As a side effect this should add raw-dylib support to cg_gcc as the default ArchiveBuilderBuilder that is used implements create_dll_import_lib. I haven't tested if the raw-dylib support actually works however.
2024-11-03compiler: Directly use rustc_abi in codegenJubilee Young-8/+8
2024-10-29compiler: `rustc_abi::Abi` => `BackendRepr`Jubilee Young-2/+2
The initial naming of "Abi" was an awful mistake, conveying wrong ideas about how psABIs worked and even more about what the enum meant. It was only meant to represent the way the value would be described to a codegen backend as it was lowered to that intermediate representation. It was never meant to mean anything about the actual psABI handling! The conflation is because LLVM typically will associate a certain form with a certain ABI, but even that does not hold when the special cases that actually exist arise, plus the IR annotations that modify the ABI. Reframe `rustc_abi::Abi` as the `BackendRepr` of the type, and rename `BackendRepr::Aggregate` as `BackendRepr::Memory`. Unfortunately, due to the persistent misunderstandings, this too is now incorrect: - Scattered ABI-relevant code is entangled with BackendRepr - We do not always pre-compute a correct BackendRepr that reflects how we "actually" want this value to be handled, so we leave the backend interface to also inject various special-cases here - In some cases `BackendRepr::Memory` is a "real" aggregate, but in others it is in fact using memory, and in some cases it is a scalar! Our rustc-to-backend lowering code handles this sort of thing right now. That will eventually be addressed by lifting duplicated lowering code to either rustc_codegen_ssa or rustc_target as appropriate.
2024-10-25coverage: SSA doesn't need to know about `instrprof_increment`Zalathar-8/+0
2024-09-24Auto merge of #130389 - Luv-Ray:LLVMMDNodeInContext2, r=nikicbors-3/+5
llvm: replace some deprecated functions `LLVMMDStringInContext` and `LLVMMDNodeInContext` are deprecated, replace them with `LLVMMDStringInContext2` and `LLVMMDNodeInContext2`. Also replace `Value` with `Metadata` in some function signatures for better consistency.
2024-09-22Reformat using the new identifier sorting from rustfmtMichael Goulet-7/+7