about summary refs log tree commit diff
path: root/compiler/rustc_codegen_llvm/src
AgeCommit message (Collapse)AuthorLines
2023-08-08feat: `riscv-interrupt-{m,s}` calling conventionsSeth Pellegrino-2/+7
Similar to prior support added for the mips430, avr, and x86 targets this change implements the rough equivalent of clang's [`__attribute__((interrupt))`][clang-attr] for riscv targets, enabling e.g. ```rust static mut CNT: usize = 0; pub extern "riscv-interrupt-m" fn isr_m() { unsafe { CNT += 1; } } ``` to produce highly effective assembly like: ```asm pub extern "riscv-interrupt-m" fn isr_m() { 420003a0: 1141 addi sp,sp,-16 unsafe { CNT += 1; 420003a2: c62a sw a0,12(sp) 420003a4: c42e sw a1,8(sp) 420003a6: 3fc80537 lui a0,0x3fc80 420003aa: 63c52583 lw a1,1596(a0) # 3fc8063c <_ZN12esp_riscv_rt3CNT17hcec3e3a214887d53E.0> 420003ae: 0585 addi a1,a1,1 420003b0: 62b52e23 sw a1,1596(a0) } } 420003b4: 4532 lw a0,12(sp) 420003b6: 45a2 lw a1,8(sp) 420003b8: 0141 addi sp,sp,16 420003ba: 30200073 mret ``` (disassembly via `riscv64-unknown-elf-objdump -C -S --disassemble ./esp32c3-hal/target/riscv32imc-unknown-none-elf/release/examples/gpio_interrupt`) This outcome is superior to hand-coded interrupt routines which, lacking visibility into any non-assembly body of the interrupt handler, have to be very conservative and save the [entire CPU state to the stack frame][full-frame-save]. By instead asking LLVM to only save the registers that it uses, we defer the decision to the tool with the best context: it can more accurately account for the cost of spills if it knows that every additional register used is already at the cost of an implicit spill. At the LLVM level, this is apparently [implemented by] marking every register as "[callee-save]," matching the semantics of an interrupt handler nicely (it has to leave the CPU state just as it found it after its `{m|s}ret`). This approach is not suitable for every interrupt handler, as it makes no attempt to e.g. save the state in a user-accessible stack frame. For a full discussion of those challenges and tradeoffs, please refer to [the interrupt calling conventions RFC][rfc]. Inside rustc, this implementation differs from prior art because LLVM does not expose the "all-saved" function flavor as a calling convention directly, instead preferring to use an attribute that allows for differentiating between "machine-mode" and "superivsor-mode" interrupts. Finally, some effort has been made to guide those who may not yet be aware of the differences between machine-mode and supervisor-mode interrupts as to why no `riscv-interrupt` calling convention is exposed through rustc, and similarly for why `riscv-interrupt-u` makes no appearance (as it would complicate future LLVM upgrades). [clang-attr]: https://clang.llvm.org/docs/AttributeReference.html#interrupt-risc-v [full-frame-save]: https://github.com/esp-rs/esp-riscv-rt/blob/9281af2ecffe13e40992917316f36920c26acaf3/src/lib.rs#L440-L469 [implemented by]: https://github.com/llvm/llvm-project/blob/b7fb2a3fec7c187d58a6d338ab512d9173bca987/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp#L61-L67 [callee-save]: https://github.com/llvm/llvm-project/blob/973f1fe7a8591c7af148e573491ab68cc15b6ecf/llvm/lib/Target/RISCV/RISCVCallingConv.td#L30-L37 [rfc]: https://github.com/rust-lang/rfcs/pull/3246
2023-08-08Auto merge of #114439 - Kobzol:remark-pgo-hotness, r=tmiaskobors-0/+3
Add hotness data to LLVM remarks Slight improvement of https://github.com/rust-lang/rust/pull/113040. This makes sure that if PGO is used, remarks generated using `-Zremark-dir` will include the `Hotness` attribute. r? `@tmiasko`
2023-08-08Only enable hotness information when PGO is availableJakub Beránek-0/+3
2023-08-07Update powerpc data layoutsNikita Popov-0/+11
Function pointer alignment is specified since https://reviews.llvm.org/D147016.
2023-08-07Rollup merge of #114382 - scottmcm:compare-bytes-intrinsic, r=cjgillotMatthias Krüger-1/+12
Add a new `compare_bytes` intrinsic instead of calling `memcmp` directly As discussed in #113435, this lets the backends be the place that can have the "don't call the function if n == 0" logic, if it's needed for the target. (I didn't actually *add* those checks, though, since as I understood it we didn't actually need them on known targets?) Doing this also let me make it `const` (unstable), which I don't think `extern "C" fn memcmp` can be. cc `@RalfJung` `@Amanieu`
2023-08-06Apply suggestions from code reviewscottmcm-0/+1
Co-authored-by: Ralf Jung <post@ralfj.de>
2023-08-06Add a new `compare_bytes` intrinsic instead of calling `memcmp` directlyScott McMurray-1/+11
2023-08-06Generate better function argument names in global_allocator expansionDavid Tolnay-2/+2
2023-08-04Rollup merge of #114450 - chenyukang:yukang-fix-114435, r=compiler-errorsMatthias Krüger-1/+1
Fix ICE failed to get layout for ReferencesError Fixes #114435 r? `@compiler-errors`
2023-08-05Fix ICE failed to get layout for ReferencesErroryukang-1/+1
2023-08-04Auto merge of #114350 - erikdesjardins:ident, r=tmiaskobors-3/+0
cg_llvm: stop identifying ADTs in LLVM IR This is an extension of https://github.com/rust-lang/rust/pull/94107. It may be a minor perf win. Fixes #96242. Now that we use opaque pointers, ADTs can no longer be recursive, so we do not need to name them. Previously, this would be necessary if you had a struct like ```rs struct Foo(Box<Foo>, u64, u64); ``` which would be represented with something like ```ll %Foo = type { %Foo*, i64, i64 } ``` which is now just ```ll { ptr, i64, i64 } ``` r? `@tmiasko`
2023-08-03Forbid old-style `simd_shuffleN` intrinsicsOli Scherer-21/+13
2023-08-02Rollup merge of #114079 - compiler-errors:closure-upvars, r=oli-obkNilstrieb-11/+4
Use `upvar_tys` in more places, make it return a list Just a cleanup that fell out of a PR that I was gonna write, but that PR kinda got stuck.
2023-08-02coverage: Consolidate FFI types into one moduleZalathar-208/+201
Coverage FFI types were historically split across two modules, because some of them were needed by code in `rustc_codegen_ssa`. Now that all of the coverage codegen code has been moved into `rustc_codegen_llvm` (#113355), it's possible to move all of the FFI types into a single module, making it easier to see all of them at once.
2023-08-01Use upvar_tys in more places, make it a listMichael Goulet-11/+4
2023-08-01Auto merge of #113339 - lqd:respect-filters, r=tmiaskobors-24/+17
Filter out short-lived LLVM diagnostics before they reach the rustc handler During profiling I saw remark passes being unconditionally enabled: for example `Machine Optimization Remark Emitter`. The diagnostic remarks enabled by default are [from missed optimizations and opt analyses](https://github.com/rust-lang/rust/pull/113339#discussion_r1259480303). They are created by LLVM, passed to the diagnostic handler on the C++ side, emitted to rust, where they are unpacked, C++ strings are converted to rust, etc. Then they are discarded in the vast majority of the time (i.e. unless some kind of `-Cremark` has enabled some of these passes' output to be printed). These unneeded allocations are very short-lived, basically only lasting between the LLVM pass emitting them and the rust handler where they are discarded. So it doesn't hugely impact max-rss, and is only a slight reduction in instruction count (cachegrind reports a reduction between 0.3% and 0.5%) _on linux_. It's possible that targets without `jemalloc` or with a worse allocator, may optimize these less. It is however significant in the aggregate, looking at the total number of allocated bytes: - it's the biggest source of allocations according to dhat, on the benchmarks I've tried e.g. `syn` or `cargo` - allocations on `syn` are reduced by 440MB, 17% (from 2440722647 bytes total, to 2030461328 bytes) - allocations on `cargo` are reduced by 6.6GB, 19% (from 35371886402 bytes total, to 28723987743 bytes) Some of these diagnostics objects [are allocated in LLVM](https://github.com/rust-lang/rust/pull/113339#discussion_r1252387484) *before* they're emitted to our diagnostic handler, where they'll be filtered out. So we could remove those in the future, but that will require changing a few LLVM call-sites upstream, so I left a FIXME.
2023-08-01remove remark filtering on the rust sideRémy Rakic-23/+16
now that remarks are filtered before cg_llvm's diagnostic handler callback is called, we don't need to do the filtering post c++-to-rust conversion of the diagnostic.
2023-08-01Auto merge of #105545 - erikdesjardins:ptrclean, r=bjorn3bors-377/+152
cleanup: remove pointee types This can't be merged until the oldest LLVM version we support uses opaque pointers, which will be the case after #114148. (Also note `-Cllvm-args="-opaque-pointers=0"` can technically be used in LLVM 15, though I don't think we should support that configuration.) I initially hoped this would provide some minor perf win, but in https://github.com/rust-lang/rust/pull/105412#issuecomment-1341224450 it had very little impact, so this is only valuable as a cleanup. As a followup, this will enable #96242 to be resolved. r? `@ghost` `@rustbot` label S-blocked
2023-08-01Make coverage counter IDs count up from 0, not 1Zalathar-27/+12
Operand types are now tracked explicitly, so there is no need to reserve ID 0 for the special always-zero counter. As part of the renumbering, this change fixes an off-by-one error in the way counters were counted by the `coverageinfo` query. As a result, functions should now have exactly the number of counters they actually need, instead of always having an extra counter that is never used.
2023-08-01Make coverage expression IDs count up from 0, not down from `u32::MAX`Zalathar-37/+16
Operand types are now tracked explicitly, so there is no need for expression IDs to avoid counter IDs by descending from `u32::MAX`. Instead they can just count up from 0, and can be used directly as indices when necessary.
2023-08-01Replace `ExpressionOperandId` with enum `Operand`Zalathar-27/+22
Because the three kinds of operand are now distinguished explicitly, we no longer need fiddly code to disambiguate counter IDs and expression IDs based on the total number of counters/expressions in a function. This does increase the size of operands from 4 bytes to 8 bytes, but that shouldn't be a big deal since they are mostly stored inside boxed structures, and the current coverage code is not particularly size-optimized anyway.
2023-07-31Auto merge of #113879 - nnethercote:codegen_ssa-cleanups, r=bjorn3bors-19/+7
`codegen_ssa` cleanups Some clarifications I made when reading this code closely. r? `@tmiasko`
2023-07-31Use standard Rust capitalization rules for names containing "LTO".Nicholas Nethercote-7/+7
2023-07-31Remove `ExtraBackendMethods::spawn_thread`.Nicholas Nethercote-12/+0
It's no longer used, and `spawn_named_thread` is preferable, because naming threads is helpful when profiling.
2023-07-31Auto merge of #114266 - calebzulawski:simd-bswap, r=compiler-errorsbors-11/+10
Fix simd_bswap for i8/u8 #114156 missed this test case ☹️ cc `@workingjubilee`
2023-07-30Fix simd_bswap for i8/u8Caleb Zulawski-11/+10
2023-07-30inline format!() args up to and including rustc_codegen_llvmMatthias Krüger-52/+43
2023-07-29cg_llvm: simplify llvm.masked.gather/scatter naming with opaque pointersErik Desjardins-105/+47
With opaque pointers, there's no longer a need to generate a chain of pointer types in the intrinsic name when arguments are pointers to pointers.
2023-07-29cg_llvm: clean up matchErik Desjardins-1/+1
2023-07-29cg_llvm: inline check_storeErik Desjardins-10/+2
2023-07-29cg_llvm: stop identifying ADTs in LLVM IRErik Desjardins-3/+0
Now that we use opaque pointers, ADTs can no longer be recursive, so we do not need to name them. Previously, this would be necessary if you had a struct like ```rs struct Foo(Box<Foo>, u64, u64); ``` which would be represented with something like ```ll %Foo = type { %Foo*, i64, i64 } ``` which is now just ```ll { ptr, i64, i64 } ```
2023-07-29Auto merge of #114156 - calebzulawski:simd-bswap, r=compiler-errorsbors-0/+47
Add simd_bswap, simd_bitreverse, simd_ctlz, and simd_cttz intrinsics cc `@workingjubilee`
2023-07-29cg_ssa: remove pointee types and pointercast/bitcast-of-ptrErik Desjardins-21/+5
2023-07-29cg_llvm: remove pointee types and pointercast/bitcast-of-ptrErik Desjardins-260/+117
2023-07-28Use i1 instead of boolCaleb Zulawski-2/+9
2023-07-27Add SIMD bitreverse, ctlz, cttz intrinsicsCaleb Zulawski-7/+25
2023-07-27Add simd_bswap intrinsicCaleb Zulawski-0/+22
2023-07-27Update the minimum external LLVM to 15Josh Stone-36/+28
2023-07-24coverage: Obtain the `__llvm_covfun` section name outside a per-function loopZalathar-8/+31
This section name is always constant for a given target, but obtaining it from LLVM requires a few intermediate allocations. There's no need to do so repeatedly from inside a per-function loop.
2023-07-21Revert "Auto merge of #113166 - moulins:ref-niches-initial, r=oli-obk"David Tolnay-2/+2
This reverts commit 557359f92512ca88b62a602ebda291f17a953002, reversing changes made to 1e6c09a803fd543a98bfbe1624d697a55300a786.
2023-07-21Support `.comment` section like GCC/Clang (`!llvm.ident`)Miguel Ojeda-15/+18
Both GCC and Clang write by default a `.comment` section with compiler information: ```txt $ gcc -c -xc /dev/null && readelf -p '.comment' null.o String dump of section '.comment': [ 1] GCC: (GNU) 11.2.0 $ clang -c -xc /dev/null && readelf -p '.comment' null.o String dump of section '.comment': [ 1] clang version 14.0.1 (https://github.com/llvm/llvm-project.git c62053979489ccb002efe411c3af059addcb5d7d) ``` They also implement the `-Qn` flag to avoid doing so: ```txt $ gcc -Qn -c -xc /dev/null && readelf -p '.comment' null.o readelf: Warning: Section '.comment' was not dumped because it does not exist! $ clang -Qn -c -xc /dev/null && readelf -p '.comment' null.o readelf: Warning: Section '.comment' was not dumped because it does not exist! ``` So far, `rustc` only does it for WebAssembly targets and only when debug info is enabled: ```txt $ echo 'fn main(){}' | rustc --target=wasm32-unknown-unknown --emit=llvm-ir -Cdebuginfo=2 - && grep llvm.ident rust_out.ll !llvm.ident = !{!27} ``` In the RFC part of this PR it was decided to always add the information, which gets us closer to other popular compilers. An opt-out flag like GCC and Clang may be added later on if deemed necessary. Implementation-wise, this covers both `ModuleLlvm::new()` and `ModuleLlvm::new_metadata()` cases by moving the addition to `context::create_module` and adds a few test cases. ThinLTO also sees the `llvm.ident` named metadata duplicated (in temporary outputs), so this deduplicates it like it is done for `wasm.custom_sections`. The tests also check this duplication does not take place. Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2023-07-21Auto merge of #113166 - moulins:ref-niches-initial, r=oli-obkbors-2/+2
Prototype: Add unstable `-Z reference-niches` option MCP: rust-lang/compiler-team#641 Relevant RFC: rust-lang/rfcs#3204 This prototype adds a new `-Z reference-niches` option, controlling the range of valid bit-patterns for reference types (`&T` and `&mut T`), thereby enabling new enum niching opportunities. Like `-Z randomize-layout`, this setting is crate-local; as such, references to built-in types (primitives, tuples, ...) are not affected. The possible settings are (here, `MAX` denotes the all-1 bit-pattern): | `-Z reference-niches=` | Valid range | |:---:|:---:| | `null` (the default) | `1..=MAX` | | `size` | `1..=(MAX- size)` | | `align` | `align..=MAX.align_down_to(align)` | | `size,align` | `align..=(MAX-size).align_down_to(align)` | ------ This is very WIP, and I'm not sure the approach I've taken here is the best one, but stage 1 tests pass locally; I believe this is in a good enough state to unleash this upon unsuspecting 3rd-party code, and see what breaks.
2023-07-21Rollup merge of #113780 - dtolnay:printkindpath, r=b-naberMatthias Krüger-36/+55
Support `--print KIND=PATH` command line syntax As is already done for `--emit KIND=PATH` and `-L KIND=PATH`. In the discussion of #110785, it was pointed out that `--print KIND=PATH` is nicer than trying to apply the single global `-o` path to `--print`'s output, because in general there can be multiple print requests within a single rustc invocation, and anyway `-o` would already be used for a different meaning in the case of `link-args` and `native-static-libs`. I am interested in using `--print cfg=PATH` in Buck2. Currently Buck2 works around the lack of support for `--print KIND=PATH` by [indirecting through a Python wrapper script](https://github.com/facebook/buck2/blob/d43cf3a51a31f00be2c2248e78271b0fef0452b4/prelude/rust/tools/get_rustc_cfg.py) to redirect rustc's stdout into the location dictated by the build system. From skimming Cargo's usages of `--print`, it definitely seems like it would benefit from `--print KIND=PATH` too. Currently it is working around the lack of this by inserting `--crate-name=___ --print=crate-name` so that it can look for a line containing `___` as a delimiter between the 2 other `--print` informations it actually cares about. This is commented as a "HACK" and "abuse". https://github.com/rust-lang/cargo/blob/31eda6f7c360d9911f853b3014e057db61238f3e/src/cargo/core/compiler/build_context/target_info.rs#L242 (FYI `@weihanglo` as you dealt with this recently in https://github.com/rust-lang/cargo/pull/11633.) Mentioning reviewers active in #110785: `@fee1-dead` `@jyn514` `@bjorn3`
2023-07-21Rollup merge of #113723 - khei4:khei4/llvm-stats, r=oli-obk,nikicMatthias Krüger-2/+31
Resurrect: rustc_llvm: Add a -Z `print-codegen-stats` option to expose LLVM statistics. This resurrects PR https://github.com/rust-lang/rust/pull/104000, which has sat idle for a while. And I want to see the effect of stack-move optimizations on LLVM (like https://reviews.llvm.org/D153453) :). I have applied the changes requested by `@oli-obk` and `@nagisa` https://github.com/rust-lang/rust/pull/104000#discussion_r1014625377 and https://github.com/rust-lang/rust/pull/104000#discussion_r1014642482 in the latest commits. r? `@oli-obk` ----- LLVM has a neat [statistics](https://llvm.org/docs/ProgrammersManual.html#the-statistic-class-stats-option) feature that tracks how often optimizations kick in. It's very handy for optimization work. Since we expose the LLVM pass timings, I thought it made sense to expose the LLVM statistics too. ----- (Edit: fix broken link (Edit2: fix segmentation fault and use malloc If `rustc` is built with ```toml [llvm] assertions = true ``` Then you can see like ``` rustc +stage1 -Z print-codegen-stats -C opt-level=3 tmp.rs ===-------------------------------------------------------------------------=== ... Statistics Collected ... ===-------------------------------------------------------------------------=== 3 aa - Number of MayAlias results 193 aa - Number of MustAlias results 531 aa - Number of NoAlias results ... ``` And the current default build emits only ``` $ rustc +stage1 -Z print-codegen-stats -C opt-level=3 tmp.rs ===-------------------------------------------------------------------------=== ... Statistics Collected ... ===-------------------------------------------------------------------------=== $ ``` This might be better to emit the message to tell assertion flag necessity, but now I can't find how to do that...
2023-07-21Don't treat ref. fields with non-null niches as `dereferenceable_or_null`Moulins-2/+2
2023-07-20Implement printing to file in PassWrapperDavid Tolnay-4/+21
2023-07-20Implement printing to file in llvm_utilDavid Tolnay-13/+14
2023-07-20Implement printing to file in codegen_backend.printDavid Tolnay-11/+12
2023-07-20Store individual output file name with every PrintRequestDavid Tolnay-13/+13
2023-07-20address feedback from nikic and oli-obk ↵khei4-18/+17
https://github.com/rust-lang/rust/pull/113723/files use slice memcpy rather than strcpy and write it on stdout use println on failure Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>