about summary refs log tree commit diff
path: root/src/test/codegen
AgeCommit message (Collapse)AuthorLines
2020-10-10Revert "Assume slice len is bounded by allocation size"Simonas Kazlauskas-24/+0
https://github.com/rust-lang/rust/pull/77023#issuecomment-703987379 suggests that the original PR introduced a significant perf regression. This reverts commit e44784b8750016a695361c990024750e037d8f9f / #77023.
2020-10-07Add codegen testAnthonyMikh-0/+35
2020-10-05Auto merge of #77466 - Aaron1011:reland-drop-tree, r=matthewjasperbors-3/+3
Re-land PR #71840 (Rework MIR drop tree lowering) PR https://github.com/rust-lang/rust/pull/71840 was reverted in https://github.com/rust-lang/rust/pull/72989 to fix an LLVM error (https://github.com/rust-lang/rust/issues/72470). That LLVM error no longer occurs with the recent upgrade to LLVM 11 (https://github.com/rust-lang/rust/pull/73526), so let's try re-landing this PR. I've cherry-picked the commits from the original PR (with the exception of the commit blessing test output), making as few modifications as possible. I addressed the rebase fallout in separate commits on top of those. r? `@matthewjasper`
2020-10-05Pass tune-cpu to LLVMMingye Wang-0/+21
I think this is how it should work...
2020-10-04Auto merge of #77023 - HeroicKatora:len-missed-optimization, r=Mark-Simulacrumbors-0/+24
Hint the maximum length permitted by invariant of slices One of the safety invariants of references, and in particular of references to slices, is that they may not cover more than `isize::MAX` bytes. The unsafe `from_raw_parts` constructors of slices explicitly requires the caller to guarantee this fact. Violating it would also be UB with regards to the semantics of generated llvm code. This effectively bounds the length of a (non-ZST) slice from above by a compile time constant. But when the length is loaded from a function argument it appears llvm is not aware of this requirement. The additional value range assertions allow some further elision of code branches, including overflow checks, especially in the presence of artithmetic on the indices. This may have a performance impact, adding more code to a common method but allowing more optimization. I'm not quite sure, is the Rust side of const-prop strong enough to elide the irrelevant match branches? Fixes: #67186
2020-10-04Assume slice len is bounded by allocation sizeAndreas Molzer-0/+24
Uses assume to check the length against a constant upper bound. The inlined result then informs the optimizer of the sound value range. This was tried with unreachable_unchecked before which introduces a branch. This has the advantage of not being executed in sound code but complicates basic blocks. It resulted in ~2% increased compile time in some worst cases. Add a codegen test for the assumption, testing the issue from #67186
2020-10-04Defer creating drop trees in MIR lowering until leaving that scopeMatthew Jasper-3/+3
2020-10-02Auto merge of #77396 - wesleywiser:disable-simplifyarmidentity, r=oli-obkbors-1/+1
Disable the SimplifyArmIdentity mir-opt The optimization still has some bugs that need to be worked out such as #77359. We can try re-enabling this again after the known issues are resolved. r? `@oli-obk`
2020-10-01Disable the SimplifyArmIdentity mir-optWesley Wiser-1/+1
The optimization still has some bugs that need to be worked out such as #77359. We can try re-enabling this again after the known issues are resolved.
2020-10-01Auto merge of #76971 - bugadani:issue-75659, r=Amanieubors-0/+63
Refactor memchr to allow optimization Closes #75659 The implementation already uses naive search if the slice if short enough, but the case is complicated enough to not be optimized away. This PR refactors memchr so that it exists early when the slice is short enough. Codegen-wise, as shown in #75659, memchr was not inlined previously so the only way I could find to test this is to check if there is no memchr call. Let me know if there is a more robust solution here.
2020-10-01Only test on x86_64Dániel Buga-0/+1
2020-09-27Refactor memchr to allow optimizationDániel Buga-0/+62
2020-09-26Add a test for 128-bit return valuesJonas Schievink-0/+32
2020-09-25Auto merge of #73453 - erikdesjardins:tuplayout, r=eddybbors-0/+79
Ignore ZST offsets when deciding whether to use Scalar/ScalarPair layout This is important because Scalar/ScalarPair layout previously would not be used if any ZST had nonzero offset. For example, before this change, only `((), u128)` would be laid out like `u128`, not `(u128, ())`. Fixes #63244
2020-09-21Rollup merge of #76961 - bugadani:test-34634, r=Mark-SimulacrumRalf Jung-0/+16
Add test for issue #34634 Closes #34634
2020-09-21Auto merge of #76295 - mati865:remove-mmx, r=Amanieu,oli-obkbors-27/+0
Remove MMX from Rust Follow-up to https://github.com/rust-lang/stdarch/pull/890 This removes most of MMX from Rust (tests pass with small changes), keeping stable `is_x86_feature_detected!("mmx")` working.
2020-09-20Remove MMX from RustMateusz Mikuła-27/+0
2020-09-20Add test for issue #34634Dániel Buga-0/+16
2020-09-20Make codegen test bitwidth-independentOliver Scherer-2/+2
2020-09-19Update codegen testsOliver Scherer-3/+3
2020-09-14Test that bounds checks are elided for indexing after .[r]position()Erik Desjardins-0/+78
2020-08-31Add codegen testsDániel Buga-0/+56
2020-08-30handle vector layoutErik Desjardins-0/+14
2020-08-30add codegen testErik Desjardins-0/+29
2020-08-30add tests related to tuple scalar layout with ZST in last fieldErik Desjardins-0/+36
2020-08-30Rollup merge of #75910 - bugadani:testcase, r=oli-obkDylan DPC-0/+22
Add test for issue #27130 #27130 seems to be fixed by the llvm 11 update. The issue is marked with needs-test, so here it is. As some historical context, the generated code was fine until 1.38, and remained unoptimized from 1.38 up until the current nightly. I've also added a pattern matching version that was fine on 1.45.2.
2020-08-28Make sure the functions don't get mergedDániel Buga-1/+1
2020-08-25Add test for issue #27130Dániel Buga-0/+22
2020-08-24[AVR] Rename the last few remaining references from 'avr-unknown-unknown' to ↵Dylan McKay-1/+1
'avr-unknown-gnu-atmega328'
2020-08-22Match scalar-pair-bool more flexibly for LLVM 11Josh Stone-2/+3
LLVM 11 started using `phi` and `select` for `fn pair_i32_bool`, which is still valid, but harder to match than the simple instructions we were getting before. We'll just check that the unpacked args are directly referenced in any way, and call it good.
2020-08-18Don't panic in Vec::shrink_to_fitJosh Stone-0/+36
2020-08-11Revert "Suppress debuginfo on naked function arguments"Nathaniel McCallum-2/+2
This reverts commit 25670749b44a9c7a4cfd3fbf780bbe3344a9a6c5. This commit does not actually fix the problem. It merely removes the name of the argument from the LLVM output. Even without the name, Rust codegen still spills the (nameless) variable onto the stack which is the root cause. The root cause is solved in the next commit.
2020-08-11move Deaggregate pass to post_borrowck_cleanupRalf Jung-2/+2
2020-08-08Auto merge of #74533 - nikic:issue-74425, r=eddybbors-0/+25
Emit == null instead of <= null for niche check When the niche maximum is zero, emit a "== zero" check instead of a "<= zero" check. In particular, this avoids the awkward case of "<= null". While LLVM does canonicalize this to "== null", this apparently doesn't happen for constant expressions, leading to the issue in #74425. While that can be addressed on the LLVM side, it still seems prudent to emit sensible IR here, because this will allow null checks to be optimized earlier in the pipeline. Fixes #74425.
2020-08-08Emit == null instead of <= nullNikita Popov-0/+25
When the niche maximum is zero, emit a "== zero" check instead of a "<= zero" check. In particular, this avoid the awkward case of "<= null". While LLVM does canonicalize this to "!= null", this appently doesn't happen for constant expressions, leading to the issue in #74425. While that can be addressed on the LLVM side, it still seems prudent to emit sensible IR here, because this will allow null checks to be optimized earlier in the pipeline. Fixes #74425.
2020-08-03Auto merge of #74695 - alexcrichton:more-wasm-float-cast-fixes, r=nagisabors-24/+24
rustc: Improving safe wasm float->int casts This commit improves code generation for WebAssembly targets when translating floating to integer casts. This improvement is only relevant when the `nontrapping-fptoint` feature is not enabled, but the feature is not enabled by default right now. Additionally this improvement only affects safe casts since unchecked casts were improved in #74659. Some more background for this issue is present on #73591, but the general gist of the issue is that in LLVM the `fptosi` and `fptoui` instructions are defined to return an `undef` value if they execute on out-of-bounds values; they notably do not trap. To implement these instructions for WebAssembly the LLVM backend must therefore generate quite a few instructions before executing `i32.trunc_f32_s` (for example) because this WebAssembly instruction traps on out-of-bounds values. This codegen into wasm instructions happens very late in the code generator, so what ends up happening is that rustc inserts its own codegen to implement Rust's saturating semantics, and then LLVM also inserts its own codegen to make sure that the `fptosi` instruction doesn't trap. Overall this means that a function like this: #[no_mangle] pub unsafe extern "C" fn cast(x: f64) -> u32 { x as u32 } will generate this WebAssembly today: (func $cast (type 0) (param f64) (result i32) (local i32 i32) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.gt local.set 1 block ;; label = @1 block ;; label = @2 local.get 0 f64.const 0x0p+0 (;=0;) local.get 0 f64.const 0x0p+0 (;=0;) f64.gt select local.tee 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@2;) local.get 0 i32.trunc_f64_u local.set 2 br 1 (;@1;) end i32.const 0 local.set 2 end i32.const -1 local.get 2 local.get 1 select) This PR improves the situation by updating the code generation for float-to-int conversions in rustc, specifically only for WebAssembly targets and only for some situations (float-to-u8 still has not great codegen). The fix here is to use basic blocks and control flow to avoid speculatively executing `fptosi`, and instead LLVM's raw intrinsic for the WebAssembly instruction is used instead. This effectively extends the support added in #74659 to checked casts. After this commit the codegen for the above Rust function looks like: (func $cast (type 0) (param f64) (result i32) (local i32) block ;; label = @1 local.get 0 f64.const 0x0p+0 (;=0;) f64.ge local.tee 1 i32.const 1 i32.xor br_if 0 (;@1;) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.le i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const -1 i32.const 0 local.get 1 select) For reference, in Rust 1.44, which did not have saturating float-to-integer casts, the codegen LLVM would emit is: (func $cast (type 0) (param f64) (result i32) block ;; label = @1 local.get 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const 0) So we're relatively close to the original codegen, although it's slightly different because the semantics of the function changed where we're emulating the `i32.trunc_sat_f32_s` instruction rather than always replacing out-of-bounds values with zero. There is still work that could be done to improve casts such as `f32` to `u8`. That form of cast still uses the `fptosi` instruction which generates lots of branch-y code. This seems less important to tackle now though. In the meantime this should take care of most use cases of floating-point conversion and as a result I'm going to speculate that this... Closes #73591
2020-08-02compiletest: Support ignoring tests requiring missing LLVM componentsVadim Petrochenkov-2/+7
2020-07-30Auto merge of #74105 - npmccallum:naked, r=matthewjasperbors-2/+2
Suppress debuginfo on naked function arguments A function that has no prologue cannot be reasonably expected to support debuginfo. In fact, the existing code (before this patch) would generate invalid instructions that caused crashes. We can solve this easily by just not emitting the debuginfo in this case. Fixes https://github.com/rust-lang/rust/issues/42779 cc https://github.com/rust-lang/rust/issues/32408
2020-07-28rustc: Improving safe wasm float->int castsAlex Crichton-24/+24
This commit improves code generation for WebAssembly targets when translating floating to integer casts. This improvement is only relevant when the `nontrapping-fptoint` feature is not enabled, but the feature is not enabled by default right now. Additionally this improvement only affects safe casts since unchecked casts were improved in #74659. Some more background for this issue is present on #73591, but the general gist of the issue is that in LLVM the `fptosi` and `fptoui` instructions are defined to return an `undef` value if they execute on out-of-bounds values; they notably do not trap. To implement these instructions for WebAssembly the LLVM backend must therefore generate quite a few instructions before executing `i32.trunc_f32_s` (for example) because this WebAssembly instruction traps on out-of-bounds values. This codegen into wasm instructions happens very late in the code generator, so what ends up happening is that rustc inserts its own codegen to implement Rust's saturating semantics, and then LLVM also inserts its own codegen to make sure that the `fptosi` instruction doesn't trap. Overall this means that a function like this: #[no_mangle] pub unsafe extern "C" fn cast(x: f64) -> u32 { x as u32 } will generate this WebAssembly today: (func $cast (type 0) (param f64) (result i32) (local i32 i32) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.gt local.set 1 block ;; label = @1 block ;; label = @2 local.get 0 f64.const 0x0p+0 (;=0;) local.get 0 f64.const 0x0p+0 (;=0;) f64.gt select local.tee 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@2;) local.get 0 i32.trunc_f64_u local.set 2 br 1 (;@1;) end i32.const 0 local.set 2 end i32.const -1 local.get 2 local.get 1 select) This PR improves the situation by updating the code generation for float-to-int conversions in rustc, specifically only for WebAssembly targets and only for some situations (float-to-u8 still has not great codegen). The fix here is to use basic blocks and control flow to avoid speculatively executing `fptosi`, and instead LLVM's raw intrinsic for the WebAssembly instruction is used instead. This effectively extends the support added in #74659 to checked casts. After this commit the codegen for the above Rust function looks like: (func $cast (type 0) (param f64) (result i32) (local i32) block ;; label = @1 local.get 0 f64.const 0x0p+0 (;=0;) f64.ge local.tee 1 i32.const 1 i32.xor br_if 0 (;@1;) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.le i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const -1 i32.const 0 local.get 1 select) For reference, in Rust 1.44, which did not have saturating float-to-integer casts, the codegen LLVM would emit is: (func $cast (type 0) (param f64) (result i32) block ;; label = @1 local.get 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const 0) So we're relatively close to the original codegen, although it's slightly different because the semantics of the function changed where we're emulating the `i32.trunc_sat_f32_s` instruction rather than always replacing out-of-bounds values with zero. There is still work that could be done to improve casts such as `f32` to `u8`. That form of cast still uses the `fptosi` instruction which generates lots of branch-y code. This seems less important to tackle now though. In the meantime this should take care of most use cases of floating-point conversion and as a result I'm going to speculate that this... Closes #73591
2020-07-27Suppress debuginfo on naked function argumentsNathaniel McCallum-2/+2
A function that has no prologue cannot be reasonably expected to support debuginfo. In fact, the existing code (before this patch) would generate invalid instructions that caused crashes. We can solve this easily by just not emitting the debuginfo in this case. Fixes https://github.com/rust-lang/rust/issues/42779 cc https://github.com/rust-lang/rust/issues/32408
2020-07-25Auto merge of #74510 - LukasKalbertodt:fix-range-from-index-panic, ↵bors-3/+3
r=hanna-kruppe Fix panic message when `RangeFrom` index is out of bounds Before, the `Range` method was called with `end = slice.len()`. Unfortunately, because `Range::index` first checks the order of the indices (start has to be smaller than end), an out of bounds index leads to `core::slice::slice_index_order_fail` being called. This prints the message 'slice index starts at 27 but ends at 10', which is worse than 'index 27 out of range for slice of length 10'. This is not only useful to normal users reading panic messages, but also for people inspecting assembly and being confused by `slice_index_order_fail` calls. You can see the produced assembly [here](https://rust.godbolt.org/z/GzMGWf) and try on Playground [here](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=aada5996b2f3848075a6d02cf4055743). (By the way. this is only about which panic function is called; I'm pretty sure it does not improve anything about performance).
2020-07-22Improve codegen for unchecked float casts on wasmAlex Crichton-20/+9
This commit improves codegen for unchecked casts on WebAssembly targets to use the singluar `iNN.trunc_fMM_{u,s}` instructions. Previously rustc would codegen a bare `fptosi` and `fptoui` for float casts but for WebAssembly targets the codegen for these instructions is quite large. This large codegen is due to the fact that LLVM can speculate these instructions so the trapping behavior of WebAssembly needs to be protected against in case they're speculated. The change here is to update the codegen for the unchecked cast intrinsics to have a wasm-specific case where they call the appropriate LLVM intrinsic to generate the right wasm instruction. The intrinsic is explicitly opting-in to undefined behavior so a trap here for out-of-bounds inputs on wasm should be acceptable. cc #73591
2020-07-22Rollup merge of #74237 - lzutao:compiletest, r=Mark-SimulacrumManish Goregaokar-2/+2
compiletest: Rewrite extract_*_version functions This makes extract_lldb_version has the same version type like extract_gdb_version.
2020-07-22Rollup merge of #73893 - ajpaverd:cfguard-stabilize, r=nikomatsakisManish Goregaokar-4/+4
Stabilize control-flow-guard codegen option This is the stabilization PR discussed in #68793. It converts the `-Z control-flow-guard` debugging option into a codegen option (`-C control-flow-guard`), and changes the associated tests.
2020-07-22[AVR] Correctly set the pointer address space when constructing pointers to ↵Dylan McKay-0/+93
functions This patch extends the existing `type_i8p` method so that it requires an explicit address space to be specified. Before this patch, the `type_i8p` method implcitily assumed the default address space, which is not a safe transformation on all targets, namely AVR. The Rust compiler already has support for tracking the "instruction address space" on a per-target basis. This patch extends the code generation routines so that an address space must always be specified. In my estimation, around 15% of the callers of `type_i8p` produced invalid code on AVR due to the loss of address space prior to LLVM final code generation. This would lead to unavoidable assertion errors relating to invalid bitcasts. With this patch, the address space is always either 1) explicitly set to the instruction address space because the logic is dealing with functions which must be placed there, or 2) explicitly set to the default address space 0 because the logic can only operate on data space pointers and thus we keep the existing semantics of assuming the default, "data" address space.
2020-07-20Slightly improve panic messages when range indices are out of boundsLukas Kalbertodt-3/+3
2020-07-19Add missing : after *llvm-versionLzu Tao-2/+2
2020-07-17Test codegen of compare_exchange operationsTomasz Miąsko-0/+60
2020-07-16Rollup merge of #74171 - ehuss:44056-debug-macos, r=nikomatsakisManish Goregaokar-2/+3
Fix 44056 test with debug on macos. The test `codegen/issue-44056-macos-tls-align.rs` fails on macos if `debug-assertions` is enabled in `config.toml`. It has the following error: ``` /Users/eric/Proj/rust/rust/src/test/codegen/issue-44056-macos-tls-align.rs:9:11: error: CHECK: expected string not found in input // CHECK: @STATIC_VAR_1 = thread_local local_unnamed_addr global <{ [32 x i8] }> zeroinitializer, section "__DATA,__thread_bss", align 4 ^ /Users/eric/Proj/rust/rust/build/x86_64-apple-darwin/test/codegen/issue-44056-macos-tls-align/issue-44056-macos-tls-align.ll:1:1: note: scanning from here ; ModuleID = 'issue_44056_macos_tls_align.3a1fbbbh-cgu.0' ^ /Users/eric/Proj/rust/rust/build/x86_64-apple-darwin/test/codegen/issue-44056-macos-tls-align/issue-44056-macos-tls-align.ll:9:1: note: possible intended match here @STATIC_VAR_1 = thread_local global <{ [32 x i8] }> zeroinitializer, section "__DATA,__thread_bss", align 4 ^ ``` Comparing the output, the actual output is missing the text "`local_unnamed_addr`". The fix here is to ignore `local_unnamed_addr`, as it doesn't seem relevant to the test.
2020-07-16Rollup merge of #73926 - joaopaulocarreiro:github_rust-6, r=nikomatsakisManish Goregaokar-0/+1
Ignoring test case: [codegen] repr-transparent-aggregates-1.rs for aarch64 Ignoring test case: [codegen] repr-transparent-aggregates-1.rs for aarch64. Copyright (c) 2020, Arm Limited.