path: root/library/stdarch
Age | Commit message | Author | Lines
2024-07-25 | std_detect: Add aarch64/linux/LLVM features | Kajetan Puchalski | -9/+190
Add detection for various aarch64 CPU features already supported by LLVM and Linux. This commit adds feature detection for the following features:
- FEAT_CSSC
- FEAT_ECV
- FEAT_FAMINMAX
- FEAT_FLAGM2
- FEAT_FP8
- FEAT_FP8DOT2
- FEAT_FP8DOT4
- FEAT_FP8FMA
- FEAT_HBC
- FEAT_LSE128
- FEAT_LUT
- FEAT_MOPS
- FEAT_LRCPC3
- FEAT_SVE_B16B16
- FEAT_SVE2p1
- FEAT_WFxT

It also adds feature detection for FEAT_FPMR. This is somewhat of a special case: FPMR exists as a feature only in LLVM 18 and has since been removed from LLVM upstream. The intention is therefore for it to be detectable at runtime through stdarch but not to have a corresponding compile-time Rust target feature.

Linux features: https://github.com/torvalds/linux/blob/master/arch/arm64/include/uapi/asm/hwcap.h
LLVM features: llvm-project/llvm/lib/Target/AArch64/AArch64.td
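On Linux, runtime detection of this kind generally comes down to testing bits in the auxiliary-vector hwcap words. The following is a minimal sketch of that idea only, not the code from this commit: the table `HWCAP2_BITS`, the function `detected_features`, and the bit indices are all illustrative assumptions and should be checked against the `hwcap.h` header linked above.

```rust
/// Hypothetical table of (feature name, bit index in the AT_HWCAP2 word).
/// The bit indices are assumptions for illustration; verify them against
/// arch/arm64/include/uapi/asm/hwcap.h before relying on them.
const HWCAP2_BITS: &[(&str, u32)] = &[
    ("wfxt", 31),
    ("cssc", 34),
    ("sve2p1", 36),
    ("mops", 43),
    ("hbc", 44),
];

/// Returns the names of all table entries whose bit is set in `hwcap2`.
/// `hwcap2` stands in for the value the kernel reports for AT_HWCAP2.
fn detected_features(hwcap2: u64) -> Vec<&'static str> {
    HWCAP2_BITS
        .iter()
        .filter(|&&(_, bit)| hwcap2 & (1u64 << bit) != 0)
        .map(|&(name, _)| name)
        .collect()
}
```

Taking the hwcap word as a plain `u64` keeps the sketch independent of the target; a real implementation would obtain the value from the operating system.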
2024-07-14 | Use LLVM intrinsics for masked load/stores, expand-loads and fp-class | sayantn | -1240/+479
Also, remove some redundant sse target-features from avx intrinsics
2024-07-14 | Revert "wasm32: Add `simd128` to enabled features for relaxed intrinsics" | daxpedda | -20/+20
2024-07-12 | Some small refactorings | sayantn | -3417/+13
Use LLVM intrinsics for `vfpclassss` and `vfpclasssd`. Use `simd_insert` for `x86_polyfill`.
2024-07-11 | wasm32: Add `simd128` to enabled features for relaxed intrinsics | Alex Crichton | -20/+20
It looks like LLVM requires that `simd128` is active to use these intrinsics and `relaxed-simd` isn't implicitly enabling them. This is probably something to fix at the LLVM layer as well but for now enable both the `simd128` feature as well as the `relaxed-simd` feature to fix things on our side.
2024-07-08 | Added verification for doc comments | sayantn | -0/+40
2024-07-08 | Fix Documentation | sayantn | -133/+175
2024-07-07 | Implement missing in SSE4a and TBM | sayantn | -258/+310
Add `extracti`, `inserti` and `bextri` intrinsics. Refactor TBM into 2 modules.
2024-07-07 | Use generic simd in wasm intrinsics | Tobias Decking | -47/+17
2024-07-06 | Implemented runtime detection of `xop` target-feature | sayantn | -2/+8
2024-07-06 | Added runtime detection | sayantn | -20/+61
Cannot do a `cupid` test because they don't support `amx`.
2024-07-06 | Refactor avx512bw: reduction operations | Tobias Decking | -74/+1184
2024-07-06 | Refactor avx512bw: mask operations | Tobias Decking | -30/+447
2024-07-06 | Refactor avx512bw: integer comparison | Tobias Decking | -134/+348
2024-07-06 | Refactor avx512bw: max/min | Tobias Decking | -26/+24
2024-07-06 | Refactor avx512bw: saturating arithmetic | Tobias Decking | -284/+106
2024-07-06 | Refactor avx512bw: avg + mulhi + abs | Tobias Decking | -24/+160
2024-07-06 | Add detection for SHA512, SM3 and SM4 | sayantn | -2/+38
Cannot cross-verify with `cupid` because they do not have these features yet.
2024-07-06 | Added a `bf16` type | sayantn | -21/+52
2024-07-06 | Implemented some missing functions | sayantn | -13/+176
These cannot be linked with LLVM because of the lack of `bfloat16` and `i1` types in Rust, so inline asm was the only way.
2024-07-06 | Implemented the missing AVX512BF16 intrinsics | sayantn | -16/+245
2024-07-06 | Implemented VEX versions | sayantn | -65/+1198
Modified stdarch-test to accept VEX versions.
2024-07-06 | Implemented missing gather-scatters | sayantn | -109/+1981
2024-07-06 | Fix the stream intrinsics | sayantn | -59/+82
They should use platform-specific address management.
2024-07-02 | Fix incorrect reduction operations in avx512f | Tobias Decking | -31/+10
2024-06-30 | Added support for AMD verification | sayantn | -73/+90
Added a custom cpuid file for SDE, which enables SSE4a, XOP, TBM and VP2INTERSECT. Fixed `xsave` tests.
2024-06-30 | Updates SDE | sayantn | -25/+25
Updated SDE to v9.33.0. Disabled `assert-instr` in emulated run.
2024-06-30 | Define remaining IFMA intrinsics | Tobias Decking | -48/+429
2024-06-30 | Use generic simd for avx512 leading zeros | Tobias Decking | -20/+6
2024-06-30 | Refactor avx512f: mask operations | Tobias Decking | -16/+196
2024-06-30 | Refactor avx512f: element extraction | Tobias Decking | -4/+35
2024-06-30 | Refactor avx512f: floating point abs | Tobias Decking | -13/+5
2024-06-30 | Refactor avx512f: zeroing primitives | Tobias Decking | -5/+5
2024-06-30 | Refactor avx512f: integer comparison | Tobias Decking | -196/+384
2024-06-30 | Refactor avx512f: integers | Tobias Decking | -84/+215
2024-06-30 | Refactor avx512f: sqrt + rounding fix | Tobias Decking | -98/+120
2024-06-30 | Refactor avx512f: rounding fma | Tobias Decking | -387/+174
2024-06-30 | Refactor avx512f: fma | Tobias Decking | -386/+216
2024-06-29 | Remove `has_cpuid` | Jubilee Young | -93/+2
2024-06-29 | Fixing CI | sayantn | -35/+19
Fixed x86_64-apple-darwin freezing. Bump all docker to Ubuntu-24.04 (except for emulated and armv7).
2024-06-29 | Some fixes as asked by @Amanieu | sayantn | -20/+14
2024-06-29 | Fixed `_mm512_kunpackb`, reduce-max and reduce-min | sayantn | -28/+32
`_mm512_kunpackb` was implemented incorrectly. In addition, `simd_reduce_max` uses `maxnum` for comparison, which adheres to IEEE 754, but Intel specifically states that its instructions do NOT adhere to IEEE 754 for NaNs, so the previous implementation could give wrong results.
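The NaN discrepancy described here can be illustrated with scalar code. This is a sketch under stated assumptions, not the stdarch implementation: `max_ieee`, `max_x86` and `reduce_with` are hypothetical helper names, `max_x86` models the rule "return the second operand unless the first is strictly greater", and the left-to-right reduction order is chosen for simplicity rather than taken from the intrinsics.

```rust
/// `maxNum`-style maximum: if exactly one operand is NaN, the other
/// operand is returned (this is what `f32::max` does).
fn max_ieee(a: f32, b: f32) -> f32 {
    a.max(b)
}

/// x86-style maximum (assumed model): returns `a` only if `a > b`,
/// otherwise `b`. Any comparison with NaN is false, so a NaN in either
/// position selects the second operand.
fn max_x86(a: f32, b: f32) -> f32 {
    if a > b { a } else { b }
}

/// Left-to-right reduction with the given binary operation.
/// Returns `None` for an empty slice.
fn reduce_with(values: &[f32], op: fn(f32, f32) -> f32) -> Option<f32> {
    values.iter().copied().reduce(op)
}
```

With the input `[1.0, NaN, 0.5]`, the `maxNum`-style reduction skips the NaN and yields `1.0`, while the x86-style reduction first propagates the NaN and then replaces it with `0.5`, so the two approaches disagree whenever NaNs are present.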
2024-06-29 | Update CI to accommodate for windows-gnu targets | sayantn | -3/+4
2024-06-29 | Add the missing BMI1, SSE2, SSE4.1 and AVX2 intrinsics | sayantn | -45/+232
2024-06-29 | Fixed some more intrinsics | sayantn | -368/+333
Added some tests, fixed incorrect target-features, and verification code for target-features. Removed all MMX support from verification.
2024-06-29 | Fixed many intrinsics | sayantn | -300/+335
Fixed reduce-add and reduce-mul, and load/store of mask32 and mask64. Added preserves-flags to mov asm. Fixed the missing list. Fixed `_mm_loadu_si64`. Added `assert_instr`.
2024-06-29 | Upgraded disassembly to include `windows-gnu` targets | sayantn | -109/+80
2024-06-29 | Update Intrinsics list | sayantn | -147291/+179620
Updated the intrinsics list from version 3.4 to 3.6.8. Added a missing-x86.md file to track progress.
2024-06-27 | Fix documentation of arguments of function `core::arch::x86::_mm_blendv_epi8` | Mathilda | -2/+2
2024-06-27 | Fix _mm256_bsrli_epi128 producing invalid lower lane when IMM8 = 15 | Jayesskay | -1/+1
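For reference, the expected behaviour of a per-lane byte shift such as `_mm256_bsrli_epi128` can be written as a scalar model. This is only a sketch: `bsrli_epi128_model` is a hypothetical helper, not part of stdarch, and it assumes the documented semantics of shifting each 128-bit lane right by `imm8` bytes while filling with zeros.

```rust
/// Scalar reference model (hypothetical helper): treats the 256-bit
/// vector as two independent 16-byte lanes and shifts each lane right
/// by `imm8` bytes, filling the vacated high bytes with zeros.
fn bsrli_epi128_model(a: [u8; 32], imm8: usize) -> [u8; 32] {
    let mut out = [0u8; 32];
    for lane in 0..2 {
        let base = lane * 16;
        for i in 0..16 {
            let src = i + imm8;
            // Bytes shifted in from beyond the lane boundary are zero.
            if src < 16 {
                out[base + i] = a[base + src];
            }
        }
    }
    out
}
```

Under this model, a shift of 15 leaves exactly one non-zero byte per lane: byte 0 of each lane holds what was byte 15 of that lane, and no data crosses from the upper lane into the lower one.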