| author | bors <bors@rust-lang.org> | 2017-05-10 08:54:50 +0000 |
|---|---|---|
| committer | bors <bors@rust-lang.org> | 2017-05-10 08:54:50 +0000 |
| commit | 2b97174ada7fb1854269558ed2cf3b089e58beee (patch) | |
| tree | 880576d0244be04f587f845d2cb96936d213f3ff /src/libstd | |
| parent | 58b33ad70cdd11f9ce7b5874c6effab9627e51aa (diff) | |
| parent | da91361d2a8ea86a42cbe2a23a7ff816cc5500af (diff) | |
Auto merge of #41764 - scottmcm:faster-reverse, r=brson
Make [u8]::reverse() 5x faster

Since LLVM doesn't vectorize the loop for us, do unaligned reads of a larger type and use LLVM's bswap intrinsic to do the reversing of the actual bytes. cfg!-restricted to x86 and x86_64, as I assume it wouldn't help on things like ARMv5.

Also makes [u16]::reverse() a more modest 1.5x faster by loading/storing u32 and swapping the u16s with ROT16.

Thank you ptr::\*_unaligned for making this easy :)

Benchmark results (from my i5-2500K):

```text
# Before
test slice::reverse_u8  ... bench: 273,836 ns/iter (+/- 15,592) =  3829 MB/s
test slice::reverse_u16 ... bench: 139,793 ns/iter (+/- 17,748) =  7500 MB/s
test slice::reverse_u32 ... bench:  74,997 ns/iter (+/-  5,130) = 13981 MB/s
test slice::reverse_u64 ... bench:  47,452 ns/iter (+/-  2,213) = 22097 MB/s

# After
test slice::reverse_u8  ... bench:  52,170 ns/iter (+/-  3,962) = 20099 MB/s
test slice::reverse_u16 ... bench:  93,330 ns/iter (+/-  4,412) = 11235 MB/s
test slice::reverse_u32 ... bench:  74,731 ns/iter (+/-  1,425) = 14031 MB/s
test slice::reverse_u64 ... bench:  47,556 ns/iter (+/-  3,025) = 22049 MB/s
```

If you're curious about the assembly, instead of doing this

```
movzx eax, byte ptr [rdi]
movzx ecx, byte ptr [rsi]
mov byte ptr [rdi], cl
mov byte ptr [rsi], al
```

it does this

```
mov rax, qword ptr [rdx]
mov rbx, qword ptr [r11 + rcx - 8]
bswap rbx
mov qword ptr [rdx], rbx
bswap rax
mov qword ptr [r11 + rcx - 8], rax
```
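For readers who want to see the idea in source form, here is a minimal sketch of the technique the message describes. It is not the code from this patch: the function names `reverse_bytes_chunked` and `reverse_u16_chunked` are hypothetical, the chunk loop is written for clarity, and the sketch omits the x86/x86_64 `cfg!` restriction mentioned above. It assumes only `ptr::read_unaligned`, `ptr::write_unaligned`, `u64::swap_bytes` (which typically lowers to `bswap`), and `u32::rotate_left`.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical helper: reverse a byte slice in place by swapping
/// 8-byte chunks from both ends and byte-swapping each chunk.
pub fn reverse_bytes_chunked(s: &mut [u8]) {
    const CHUNK: usize = size_of::<u64>();
    let len = s.len();
    let base = s.as_mut_ptr();
    let mut i = 0;
    // Continue while the front and back chunks do not overlap.
    while i + CHUNK <= len / 2 {
        // SAFETY: both chunks lie fully inside the slice and are disjoint,
        // and the unaligned read/write functions impose no alignment demand.
        unsafe {
            let front = base.add(i).cast::<u64>();
            let back = base.add(len - i - CHUNK).cast::<u64>();
            let a = ptr::read_unaligned(front);
            let b = ptr::read_unaligned(back);
            // swap_bytes reverses the byte order within each chunk.
            ptr::write_unaligned(front, b.swap_bytes());
            ptr::write_unaligned(back, a.swap_bytes());
        }
        i += CHUNK;
    }
    // Fewer than 2 * CHUNK bytes remain in the middle; reverse them plainly.
    s[i..len - i].reverse();
}

/// Hypothetical helper: the same idea for u16, moving two elements at a
/// time as one u32 and swapping its halves with a 16-bit rotate.
pub fn reverse_u16_chunked(s: &mut [u16]) {
    const CHUNK: usize = 2; // u16 elements per u32
    let len = s.len();
    let base = s.as_mut_ptr();
    let mut i = 0;
    while i + CHUNK <= len / 2 {
        // SAFETY: same reasoning as in reverse_bytes_chunked.
        unsafe {
            let front = base.add(i).cast::<u32>();
            let back = base.add(len - i - CHUNK).cast::<u32>();
            let a = ptr::read_unaligned(front);
            let b = ptr::read_unaligned(back);
            ptr::write_unaligned(front, b.rotate_left(16));
            ptr::write_unaligned(back, a.rotate_left(16));
        }
        i += CHUNK;
    }
    s[i..len - i].reverse();
}
```

Both helpers should produce the same result as the standard `reverse()` for any length, since the leftover middle elements fall back to the plain element-wise reversal. Whether the chunked path is actually faster depends on the target and compiler version, so it is worth benchmarking before relying on it.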
Diffstat (limited to 'src/libstd')
0 files changed, 0 insertions, 0 deletions
