| author | bors <bors@rust-lang.org> | 2017-08-09 01:30:02 +0000 |
|---|---|---|
| committer | bors <bors@rust-lang.org> | 2017-08-09 01:30:02 +0000 |
| commit | 0f9317d37e5469a8ce3b77ed49dd3eb315c8c859 | |
| tree | 14608f3a3286b745c8f7f0691cf8004829c282f4 | |
| parent | 78efb23586407878606e6582b7a23486c099d35e | |
| parent | 4bb9a8b4ac27b48fb7989ef2900ec12a0face475 | |
Auto merge of #43595 - oyvindln:master, r=aturon
Add an overflow check in the `Iterator::next()` impl for `Range<_>` to help with vectorization.

This helps with vectorization in some cases, such as `(0..u16::MAX).collect::<Vec<u16>>()`, as LLVM is able to change the loop condition to use equality instead of less-than, and should help with #43124. (See also my [last comment](https://github.com/rust-lang/rust/issues/43124#issuecomment-319098625) there.)

This PR makes `collect` on ranges of `u16`, `i16`, `i8`, and `u8` **significantly** faster (at least on x86-64 and i686), and pretty close, though not quite equivalent, to a [manual unsafe implementation](https://is.gd/nkoecB). 32-bit values (and 64-bit values on x86-64) were already vectorized without this change, and they still are. This PR doesn't seem to help with 64-bit values on i686, as they still don't vectorize well compared to a manual loop.

I'm a bit unsure whether this was the best way of implementing it. I tried to make as few changes as possible, avoiding modifications to the `Step` trait and to the behavior of `RangeFrom` (I'll leave wider changes to the trait for others, like #43127, to discuss). I tried simply changing the comparison to `self.start != self.end`, but that made the compiler segfault when compiling stage0, so I went with this method instead for now.

As for `next_back()`, reversed ranges seem to optimize properly already.
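For reference, a minimal runnable sketch of the case the description names (this is just the `collect` pattern the change targets, not the benchmark harness from the linked comparison):

```rust
// Collecting a full-width range of a narrow integer type. With the
// overflow check in `next()`, LLVM can rewrite the `start < end` exit
// test as an equality test and vectorize the fill.
fn main() {
    let v: Vec<u16> = (0..u16::MAX).collect();
    // `..` is half-open, so the last element is u16::MAX - 1.
    assert_eq!(v.len(), u16::MAX as usize);
    assert_eq!(v[v.len() - 1], u16::MAX - 1);
}
```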
| -rw-r--r-- | src/libcore/iter/range.rs | 13 |
1 file changed, 10 insertions(+), 3 deletions(-)
```diff
diff --git a/src/libcore/iter/range.rs b/src/libcore/iter/range.rs
index 32c32e327eb..73d518b570a 100644
--- a/src/libcore/iter/range.rs
+++ b/src/libcore/iter/range.rs
@@ -214,9 +214,16 @@ impl<A: Step> Iterator for ops::Range<A> {
     #[inline]
     fn next(&mut self) -> Option<A> {
         if self.start < self.end {
-            let mut n = self.start.add_one();
-            mem::swap(&mut n, &mut self.start);
-            Some(n)
+            // We check for overflow here, even though it can't actually
+            // happen. Adding this check does however help llvm vectorize loops
+            // for some ranges that don't get vectorized otherwise,
+            // and this won't actually result in an extra check in an optimized build.
+            if let Some(mut n) = self.start.add_usize(1) {
+                mem::swap(&mut n, &mut self.start);
+                Some(n)
+            } else {
+                None
+            }
         } else {
             None
         }
```
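For readers outside the tree, here is a self-contained sketch of the same shape. `checked_add` stands in for libcore's internal `Step::add_usize`, and the `SimpleRange` type is hypothetical, not part of the patch:

```rust
// Standalone illustration of the patched `next()` logic. The
// `checked_add` can never return `None` while `start < end`, but
// keeping the branch gives LLVM an equality-based exit condition
// it can vectorize; the redundant check is optimized away.
struct SimpleRange {
    start: u16,
    end: u16,
}

impl Iterator for SimpleRange {
    type Item = u16;

    #[inline]
    fn next(&mut self) -> Option<u16> {
        if self.start < self.end {
            if let Some(n) = self.start.checked_add(1) {
                let old = self.start;
                self.start = n;
                Some(old)
            } else {
                None
            }
        } else {
            None
        }
    }
}

fn main() {
    let v: Vec<u16> = SimpleRange { start: 0, end: 1000 }.collect();
    assert_eq!(v.len(), 1000);
}
```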
