| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
Use step_unchecked more liberally in range iter impls
Without these `_unchecked`, these operations on iterators of `char` fail to optimize out the unreachable panicking condition on overflow.
cc @cuviper https://github.com/rayon-rs/rayon/pull/771 where this was discovered.
|
|
|
|
|
|
Resolve overflow behavior for RangeFrom
This specifies a documented unspecified implementation detail of `RangeFrom` and makes it consistently implement the specified behavior.
Specifically, `(u8::MAX).next()` is defined to cause an overflow, and resolve that overflow in the same manner as the `Step::forward` implementation.
The inconsistency that has existed is `<RangeFrom as Iterator>::nth`. The existing behavior should be plain to see after #69659: the skipping part previously always panicked if it caused an overflow, but the final step (to set up the state for further iteration) has always been debug-checked.
The inconsistency, then, is that `RangeFrom::nth` does not implement the same behavior as the naive (and default) implementation of just calling `next` multiple times. This PR aligns `RangeFrom::nth` to have identical behavior to the naive implementation. It also lines up with the standard behavior of primitive math in Rust everywhere else in the language: debug checked overflow.
cc @Amanieu
---
Followup to #69659. Closes #25708 (by documenting the panic as intended).
The documentation wording is preliminary and can probably be improved.
This will probably need an FCP, as it changes observable stable behavior.
|
|
impl Step for char (make Range*<char> iterable)
[[irlo thread]](https://internals.rust-lang.org/t/mini-rfc-make-range-char-work/12392?u=cad97) [[godbolt asm example]](https://rust.godbolt.org/z/fdveKo)
Add an implementation of the `Step` trait for `char`, which has the effect of making `RangeInclusive<char>` (and the other range types) iterable.
I've used the surrogate range magic numbers as magic numbers here rather than e.g. a `const SURROGATE_RANGE = 0xD800..0xE000` because these numbers appear to be used as magic numbers elsewhere and there doesn't exist constants for them yet. These files definitely aren't where surrogate range constants should live.
`ExactSizeIterator` is not implemented because `0x10FFFF` is bigger than fits in a `usize == u16`. However, given we already provide some `ExactSizeIterator` that are not correct on 16 bit targets, we might still want to consider providing it for `Range`[`Inclusive`]`<char>`, as it is definitely _very_ convenient. (At the very least, we want to make sure `.count()` doesn't bother iterating the range.)
The second commit in this PR changes a call to `Step::forward` to use `Step::forward_unchecked` in `RangeInclusive::next`. This is because without this patch, iteration over all codepoints (`'\0'..=char::MAX`) does not successfully optimize out the panicking branch. This was mentioned in the PR that updated `Step` to its current design, but was deemed not yet necessary as it did not impact codegen for integral types.
More of `Range*`'s implementations' calls to `Step` methods will probably want to see if they can use the `_unchecked` version as (if) we open up `Step` to being implemented on more types.
---
cc @rust-lang/libs, this is insta-stable and a fairly significant addition to `Range*`'s capabilities; this is the first instance of a noncontinuous domain being iterable with `Range` (or, well, anything other than primitive integers). I don't think this needs a full RFC, but it should definitely get some decent eyes on it.
|
|
|
|
|
|
|
|
Enables Range<char> to be iterable
Note: https://rust.godbolt.org/z/fdveKo
An iteration over all char ('\0'..=char::MAX)
includes unreachable panic code currently.
Updating RangeInclusive::next to call
Step::forward_unchecked (which is safe to do
but not done yet becuase it wasn't necessary)
successfully removes the panic from this iteration.
|
|
|
|
|
|
`fold` is currently implemented via `try_fold`, but implementing it
directly results in slightly less LLVM IR being generated, speeding up
compilation of some benchmarks.
(And likewise for `rfold`.)
The commit adds `fold` implementations to all the iterators that lack
one but do have a `try_fold` implementation. Most of these just call the
`try_fold` implementation directly.
|
|
Rework the std::iter::Step trait
Previous attempts: #43127 #62886 #68807
Tracking issue: #42168
This PR reworks the `Step` trait to be phrased in terms of the *successor* and *predecessor* operations. With this, `Step` hopefully has a consistent identity that can have a path towards stabilization. The proposed trait:
```rust
/// Objects that have a notion of *successor* and *predecessor* operations.
///
/// The *successor* operation moves towards values that compare greater.
/// The *predecessor* operation moves towards values that compare lesser.
///
/// # Safety
///
/// This trait is `unsafe` because its implementation must be correct for
/// the safety of `unsafe trait TrustedLen` implementations, and the results
/// of using this trait can otherwise be trusted by `unsafe` code to be correct
/// and fulful the listed obligations.
pub unsafe trait Step: Clone + PartialOrd + Sized {
/// Returns the number of *successor* steps required to get from `start` to `end`.
///
/// Returns `None` if the number of steps would overflow `usize`
/// (or is infinite, or if `end` would never be reached).
///
/// # Invariants
///
/// For any `a`, `b`, and `n`:
///
/// * `steps_between(&a, &b) == Some(n)` if and only if `Step::forward(&a, n) == Some(b)`
/// * `steps_between(&a, &b) == Some(n)` if and only if `Step::backward(&a, n) == Some(a)`
/// * `steps_between(&a, &b) == Some(n)` only if `a <= b`
/// * Corollary: `steps_between(&a, &b) == Some(0)` if and only if `a == b`
/// * Note that `a <= b` does _not_ imply `steps_between(&a, &b) != None`;
/// this is the case wheen it would require more than `usize::MAX` steps to get to `b`
/// * `steps_between(&a, &b) == None` if `a > b`
fn steps_between(start: &Self, end: &Self) -> Option<usize>;
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`, returns `None`.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`:
///
/// * `Step::forward_checked(a, n).and_then(|x| Step::forward_checked(x, m)) == Step::forward_checked(a, m).and_then(|x| Step::forward_checked(x, n))`
///
/// For any `a`, `n`, and `m` where `n + m` does not overflow:
///
/// * `Step::forward_checked(a, n).and_then(|x| Step::forward_checked(x, m)) == Step::forward_checked(a, n + m)`
///
/// For any `a` and `n`:
///
/// * `Step::forward_checked(a, n) == (0..n).try_fold(a, |x, _| Step::forward_checked(&x, 1))`
/// * Corollary: `Step::forward_checked(&a, 0) == Some(a)`
fn forward_checked(start: Self, count: usize) -> Option<Self>;
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`,
/// this function is allowed to panic, wrap, or saturate.
/// The suggested behavior is to panic when debug assertions are enabled,
/// and to wrap or saturate otherwise.
///
/// Unsafe code should not rely on the correctness of behavior after overflow.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`, where no overflow occurs:
///
/// * `Step::forward(Step::forward(a, n), m) == Step::forward(a, n + m)`
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::forward_checked(a, n) == Some(Step::forward(a, n))`
/// * `Step::forward(a, n) == (0..n).fold(a, |x, _| Step::forward(x, 1))`
/// * Corollary: `Step::forward(a, 0) == a`
/// * `Step::forward(a, n) >= a`
/// * `Step::backward(Step::forward(a, n), n) == a`
fn forward(start: Self, count: usize) -> Self {
Step::forward_checked(start, count).expect("overflow in `Step::forward`")
}
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// # Safety
///
/// It is undefined behavior for this operation to overflow the
/// range of values supported by `Self`. If you cannot guarantee that this
/// will not overflow, use `forward` or `forward_checked` instead.
///
/// # Invariants
///
/// For any `a`:
///
/// * if there exists `b` such that `b > a`, it is safe to call `Step::forward_unchecked(a, 1)`
/// * if there exists `b`, `n` such that `steps_between(&a, &b) == Some(n)`,
/// it is safe to call `Step::forward_unchecked(a, m)` for any `m <= n`.
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::forward_unchecked(a, n)` is equivalent to `Step::forward(a, n)`
#[unstable(feature = "unchecked_math", reason = "niche optimization path", issue = "none")]
unsafe fn forward_unchecked(start: Self, count: usize) -> Self {
Step::forward(start, count)
}
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`, returns `None`.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`:
///
/// * `Step::backward_checked(a, n).and_then(|x| Step::backward_checked(x, m)) == n.checked_add(m).and_then(|x| Step::backward_checked(a, x))`
/// * `Step::backward_checked(a, n).and_then(|x| Step::backward_checked(x, m)) == try { Step::backward_checked(a, n.checked_add(m)?) }`
///
/// For any `a` and `n`:
///
/// * `Step::backward_checked(a, n) == (0..n).try_fold(a, |x, _| Step::backward_checked(&x, 1))`
/// * Corollary: `Step::backward_checked(&a, 0) == Some(a)`
fn backward_checked(start: Self, count: usize) -> Option<Self>;
/// Returns the value that would be obtained by taking the *predecessor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`,
/// this function is allowed to panic, wrap, or saturate.
/// The suggested behavior is to panic when debug assertions are enabled,
/// and to wrap or saturate otherwise.
///
/// Unsafe code should not rely on the correctness of behavior after overflow.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`, where no overflow occurs:
///
/// * `Step::backward(Step::backward(a, n), m) == Step::backward(a, n + m)`
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::backward_checked(a, n) == Some(Step::backward(a, n))`
/// * `Step::backward(a, n) == (0..n).fold(a, |x, _| Step::backward(x, 1))`
/// * Corollary: `Step::backward(a, 0) == a`
/// * `Step::backward(a, n) <= a`
/// * `Step::forward(Step::backward(a, n), n) == a`
fn backward(start: Self, count: usize) -> Self {
Step::backward_checked(start, count).expect("overflow in `Step::backward`")
}
/// Returns the value that would be obtained by taking the *predecessor*
/// of `self` `count` times.
///
/// # Safety
///
/// It is undefined behavior for this operation to overflow the
/// range of values supported by `Self`. If you cannot guarantee that this
/// will not overflow, use `backward` or `backward_checked` instead.
///
/// # Invariants
///
/// For any `a`:
///
/// * if there exists `b` such that `b < a`, it is safe to call `Step::backward_unchecked(a, 1)`
/// * if there exists `b`, `n` such that `steps_between(&b, &a) == Some(n)`,
/// it is safe to call `Step::backward_unchecked(a, m)` for any `m <= n`.
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::backward_unchecked(a, n)` is equivalent to `Step::backward(a, n)`
#[unstable(feature = "unchecked_math", reason = "niche optimization path", issue = "none")]
unsafe fn backward_unchecked(start: Self, count: usize) -> Self {
Step::backward(start, count)
}
}
```
Note that all of these are associated functions and not callable via method syntax; the calling syntax is always `Step::forward(start, n)`. This version of the trait additionally changes the stepping functions to talk their arguments by value.
As opposed to previous attempts which provided a "step by one" method directly, this version of the trait only exposes "step by n". There are a few reasons for this:
- `Range*`, the primary consumer of `Step`, assumes that the "step by n" operation is cheap. If a single step function is provided, it will be a lot more enticing to implement "step by n" as n repeated calls to "step by one". While this is not strictly incorrect, this behavior would be surprising for anyone used to using `Range<{primitive integer}>`.
- With a trivial default impl, this can be easily added backwards-compatibly later.
- The debug-wrapping "step by n" needs to exist for `RangeFrom` to be consistent between "step by n" and "step by one" operation. (Note: the behavior is not changed by this PR, but making the behavior consistent is made tenable by this PR.)
Three "kinds" of step are provided: `_checked`, which returns an `Option` indicating attempted overflow; (unsuffixed), which provides "safe overflow" behavior (is allowed to panic, wrap, or saturate, depending on what is most convenient for a given type); and `_unchecked`, which is a version which assumes overflow does not happen.
Review is appreciated to check that:
- The invariants as described on the `Step` functions are enough to specify the "common sense" consistency for successor/predecessor.
- Implementation of `Step` functions is correct in the face of overflow and the edges of representable integers.
- Added tests of `Step` functions are asserting the correct behavior (and not just the implemented behavior).
|
|
|
|
The previous definition did not optimize down to a single add operation,
but this version does appear to.
|
|
|
|
|
|
|
|
Co-Authored-By: Nadrieril Feneanar <nadrieril@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
|
While the redesign is in progress (#62886), clarify the purpose of replace_zero and replace_one.
|
|
|
|
|
|
|
|
|
|
|
|
We can use `usize::try_from` to convert steps from any size of integer.
This enables a meaningful `size_hint()` for larger ranges, rather than
always just `(0, None)`. Now they return the true `(len, Some(len))`
when it fits, otherwise `(usize::MAX, None)` for overflow.
|
|
RangeInclusive internal iteration performance improvement.
Specialize `Iterator::try_fold` and `DoubleEndedIterator::try_rfold` to improve code generation in all internal iteration scenarios.
This changes brings the performance of internal iteration with `RangeInclusive` on par with the performance of iteration with `Range`:
- Single conditional jump in hot loop,
- Unrolling and vectorization,
- And even Closed Form substitution.
Unfortunately, it only applies to internal iteration. Despite various attempts at stream-lining the implementation of `next` and `next_back`, LLVM has stubbornly refused to optimize external iteration appropriately, leaving me with a choice between:
- The current implementation, for which Closed Form substitution is performed, but which uses 2 conditional jumps in the hot loop when optimization fail.
- An implementation using a `is_done` boolean, which uses 1 conditional jump in the hot loop when optimization fail, allowing unrolling and vectorization, but for which Closed Form substitution fails.
In the absence of any conclusive evidence as to which usecase matters most, and with no assurance that the lack of Closed Form substitution is not indicative of other optimizations being foiled, there is no way
to pick one implementation over the other, and thus I defer to the statu quo as far as `next` and `next_back` are concerned.
|
|
|
|
|
|
Specialize Iterator::try_fold and DoubleEndedIterator::try_rfold to
improve code generation in all internal iteration scenarios.
This changes brings the performance of internal iteration with
RangeInclusive on par with the performance of iteration with Range:
- Single conditional jump in hot loop,
- Unrolling and vectorization,
- And even Closed Form substitution.
Unfortunately, it only applies to internal iteration. Despite various
attempts at stream-lining the implementation of next and next_back,
LLVM has stubbornly refused to optimize external iteration
appropriately, leaving me with a choice between:
- The current implementation, for which Closed Form substitution is
performed, but which uses 2 conditional jumps in the hot loop when
optimization fail.
- An implementation using a "is_done" boolean, which uses 1
conditional jump in the hot loop when optimization fail, allowing
unrolling and vectorization, but for which Closed Form substitution
fails.
In the absence of any conclusive evidence as to which usecase matters
most, and with no assurance that the lack of Closed Form substitution
is not indicative of other optimizations being foiled, there is no way
to pick one implementation over the other, and thus I defer to the
statu quo as far as next and next_back are concerned.
|
|
|
|
|
|
|
|
|
|
Fix #45222.
|
|
a portability lint"
This reverts commit 837d6c70233715a0ae8e15c703d40e3046a2f36a.
Fixes https://github.com/rust-lang/rust/issues/49415
|
|
Correct a few stability attributes
* `const_indexing` language feature was stabilized in 1.26.0 by #46882
* `Display` impls for `PanicInfo` and `Location` were stabilized in 1.26.0 by #47687
* `TrustedLen` is still unstable so its impls should be as well even though `RangeInclusive` was stabilized by #47813
* `!Send` and `!Sync` for `Args` and `ArgsOs` were stabilized in 1.26.0 by #48005
* `EscapeDefault` has been stable since 1.0.0 so should continue to show that even though it was moved to core in #48735
This could be backported to beta like #49612
|
|
|
|
Fixes #49617
|
|
portability lint
https://github.com/rust-lang/rust/pull/49305#issuecomment-376293243
|
|
Stabilize std::ops::RangeInclusive and std::ops::RangeInclusiveTo.
|
|
|
|
FusedIterator is a marker trait that promises that the implementing
iterator continues to return `None` from `.next()` once it has returned
`None` once (and/or `.next_back()`, if implemented).
The effects of FusedIterator are already widely available through
`.fuse()`, but with stable `FusedIterator`, stable Rust users can
implement this trait for their iterators when appropriate.
|
|
Simplify RangeInclusive::next[_back]
`match`ing on an `Option<Ordering>` seems cause some confusion for LLVM; switching to just using comparison operators removes a few jumps from the simple `for` loops I was trying.
cc https://github.com/rust-lang/rust/issues/45222 https://github.com/rust-lang/rust/issues/28237#issuecomment-363706510
Example:
```rust
#[no_mangle]
pub fn coresum(x: std::ops::RangeInclusive<u64>) -> u64 {
let mut sum = 0;
for i in x {
sum += i ^ (i-1);
}
sum
}
```
Today:
```asm
coresum:
xor r8d, r8d
mov r9, -1
xor eax, eax
jmp .LBB0_1
.LBB0_4:
lea rcx, [rdi - 1]
xor rcx, rdi
add rax, rcx
mov rsi, rdx
mov rdi, r10
.LBB0_1:
cmp rdi, rsi
mov ecx, 1
cmovb rcx, r9
cmove rcx, r8
test rcx, rcx
mov edx, 0
mov r10d, 1
je .LBB0_4 // 1
cmp rcx, -1
jne .LBB0_5 // 2
lea r10, [rdi + 1]
mov rdx, rsi
jmp .LBB0_4 // 3
.LBB0_5:
ret
```
With this PR:
```asm
coresum:
cmp rcx, rdx
jbe .LBB0_2
xor eax, eax
ret
.LBB0_2:
xor r8d, r8d
mov r9d, 1
xor eax, eax
.p2align 4, 0x90
.LBB0_3:
lea r10, [rcx + 1]
cmp rcx, rdx
cmovae rdx, r8
cmovae r10, r9
lea r11, [rcx - 1]
xor r11, rcx
add rax, r11
mov rcx, r10
cmp r10, rdx
jbe .LBB0_3 // Just this
ret
```
<details><summary>Though using internal iteration (`.map(|i| i ^ (i-1)).sum()`) is still shorter to type, and lets the compiler unroll it</summary>
```asm
coresum_inner:
.Lcfi0:
.seh_proc coresum_inner
sub rsp, 168
.Lcfi1:
.seh_stackalloc 168
vmovdqa xmmword ptr [rsp + 144], xmm15
.Lcfi2:
.seh_savexmm 15, 144
vmovdqa xmmword ptr [rsp + 128], xmm14
.Lcfi3:
.seh_savexmm 14, 128
vmovdqa xmmword ptr [rsp + 112], xmm13
.Lcfi4:
.seh_savexmm 13, 112
vmovdqa xmmword ptr [rsp + 96], xmm12
.Lcfi5:
.seh_savexmm 12, 96
vmovdqa xmmword ptr [rsp + 80], xmm11
.Lcfi6:
.seh_savexmm 11, 80
vmovdqa xmmword ptr [rsp + 64], xmm10
.Lcfi7:
.seh_savexmm 10, 64
vmovdqa xmmword ptr [rsp + 48], xmm9
.Lcfi8:
.seh_savexmm 9, 48
vmovdqa xmmword ptr [rsp + 32], xmm8
.Lcfi9:
.seh_savexmm 8, 32
vmovdqa xmmword ptr [rsp + 16], xmm7
.Lcfi10:
.seh_savexmm 7, 16
vmovdqa xmmword ptr [rsp], xmm6
.Lcfi11:
.seh_savexmm 6, 0
.Lcfi12:
.seh_endprologue
cmp rdx, rcx
jae .LBB1_2
xor eax, eax
jmp .LBB1_13
.LBB1_2:
mov r8, rdx
sub r8, rcx
jbe .LBB1_3
cmp r8, 7
jbe .LBB1_5
mov rax, r8
and rax, -8
mov r9, r8
and r9, -8
je .LBB1_5
add rax, rcx
vmovq xmm0, rcx
vpshufd xmm0, xmm0, 68
mov ecx, 1
vmovq xmm1, rcx
vpslldq xmm1, xmm1, 8
vpaddq xmm1, xmm0, xmm1
vpxor xmm0, xmm0, xmm0
vpcmpeqd xmm11, xmm11, xmm11
vmovdqa xmm12, xmmword ptr [rip + __xmm@00000000000000010000000000000001]
vmovdqa xmm13, xmmword ptr [rip + __xmm@00000000000000030000000000000003]
vmovdqa xmm14, xmmword ptr [rip + __xmm@00000000000000050000000000000005]
vmovdqa xmm15, xmmword ptr [rip + __xmm@00000000000000080000000000000008]
mov rcx, r9
vpxor xmm4, xmm4, xmm4
vpxor xmm5, xmm5, xmm5
vpxor xmm6, xmm6, xmm6
.p2align 4, 0x90
.LBB1_9:
vpaddq xmm7, xmm1, xmmword ptr [rip + __xmm@00000000000000020000000000000002]
vpaddq xmm9, xmm1, xmmword ptr [rip + __xmm@00000000000000040000000000000004]
vpaddq xmm10, xmm1, xmmword ptr [rip + __xmm@00000000000000060000000000000006]
vpaddq xmm8, xmm1, xmm12
vpxor xmm7, xmm8, xmm7
vpaddq xmm2, xmm1, xmm13
vpxor xmm8, xmm2, xmm9
vpaddq xmm3, xmm1, xmm14
vpxor xmm3, xmm3, xmm10
vpaddq xmm2, xmm1, xmm11
vpxor xmm2, xmm2, xmm1
vpaddq xmm0, xmm2, xmm0
vpaddq xmm4, xmm7, xmm4
vpaddq xmm5, xmm8, xmm5
vpaddq xmm6, xmm3, xmm6
vpaddq xmm1, xmm1, xmm15
add rcx, -8
jne .LBB1_9
vpaddq xmm0, xmm4, xmm0
vpaddq xmm0, xmm5, xmm0
vpaddq xmm0, xmm6, xmm0
vpshufd xmm1, xmm0, 78
vpaddq xmm0, xmm0, xmm1
vmovq r10, xmm0
cmp r8, r9
jne .LBB1_6
jmp .LBB1_11
.LBB1_3:
xor r10d, r10d
jmp .LBB1_12
.LBB1_5:
xor r10d, r10d
mov rax, rcx
.p2align 4, 0x90
.LBB1_6:
lea rcx, [rax - 1]
xor rcx, rax
inc rax
add r10, rcx
cmp rdx, rax
jne .LBB1_6
.LBB1_11:
mov rcx, rdx
.LBB1_12:
lea rax, [rcx - 1]
xor rax, rcx
add rax, r10
.LBB1_13:
vmovaps xmm6, xmmword ptr [rsp]
vmovaps xmm7, xmmword ptr [rsp + 16]
vmovaps xmm8, xmmword ptr [rsp + 32]
vmovaps xmm9, xmmword ptr [rsp + 48]
vmovaps xmm10, xmmword ptr [rsp + 64]
vmovaps xmm11, xmmword ptr [rsp + 80]
vmovaps xmm12, xmmword ptr [rsp + 96]
vmovaps xmm13, xmmword ptr [rsp + 112]
vmovaps xmm14, xmmword ptr [rsp + 128]
vmovaps xmm15, xmmword ptr [rsp + 144]
add rsp, 168
ret
.seh_handlerdata
.section .text,"xr",one_only,coresum_inner
.Lcfi13:
.seh_endproc
```
</details>
|