| Age | Commit message (Collapse) | Author | Lines |
|
|
|
benches, and clean up stray semicolon
|
|
into own file
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Remove duplicated code from Iterator::{ne, lt, le, gt, ge}
This PR delegates `Iterator::ne` to `Iterator::eq` and `Iterator::{lt, le, gt, ge}` to `Iterator::partial_cmp`.
Oddly enough, this change actually simplifies the generated assembly [in some cases](https://rust.godbolt.org/z/riBtNe), although I don't understand assembly well enough to see if the longer assembly is doing something clever.
I also added two extremely simple benchmarks:
```
// before
test iter::bench_lt ... bench: 98,404 ns/iter (+/- 21,008)
test iter::bench_partial_cmp ... bench: 62,437 ns/iter (+/- 5,009)
// after
test iter::bench_lt ... bench: 61,757 ns/iter (+/- 8,770)
test iter::bench_partial_cmp ... bench: 62,151 ns/iter (+/- 13,753)
```
I have no idea why the current `lt`/`le`/`gt`/`ge` implementations don't seem to be compiled optimally, but simply having them call `partial_cmp` seems to be an improvement.
See #44729 for a previous discussion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
name old ns/iter new ns/iter diff ns/iter diff % speedup
iter::bench_cycle_take_ref_sum 927,152 927,194 42 0.00% x 1.00
iter::bench_cycle_take_sum 938,129 603,492 -334,637 -35.67% x 1.55
|
|
|
|
|
|
|
|
This permits easier iteration without having to worry about warnings
being denied.
Fixes #49517
|
|
Merge `feature(advanced_slice_patterns)` into `feature(slice_patterns)`
|
|
Makes the bench asked about on URLO 58x faster :)
|
|
Short-circuiting internal iteration with Iterator::try_fold & try_rfold
These are the core methods in terms of which the other methods (`fold`, `all`, `any`, `find`, `position`, `nth`, ...) can be implemented, allowing Iterator implementors to get the full goodness of internal iteration by only overriding one method (per direction).
Based off the `Try` trait, so works with both `Result` and `Option` (:tada: https://github.com/rust-lang/rust/pull/42526). The `try_fold` rustdoc examples use `Option` and the `try_rfold` ones use `Result`.
AKA continuing in the vein of PRs https://github.com/rust-lang/rust/pull/44682 & https://github.com/rust-lang/rust/pull/44856 for more of `Iterator`.
New bench following the pattern from the latter of those:
```
test iter::bench_take_while_chain_ref_sum ... bench: 1,130,843 ns/iter (+/- 25,110)
test iter::bench_take_while_chain_sum ... bench: 362,530 ns/iter (+/- 391)
```
I also ran the benches without the `fold` & `rfold` overrides to test their new default impls, with basically no change. I left them there, though, to take advantage of existing overrides and because `AlwaysOk` has some sub-optimality due to https://github.com/rust-lang/rust/issues/43278 (which 45225 should fix).
If you're wondering why there are three type parameters, see issue https://github.com/rust-lang/rust/issues/45462
Thanks for @bluss for the [original IRLO thread](https://internals.rust-lang.org/t/pre-rfc-fold-ok-is-composable-internal-iteration/4434) and the rfold PR and to @cuviper for adding so many folds, [encouraging me](https://github.com/rust-lang/rust/pull/45379#issuecomment-339424670) to make this PR, and finding a catastrophic bug in a pre-review.
|
|
unpredictable conditional branches in the loop. In addition improve the
benchmarks to test performance in l1, l2 and l3 caches on sorted arrays
with or without dups.
Before:
```
test slice::binary_search_l1 ... bench: 48 ns/iter (+/- 1)
test slice::binary_search_l2 ... bench: 63 ns/iter (+/- 0)
test slice::binary_search_l3 ... bench: 152 ns/iter (+/- 12)
test slice::binary_search_l1_with_dups ... bench: 36 ns/iter (+/- 0)
test slice::binary_search_l2_with_dups ... bench: 64 ns/iter (+/- 1)
test slice::binary_search_l3_with_dups ... bench: 153 ns/iter (+/- 6)
```
After:
```
test slice::binary_search_l1 ... bench: 15 ns/iter (+/- 0)
test slice::binary_search_l2 ... bench: 23 ns/iter (+/- 0)
test slice::binary_search_l3 ... bench: 100 ns/iter (+/- 17)
test slice::binary_search_l1_with_dups ... bench: 15 ns/iter (+/- 0)
test slice::binary_search_l2_with_dups ... bench: 23 ns/iter (+/- 0)
test slice::binary_search_l3_with_dups ... bench: 98 ns/iter (+/- 14)
```
|
|
This is the core method in terms of which the other methods (fold, all, any, find, position, nth, ...) can be implemented, allowing Iterator implementors to get the full goodness of internal iteration by only overriding one method (per direction).
|
|
|
|
remove FIXME(#13101) since `assert_receiver_is_total_eq` stays.
remove FIXME(#19649) now that stability markers render.
remove FIXME(#13642) now the benchmarks were moved.
remove FIXME(#6220) now that floating points can be formatted.
remove FIXME(#18248) and write tests for `Rc<str>` and `Rc<[u8]>`
remove reference to irelevent issues in FIXME(#1697, #2178...)
update FIXME(#5516) to point to getopts issue 7
update FIXME(#7771) to point to RFC 628
update FIXME(#19839) to point to issue 26925
|