rust - https://github.com/rust-lang/rust

Age	Commit message	Author	Lines
2023-12-10	remove redundant imports	surechen	-3/+0
	Detects redundant imports that can be eliminated. For #117772: to facilitate review and modification, the checking code and the code that removes redundant imports are split into two PRs.
2023-11-27	benchmarks for Chars::advance_by	The 8472	-0/+19

2023-07-23	fix	Deadbeef	-0/+1

2023-06-23	Specialize StepBy<Range<{integer}>>	The 8472	-0/+52
	For ranges < usize we determine the number of items StepBy would yield and then store that in the range.end instead of the actual end. This significantly simplifies calculation of the loop induction variable, especially in cases where StepBy::step (a usize) could overflow the Range's item type.
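The idea of storing an item count instead of the real end can be sketched as follows. This is a minimal illustration for `u8` ranges only; `step_by_len` and `stepped` are hypothetical helpers, not the functions used in the standard library.

```rust
/// Hypothetical helper: how many items `(start..end).step_by(step)` yields.
fn step_by_len(start: u8, end: u8, step: usize) -> u8 {
    assert!(step > 0, "step must be non-zero");
    if start >= end {
        return 0;
    }
    let span = (end - start) as usize;
    // ceil(span / step) without overflow; span >= 1 here.
    // The result is at most span <= 255, so it fits in a u8.
    ((span - 1) / step + 1) as u8
}

/// Hypothetical helper: iterate using the precomputed count, so the loop
/// never has to compare against `end` or add a step that might overflow.
fn stepped(start: u8, end: u8, step: usize) -> Vec<u8> {
    let mut remaining = step_by_len(start, end, step);
    let mut cur = start;
    let mut out = Vec::new();
    while remaining > 0 {
        out.push(cur);
        remaining -= 1;
        if remaining > 0 {
            // Another item exists, so cur + step < end <= 255 and the
            // cast and the addition cannot overflow.
            cur += step as u8;
        }
    }
    out
}
```

The count is computed once up front, which is what lets the loop body stay free of overflow checks even when the step is larger than the item type can represent.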
2023-06-12	add benchmark	The 8472	-0/+9

2023-05-20	optimize next_chunk impls for Filter and FilterMap	The 8472	-2/+44

2023-05-15	Rollup merge of #108291 - chenyukang:yukang/fix-benchmarks, r=workingjubilee	Matthias Krüger	-30/+30
	Fix more benchmark test with black_box Follow up fix for https://github.com/rust-lang/rust/issues/107590
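As a rough illustration of why benchmarks need `black_box`: without it the optimizer may fold the whole computation into a constant. The sketch below uses the stable `std::hint::black_box`; `sum_to` and `bench_body` are hypothetical names and do not come from the benchmarks touched by this commit.

```rust
use std::hint::black_box;

/// Hypothetical workload standing in for the code under test.
fn sum_to(n: u64) -> u64 {
    (0..n).sum()
}

/// Hypothetical benchmark body: hide the input from the optimizer and
/// mark the result as used so the work is not optimized away.
fn bench_body() -> u64 {
    let n = black_box(1_000);
    black_box(sum_to(n))
}
```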
2023-04-25	Add shortcut for Grisu3 algorithm.	mazong1123	-0/+27
	Check the requested digit length and the fractional or integral parts of the number. Fall back earlier, without trying the Grisu algorithm, if the specific condition is met. Fix #110129
2023-03-05	Auto merge of #108157 - scottmcm:tuple-gt-via-partialcmp, r=dtolnay	bors	-0/+23
	Use `partial_cmp` to implement tuple `lt`/`le`/`ge`/`gt` In today's implementation, `(A, B)::gt` contains calls to both `A::eq` and `A::gt`. That's fine for primitives, but for things like `String`s it's kinda weird -- `(String, usize)::gt` has a call to both `bcmp` and `memcmp` (<https://rust.godbolt.org/z/7jbbPMesf>) because when `bcmp` says the `String`s aren't equal, it turns around and calls `memcmp` to find out which one's bigger. This PR changes the implementation to instead implement `(A, …, C, Z)::gt` using `A::partial_cmp`, `…::partial_cmp`, `C::partial_cmp`, and `Z::gt`. (And analogously for `lt`, `le`, and `ge`.) That way expensive comparisons don't need to be repeated. Technically this is an observable change on stable, so I've marked it `needs-fcp` + `T-libs-api` and will r? rust-lang/libs-api I'm hoping that this will be non-controversial, however, since it's very similar to the observable changes that were made to the derives (#81384 #98655) -- like those, this only changes behaviour if a type overrode behaviour in a way inconsistent with the rules for the various traits involved. (The first commit here is #108156, adding the codegen test, which I used to make sure this doesn't regress behaviour for primitives.) Zulip conversation about this change: <https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/.60.3E.60.20on.20Tuples/near/328392927>.
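The approach described above can be sketched for a two-element tuple. `pair_gt` is a hypothetical free function written for illustration; the standard library implements this through a macro over all tuple arities.

```rust
use std::cmp::Ordering;

/// Hypothetical sketch of `(A, B)::gt`: compare the first fields once with
/// `partial_cmp`, and only look at the second field when they are equal.
fn pair_gt<A: PartialOrd, B: PartialOrd>(l: &(A, B), r: &(A, B)) -> bool {
    match l.0.partial_cmp(&r.0) {
        Some(Ordering::Equal) => l.1 > r.1,
        Some(Ordering::Greater) => true,
        // Less, or incomparable (None): not greater.
        _ => false,
    }
}
```

With this shape an expensive first-field comparison, such as a `String` comparison, runs only once per call.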
2023-02-21	fix more benchmark test with black_box	yukang	-30/+30

2023-02-17	Add a slightly-contrived tuple comparison benchmark	Scott McMurray	-0/+23

2023-02-14	Shrink size of array benchmarks	kadmin	-5/+5

2023-02-11	Add array::map benchmarks	kadmin	-0/+20

2023-02-03	fix #107590, Fix benchmarks in library/core with black_box	yukang	-32/+44

2023-01-04	Update rand in the stdlib tests, and remove the getrandom feature from it	Thom Chiovoloni	-2/+2

2022-11-09	Rollup merge of #103570 - lukas-code:stabilize-ilog, r=scottmcm	Dylan DPC	-1/+0
	Stabilize integer logarithms Stabilizes feature `int_log`. I've also made the functions const stable, because they don't depend on any unstable const features. `rustc_allow_const_fn_unstable` is just there for `Option::expect`, which could be replaced with a `match` and `panic!`. cc ``@rust-lang/wg-const-eval`` closes https://github.com/rust-lang/rust/issues/70887 (tracking issue) ~~blocked on FCP finishing: https://github.com/rust-lang/rust/issues/70887#issuecomment-1289028216~~ FCP finished: https://github.com/rust-lang/rust/issues/70887#issuecomment-1302121266
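A short usage sketch of the stabilized methods, using the `ilog*` names that this feature ended up with. `decimal_digits` is a hypothetical helper added for illustration.

```rust
/// Hypothetical helper: number of decimal digits needed to print `n`.
fn decimal_digits(n: u32) -> u32 {
    // `ilog10` panics on zero, so handle that case separately.
    if n == 0 {
        1
    } else {
        n.ilog10() + 1
    }
}
```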
2022-11-07	add benchmark for iter::ArrayChunks::fold specialization	The 8472	-2/+22
	This also updates the existing iter::Copied::next_chunk benchmark so that the thing it benches doesn't get masked by the ArrayChunks specialization
2022-10-26	stabilize `int_log`	Lukas Markeffsky	-1/+0

2022-10-17	add a benchmark for slice_iter.copied().array_chunks()	The 8472	-0/+21

2022-08-21	Use internal iteration in `Iterator::{cmp_by, partial_cmp_by, eq_by}`	Tim Vermeulen	-0/+7

2022-08-09	Rename integer log* methods to ilog*	Eric Holk	-3/+3
	This reflects the consensus from the libs team as reported at https://github.com/rust-lang/rust/issues/70887#issuecomment-1209513261 Co-authored-by: Yosh Wuyts <github@yosh.is>
2022-05-31	Add unicode fast path to `is_printable`	Nilstrieb	-0/+11
	Before, it would enter the full expensive check even for normal ascii characters. Now, it skips the check for the ascii characters in `32..127`. This range was checked manually from the current behavior.
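The fast path can be sketched like this. `is_printable_full` is a hypothetical stand-in for the expensive table-driven check (here approximated with `char::is_control`); only the `32..127` range comes from the commit message.

```rust
/// Hypothetical stand-in for the full, expensive printability check.
fn is_printable_full(c: char) -> bool {
    !c.is_control()
}

/// Sketch of the fast path: plain ASCII in `32..127` is printable, so the
/// expensive check is skipped for it.
fn is_printable(c: char) -> bool {
    let x = c as u32;
    if (32..127).contains(&x) {
        return true;
    }
    is_printable_full(c)
}
```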
2022-05-05	Auto merge of #96626 - thomcc:rand-bump, r=m-ou-se	bors	-2/+10
	Avoid using `rand::thread_rng` in the stdlib benchmarks. This is kind of an anti-pattern because it introduces extra nondeterminism for no real reason. In thread_rng's case this comes both from the random seed and also from the reseeding operations it does, which occasionally does syscalls (which adds additional nondeterminism). The impact of this would be pretty small in most cases, but it's a good practice to avoid (particularly because avoiding it was not hard). Anyway, several of our benchmarks already did the right thing here anyway, so the change was pretty easy and mostly just applying it more universally. That said, the stdlib benchmarks aren't particularly stable (nor is our benchmark framework particularly great), so arguably this doesn't matter that much in practice. ~~Anyway, this also bumps the `rand` dev-dependency to 0.8, since it had fallen somewhat out of date.~~ Nevermind, too much of a headache.
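The point about determinism can be illustrated without any external crate. The sketch below uses a small fixed-seed xorshift generator as a stand-in; `XorShift64` and `bench_input` are hypothetical and are not the `rand` API that the benchmarks actually use.

```rust
/// Hypothetical fixed-seed generator (xorshift64), used here only to show
/// that benchmark inputs can be made reproducible.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        assert!(seed != 0, "xorshift state must be non-zero");
        Self(seed)
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical helper: the same seed always produces the same input data,
/// and no syscalls are involved in generating it.
fn bench_input(seed: u64, len: usize) -> Vec<u64> {
    let mut rng = XorShift64::new(seed);
    (0..len).map(|_| rng.next_u64()).collect()
}
```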
2022-05-02	add benchmark	The 8472	-0/+25

2022-05-02	Avoid use of `rand::thread_rng` in stdlib benchmarks	Thom Chiovoloni	-2/+10

2022-04-15	Make some `usize`-typed masks definition agnostic to the size of `usize`	Eduardo Sánchez Muñoz	-1/+1
	Some masks were defined as ```rust const NONASCII_MASK: usize = 0x80808080_80808080u64 as usize; ``` where it was assumed that `usize` is never wider than 64, which is currently true. To make those constants valid in a hypothetical 128-bit target, these constants have been redefined in an `usize`-width-agnostic way ```rust const NONASCII_MASK: usize = usize::from_ne_bytes([0x80; size_of::<usize>()]); ``` There are already some cases where Rust anticipates the possibility of supporting 128-bit targets, such as not implementing `From<usize>` for `u64`.
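A small sketch of how such a width-agnostic mask can be used. The constant definition is the one quoted in the commit message; `contains_nonascii` is a hypothetical helper added for illustration.

```rust
use std::mem::size_of;

/// Every byte of the mask is 0x80, whatever the width of `usize`.
const NONASCII_MASK: usize = usize::from_ne_bytes([0x80; size_of::<usize>()]);

/// Hypothetical helper: true if any byte of `word` has its high bit set,
/// meaning the word contains at least one non-ASCII byte.
fn contains_nonascii(word: usize) -> bool {
    word & NONASCII_MASK != 0
}
```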
2022-03-10	Use implicit capture syntax in format_args	T-O-R-U-S	-2/+2
	This updates the standard library's documentation to use the new syntax. The documentation is worthwhile to update as it should be more idiomatic (particularly for features like this, which are nice for users to get acquainted with). The general codebase is likely more hassle than benefit to update: it'll hurt git blame, and generally updates can be done by folks updating the code if (and when) that makes things more readable with the new format. A few places in the compiler and library code are updated (mostly just due to already having been done when this commit was first authored).
2022-02-21	Stop manually SIMDing in swap_nonoverlapping	Scott McMurray	-4/+39
	Like I previously did for `reverse`, this leaves it to LLVM to pick how to vectorize it, since it can know better the chunk size to use, compared to the "32 bytes always" approach we currently have. It does still need logic to type-erase where appropriate, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`. As a bonus, this also means one no longer gets the spurious `memcpy`(s?) at the end when swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>
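The "leave vectorization to LLVM" idea amounts to writing the swap as a plain element-wise loop. The sketch below works on safe slices; `swap_slices` is a hypothetical helper, whereas the real `swap_nonoverlapping` operates on raw pointers and includes the type-erasure logic mentioned above.

```rust
/// Hypothetical sketch: swap two equally sized slices element by element and
/// let the compiler decide how, and whether, to vectorize the loop.
fn swap_slices<T>(a: &mut [T], b: &mut [T]) {
    assert_eq!(a.len(), b.len(), "slices must have the same length");
    for (x, y) in a.iter_mut().zip(b.iter_mut()) {
        std::mem::swap(x, y);
    }
}
```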
2022-02-05	Respond to review feedback, and improve implementation somewhat	Thom Chiovoloni	-8/+19

2022-02-05	Fix zh::SMALL string in core::str benchmarks	Thom Chiovoloni	-1/+1

2022-02-05	Optimize `core::str::Chars::count`	Thom Chiovoloni	-26/+187

2021-10-12	Auto merge of #88788 - falk-hueffner:speedup-int-log10-branchless, r=joshtriplett	bors	-0/+60
	Speedup int log10 branchless
	This is achieved with a branchless bit-twiddling implementation of the case x < 100_000, and using this as building block.
	Benchmark on an Intel i7-8700K (Coffee Lake):
	```
	name                                   old ns/iter  new ns/iter  diff ns/iter  diff %   speedup
	num::int_log::u8_log10_predictable     165          169          4             2.42%    x 0.98
	num::int_log::u8_log10_random          438          423          -15           -3.42%   x 1.04
	num::int_log::u8_log10_random_small    438          423          -15           -3.42%   x 1.04
	num::int_log::u16_log10_predictable    633          417          -216          -34.12%  x 1.52
	num::int_log::u16_log10_random         908          471          -437          -48.13%  x 1.93
	num::int_log::u16_log10_random_small   945          471          -474          -50.16%  x 2.01
	num::int_log::u32_log10_predictable    1,496        1,340        -156          -10.43%  x 1.12
	num::int_log::u32_log10_random         1,076        873          -203          -18.87%  x 1.23
	num::int_log::u32_log10_random_small   1,145        874          -271          -23.67%  x 1.31
	num::int_log::u64_log10_predictable    4,005        3,171        -834          -20.82%  x 1.26
	num::int_log::u64_log10_random         1,247        1,021        -226          -18.12%  x 1.22
	num::int_log::u64_log10_random_small   1,265        921          -344          -27.19%  x 1.37
	num::int_log::u128_log10_predictable   39,667       39,579       -88           -0.22%   x 1.00
	num::int_log::u128_log10_random        6,456        6,696        240           3.72%    x 0.96
	num::int_log::u128_log10_random_small  4,108        3,903        -205          -4.99%   x 1.05
	```
	Benchmark on an M1 Mac Mini:
	```
	name                                   old ns/iter  new ns/iter  diff ns/iter  diff %   speedup
	num::int_log::u8_log10_predictable     143          130          -13           -9.09%   x 1.10
	num::int_log::u8_log10_random          375          325          -50           -13.33%  x 1.15
	num::int_log::u8_log10_random_small    376          325          -51           -13.56%  x 1.16
	num::int_log::u16_log10_predictable    500          322          -178          -35.60%  x 1.55
	num::int_log::u16_log10_random         794          405          -389          -48.99%  x 1.96
	num::int_log::u16_log10_random_small   1,035        405          -630          -60.87%  x 2.56
	num::int_log::u32_log10_predictable    1,144        894          -250          -21.85%  x 1.28
	num::int_log::u32_log10_random         832          786          -46           -5.53%   x 1.06
	num::int_log::u32_log10_random_small   832          787          -45           -5.41%   x 1.06
	num::int_log::u64_log10_predictable    2,681        2,057        -624          -23.27%  x 1.30
	num::int_log::u64_log10_random         1,015        806          -209          -20.59%  x 1.26
	num::int_log::u64_log10_random_small   1,004        795          -209          -20.82%  x 1.26
	num::int_log::u128_log10_predictable   56,825       56,526       -299          -0.53%   x 1.01
	num::int_log::u128_log10_random        9,056        8,861        -195          -2.15%   x 1.02
	num::int_log::u128_log10_random_small  1,528        1,527        -1            -0.07%   x 1.00
	```
	The 128 bit case remains ridiculously slow because llvm fails to optimize division by a constant 128-bit value to multiplications. This could be worked around but it seems preferable to fix this in llvm.
	From u32 up, table lookup (like suggested [here](https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813)) is still faster, but requires a hardware `leading_zeros` to be viable, and might clog up the cache.
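One way to write a branchless log10 for x < 100_000 is sketched below. The constants and the names `less_than_5` and `log10_u32` are an illustrative reconstruction rather than text quoted from the PR; the sketch relies on exhaustive comparison with `u32::ilog10` to justify it.

```rust
/// Branchless floor(log10(val)) for 0 <= val < 100_000 (yields 0 for 0).
/// Adding each constant makes bit 17 and above flip exactly when `val`
/// crosses a power of ten; AND and XOR then combine those bits into 0..=4.
const fn less_than_5(val: u32) -> u32 {
    const C1: u32 = (0b011 << 17) - 10;
    const C2: u32 = (0b100 << 17) - 100;
    const C3: u32 = (0b111 << 17) - 1000;
    const C4: u32 = (0b100 << 17) - 10000;
    (((val + C1) & (val + C2)) ^ ((val + C3) & (val + C4))) >> 17
}

/// Uses the small case as a building block for the full u32 range.
/// After dividing by 100_000 the remainder of the work is below 100_000.
const fn log10_u32(mut val: u32) -> u32 {
    let mut log = 0;
    if val >= 100_000 {
        val /= 100_000;
        log += 5;
    }
    log + less_than_5(val)
}
```

The single comparison in `log10_u32` is the only data-dependent branch left, and compilers can typically turn it into a conditional move.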
2021-09-11	benchmark for str.chars().count()	The8472	-0/+34

2021-09-09	Cosmetic fixes.	Falk Hüffner	-2/+0

2021-09-06	Add benchmark for integer log10.	Falk Hüffner	-0/+62

2021-06-23	Use HTTPS links where possible	Smitty	-1/+1

2021-04-25	move core::hint::black_box under its own feature gate	Ralf Jung	-4/+4

2021-03-18	add bench	The8472	-0/+13

2021-03-17	Auto merge of #81358 - mcastorina:to-upper-lower-speed, r=joshtriplett	bors	-0/+30
	Add a check for ASCII characters in to_upper and to_lower This extra check has better performance. See discussion here: https://internals.rust-lang.org/t/to-upper-speed/13896 Thanks to `@gilescope` for helping discover and test this.
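The extra check can be sketched as an ASCII fast path in front of the general Unicode conversion. `to_upper_sketch` is a hypothetical function; it simplifies by returning only the first character of the Unicode mapping, whereas `char::to_uppercase` can yield several.

```rust
/// Hypothetical sketch: handle ASCII with a cheap bit operation and fall
/// back to the table-driven Unicode conversion for everything else.
fn to_upper_sketch(c: char) -> char {
    if c.is_ascii() {
        c.to_ascii_uppercase()
    } else {
        // Simplification: keep only the first character of the mapping.
        c.to_uppercase().next().unwrap_or(c)
    }
}
```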
2021-03-05	Auto merge of #74024 - Folyd:master, r=m-ou-se	bors	-6/+38
	Improve slice.binary_search_by()'s best-case performance to O(1)
	This PR aims to improve the best-case performance of [slice.binary_search_by()](https://doc.rust-lang.org/std/primitive.slice.html#method.binary_search_by) to O(1).
	# Noticed
	I don't know why the docs of `binary_search_by` say `"If there are multiple matches, then any one of the matches could be returned."`, because the implementation does not behave that way: it actually returns the last match if multiple matches are found. That leaves two options:
	## If returning the last one is the correct or desired result
	Then I can rectify the docs and revert my changes.
	## If the docs are the correct or desired result
	Then my changes can be merged after a full review. However, if my PR gets merged, another issue arises: this could be a breaking change, since when multiple matches are found the returned index is no longer the last one but could be any of them. For example:
	```rust
	let mut s = vec![0, 1, 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
	let num = 1;
	let idx = s.binary_search(&num);
	s.insert(idx, 2);
	// Old implementations
	assert_eq!(s, [0, 1, 1, 1, 1, 2, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
	// New implementations
	assert_eq!(s, [0, 1, 1, 1, 2, 1, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
	```
	# Benchmarking
	Old implementations
	```sh
	$ ./x.py bench --stage 1 library/libcore
	test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 4)
	test slice::binary_search_l1_with_dups ... bench: 59 ns/iter (+/- 3)
	test slice::binary_search_l2 ... bench: 76 ns/iter (+/- 5)
	test slice::binary_search_l2_with_dups ... bench: 77 ns/iter (+/- 17)
	test slice::binary_search_l3 ... bench: 183 ns/iter (+/- 23)
	test slice::binary_search_l3_with_dups ... bench: 185 ns/iter (+/- 19)
	```
	New implementations (1)
	Implemented by this PR.
	```rust
	if cmp == Equal {
	    return Ok(mid);
	} else if cmp == Less {
	    base = mid
	}
	```
	```sh
	$ ./x.py bench --stage 1 library/libcore
	test slice::binary_search_l1 ... bench: 58 ns/iter (+/- 2)
	test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 4)
	test slice::binary_search_l2 ... bench: 76 ns/iter (+/- 3)
	test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 6)
	test slice::binary_search_l3 ... bench: 200 ns/iter (+/- 30)
	test slice::binary_search_l3_with_dups ... bench: 157 ns/iter (+/- 6)
	$ ./x.py bench --stage 1 library/libcore
	test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 8)
	test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 2)
	test slice::binary_search_l2 ... bench: 77 ns/iter (+/- 2)
	test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 2)
	test slice::binary_search_l3 ... bench: 198 ns/iter (+/- 21)
	test slice::binary_search_l3_with_dups ... bench: 158 ns/iter (+/- 11)
	```
	New implementations (2)
	Suggested by `@nbdd0121` in [comment](https://github.com/rust-lang/rust/pull/74024#issuecomment-665430239).
	```rust
	base = if cmp == Greater { base } else { mid };
	if cmp == Equal { break }
	```
	```sh
	$ ./x.py bench --stage 1 library/libcore
	test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 7)
	test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 5)
	test slice::binary_search_l2 ... bench: 75 ns/iter (+/- 3)
	test slice::binary_search_l2_with_dups ... bench: 56 ns/iter (+/- 3)
	test slice::binary_search_l3 ... bench: 195 ns/iter (+/- 15)
	test slice::binary_search_l3_with_dups ... bench: 151 ns/iter (+/- 7)
	$ ./x.py bench --stage 1 library/libcore
	test slice::binary_search_l1 ... bench: 57 ns/iter (+/- 2)
	test slice::binary_search_l1_with_dups ... bench: 38 ns/iter (+/- 2)
	test slice::binary_search_l2 ... bench: 77 ns/iter (+/- 11)
	test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 4)
	test slice::binary_search_l3 ... bench: 194 ns/iter (+/- 15)
	test slice::binary_search_l3_with_dups ... bench: 151 ns/iter (+/- 18)
	```
	I ran some benchmarks against the two implementations. The new implementation shows a large improvement in the duplicates cases, while in the `binary_search_l3` case it is slightly slower than the old one.
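The early-return idea can be sketched as a self-contained binary search that stops as soon as it probes a matching element. `binary_search_early` is a hypothetical function written for illustration and is not the code merged in this PR.

```rust
use std::cmp::Ordering;

/// Hypothetical sketch: return as soon as a probe compares equal, which gives
/// an O(1) best case. With duplicates, any matching index may be returned.
fn binary_search_early<T: Ord>(s: &[T], target: &T) -> Result<usize, usize> {
    let mut left = 0;
    let mut right = s.len();
    while left < right {
        let mid = left + (right - left) / 2;
        match s[mid].cmp(target) {
            Ordering::Equal => return Ok(mid),
            Ordering::Less => left = mid + 1,
            Ordering::Greater => right = mid,
        }
    }
    // Not found: `left` is the position where `target` could be inserted.
    Err(left)
}
```

Because a caller can no longer rely on which duplicate is returned, code that needs a specific one should use a method such as `partition_point` instead.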
2021-02-26	Add two more benchmarks for strictly ASCII and non ASCII cases	Miccah Castorina	-2/+22

2021-02-26	Add to_lowercase and to_uppercase char benchmarks	Miccah Castorina	-0/+10

2021-02-08	Unify way to flip 6th bit. (Same assembly generated)	Giles Cope	-2/+4

2021-02-06	Slight perf improvement on char::to_ascii_lowercase	Giles Cope	-0/+10

2021-01-30	Improve slice.binary_search_by()'s best-case performance to O(1)	Folyd	-6/+38

2021-01-08	Add more benchmarks	kadmin	-1/+23

2020-10-18	Remove redundant 'static from library crates	est31	-3/+3

2020-09-28	Use more efficient scheme for display u128/i128	kadmin	-0/+29
	Add zero padding. Add benchmarks for fmt u128. This tests both the maximum amount of work (all characters used) and the least amount of work (1 character used).
2020-09-02	flt2dec: properly handle uninitialized memory	Ralf Jung	-32/+66

2020-07-27	mv std libs to library/	mark	-0/+1552