| Age | Commit message (Collapse) | Author | Lines |
|
make is_power_of_two a const function
This makes `is_power_of_two` a const function by using `&` instead of short-circuiting `&&`; Rust supports bitwise `&` for `bool` and short-circuiting is not required in the existing expression.
I don't think this needs a const-hack label as I don't find the changed code less readable, if anything I prefer that it is clearer that short circuiting is not used.
@oli-obk
|
|
|
|
Fix left/right shift typo in wrapping rotate docs
This makes the note similar to the one found on rotate functions for primitive types like i32/u32.
|
|
This makes the note similar to the one found on rotate functions for
primitive types like i32/u32.
|
|
|
|
Reciprocal throughput is improved from 2.3 to 1.7.
https://godbolt.org/z/ROMiX6
|
|
|
|
Inline `{min,max}_value` even in debug builds
I think it is worth to inline `{min,max}_value` even in debug builds.
See this godbolt link: https://godbolt.org/z/-COkVS
|
|
|
|
|
|
Master is now 1.40
r? @pietroalbini
|
|
use `sign` variable in abs and wrapping_abs methods
This also makes the code easier to understand by hinting at the significance of `self >> ($BITS - 1)`.
Also, now `overflowing_abs` simply uses `wrapping_abs`, which is clearer and avoids a potential performance regression in the LLVM IR.
This PR follows from the discussion from #63786.
r? @eddyb
cc @nikic
|
|
|
|
Instead let's do this via `RUSTFLAGS` in `builder.rs`. Currently
requires a submodule update of `stdarch` to fix a problem with previous
compilers.
|
|
|
|
This also makes the code easier to understand by hinting at the
significance of `self >> ($BITS - 1)` and by including an explanation
in the comments.
Also, now overflowing_abs simply uses wrapping_abs, which is clearer
and avoids a potential performance regression in the LLVM IR.
|
|
Make `abs`, `wrapping_abs`, `overflowing_abs` const functions
This makes `abs`, `wrapping_abs` and `overflowing_abs` const functions like #58044 makes `wrapping_neg` and `overflowing_neg` const functions.
`abs` is made const by returning `(self ^ -1) - -1` = `!self + 1` = `-self` for negative numbers and `(self ^ 0) - 0` = `self` for non-negative numbers. The subexpression `self >> ($BITS - 1)` evaluates to `-1` for negative numbers and `0` otherwise. The subtraction overflows when `self` is `min_value()`, as we would be subtracting `max_value() - -1`; this is when `abs` should overflow.
`wrapping_abs` and `overflowing_abs` make use of `wrapping_sub` and `overflowing_sub` instead of the subtraction operator.
|
|
Test that Wrapping arithmetic ops are implemented for all int types
Closes #49660
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Stablize Euclidean Modulo (feature euclidean_division)
Closes #49048
|
|
|
|
Use the same API as for integers.
Fixes #57492.
|
|
|
|
|
|
|
|
This uses a well-known branchless implementation of `signum`.
Its `const`-ness is unstable and requires `#![feature(const_int_sign)]`.
|
|
|
|
Also removes stage1, stage2 cfgs being passed to rustc to ensure that
stage1 and stage2 are only differentiated as a group (i.e., only through
not bootstrap).
|
|
|
|
Annotate each `reverse_bits` with `#[must_use]`
Because the name sounds like an in-place mutation like `[T]::reverse(&mut self)`, it may be confused for one.
This change was requested at https://github.com/rust-lang/rust/issues/48763#issuecomment-493743741.
|
|
Because the name sounds like an in-place mutation like
`[T]::reverse(&mut self)`, it may be confused for one.
|
|
is FFI safe
This allows types like Option<NonZeroU8> to be used in FFI without triggering the improper_ctypes lint. This works by changing the is_repr_nullable_ptr function to consider an enum E to be FFI-safe if:
- E has no explicit #[repr(...)].
- It only has two variants.
- One of those variants is empty (meaning it has no fields).
- The other variant has only one field.
- That field is one of the following:
- &T
- &mut T
- extern "C" fn
- core::num::NonZero*
- core::ptr::NonNull<T>
- #[repr(transparent)] struct wrapper around one of the types in this list.
- The size of E and its field are both known and are both the same size (implying E is participating in the nonnull optimization).
|
|
... on different platforms.
Official rustdoc of
[`usize::to_le_bytes`](https://doc.rust-lang.org/std/primitive.usize.html#method.to_le_bytes)
displays signature
```
pub fn to_ne_bytes(self) -> [u8; 8]
```
which might be misleading: this function returns 4 bytes on 32-bit
systems.
|
|
Add a tidy check for files with over 3,000 lines
Files with a large number of lines can cause issues in GitHub (e.g. https://github.com/rust-lang/rust/issues/60015) and also tend to be indicative of opportunities to refactor into less monolithic structures.
This adds a new check to tidy to warn against files that have more than 3,000 lines, as suggested in https://github.com/rust-lang/rust/issues/60015#issuecomment-483868594. (This number was chosen fairly arbitrarily as a reasonable indicator of size.) This check can be ignored with `// ignore-tidy-filelength`.
Existing files with greater than 3,000 lines currently ignore the check, but this helps us spot when files are getting too large. (We might try to split up all files larger than this in the future, as in https://github.com/rust-lang/rust/issues/60015).
|
|
|
|
Implement saturating_abs() and saturating_neg() functions for signed integer types
Similar to wrapping_abs() / wrapping_neg() functions but saturating at the numeric bounds instead of wrapping around. Complements the existing set of functions with saturation mechanics.
cc #59983
|
|
Similar to wrapping_abs() / wrapping_neg() functions but saturating at
the numeric bounds instead of wrapping around. Complements the existing
set of functions with saturation mechanics.
|
|
Warn on unused results for operation methods on nums
From a suggestion by @llogiq
Adds a `#[must_use]` attribute to operation methods on integers that take self by value as the first operand and another value as the second. It makes it clear that these methods return the result of the operation instead of mutating `self`, which is the source of a rather embarrassing bug I had in a codebase of mine recently...
As an example:
```rust
struct Int {
value: i64,
}
impl Int {
fn add(&mut self, other: i64) {
self.value.wrapping_add(other);
}
}
```
Will produce a warning like:
```
warning: unused return value of `core::num::<impl i64>::wrapping_add` that must be used
--> src/main.rs:7:7
|
7 | self.value.wrapping_add(other);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: #[warn(unused_must_use)] on by default
= note: this returns the result of the operation, without modifying the original
```
If this is something we're on board with, we could do something similar for `f32` and `f64` too. There are probably other methods on integers that make sense.
|
|
|
|
|
|
|
|
|
|
|
|
Add FromStr impl for NonZero types
This is a WIP implementation because I do have some questions regarding the solution.
Somebody should ping the lang team on this I guess.
Please see the annotations on the code for more details.
Closes #58604
|
|
Make ASCII case conversions more than 4× faster
Reformatted output of `./x.py bench src/libcore --test-args ascii` below. The `libcore` benchmark calls `[u8]::make_ascii_lowercase`. `lookup` has code (effectively) identical to that before this PR, and ~~`branchless`~~ `mask_shifted_bool_match_range` after this PR.
~~See [code comments](https://github.com/rust-lang/rust/pull/59283/commits/ce933f77c865a15670855ac5941fe200752b739f#diff-01076f91a26400b2db49663d787c2576R3796) in `u8::to_ascii_uppercase` in `src/libcore/num/mod.rs` for an explanation of the branchless algorithm.~~
**Update:** the algorithm was simplified while keeping the performance. See `branchless` v.s. `mask_shifted_bool_match_range` benchmarks.
Credits to @raphlinus for the idea in https://twitter.com/raphlinus/status/1107654782544736261, which extends this algorithm to “fake SIMD” on `u32` to convert four bytes at a time. The `fake_simd_u32` benchmarks implements this with [`let (before, aligned, after) = bytes.align_to_mut::<u32>()`](https://doc.rust-lang.org/std/primitive.slice.html#method.align_to_mut). Note however that this is buggy when addition carries/overflows into the next byte (which does not happen if the input is known to be ASCII).
This could be fixed (to optimize `[u8]::make_ascii_lowercase` and `[u8]::make_ascii_uppercase` in `src/libcore/slice/mod.rs`) either with some more bitwise trickery that I didn’t quite figure out, or by using “real” SIMD intrinsics for byte-wise addition. I did not pursue this however because the current (incorrect) fake SIMD algorithm is only marginally faster than the one-byte-at-a-time branchless algorithm. This is because LLVM auto-vectorizes the latter, as can be seen on https://rust.godbolt.org/z/anKtbR.
Benchmark results on Linux x64 with Intel i7-7700K: (updated from https://github.com/rust-lang/rust/pull/59283#issuecomment-474146863)
```rust
6830 bytes string:
alloc_only ... bench: 112 ns/iter (+/- 0) = 62410 MB/s
black_box_read_each_byte ... bench: 1,733 ns/iter (+/- 8) = 4033 MB/s
lookup_table ... bench: 1,766 ns/iter (+/- 11) = 3958 MB/s
branch_and_subtract ... bench: 417 ns/iter (+/- 1) = 16762 MB/s
branch_and_mask ... bench: 401 ns/iter (+/- 1) = 17431 MB/s
branchless ... bench: 365 ns/iter (+/- 0) = 19150 MB/s
libcore ... bench: 367 ns/iter (+/- 1) = 19046 MB/s
fake_simd_u32 ... bench: 361 ns/iter (+/- 2) = 19362 MB/s
fake_simd_u64 ... bench: 361 ns/iter (+/- 1) = 19362 MB/s
mask_mult_bool_branchy_lookup_table ... bench: 6,309 ns/iter (+/- 19) = 1107 MB/s
mask_mult_bool_lookup_table ... bench: 4,183 ns/iter (+/- 29) = 1671 MB/s
mask_mult_bool_match_range ... bench: 339 ns/iter (+/- 0) = 20619 MB/s
mask_shifted_bool_match_range ... bench: 339 ns/iter (+/- 1) = 20619 MB/s
32 bytes string:
alloc_only ... bench: 15 ns/iter (+/- 0) = 2133 MB/s
black_box_read_each_byte ... bench: 29 ns/iter (+/- 0) = 1103 MB/s
lookup_table ... bench: 24 ns/iter (+/- 4) = 1333 MB/s
branch_and_subtract ... bench: 16 ns/iter (+/- 0) = 2000 MB/s
branch_and_mask ... bench: 16 ns/iter (+/- 0) = 2000 MB/s
branchless ... bench: 16 ns/iter (+/- 0) = 2000 MB/s
libcore ... bench: 15 ns/iter (+/- 0) = 2133 MB/s
fake_simd_u32 ... bench: 17 ns/iter (+/- 0) = 1882 MB/s
fake_simd_u64 ... bench: 16 ns/iter (+/- 0) = 2000 MB/s
mask_mult_bool_branchy_lookup_table ... bench: 42 ns/iter (+/- 0) = 761 MB/s
mask_mult_bool_lookup_table ... bench: 35 ns/iter (+/- 0) = 914 MB/s
mask_mult_bool_match_range ... bench: 16 ns/iter (+/- 0) = 2000 MB/s
mask_shifted_bool_match_range ... bench: 16 ns/iter (+/- 0) = 2000 MB/s
7 bytes string:
alloc_only ... bench: 14 ns/iter (+/- 0) = 500 MB/s
black_box_read_each_byte ... bench: 22 ns/iter (+/- 0) = 318 MB/s
lookup_table ... bench: 16 ns/iter (+/- 0) = 437 MB/s
branch_and_subtract ... bench: 16 ns/iter (+/- 0) = 437 MB/s
branch_and_mask ... bench: 16 ns/iter (+/- 0) = 437 MB/s
branchless ... bench: 19 ns/iter (+/- 0) = 368 MB/s
libcore ... bench: 20 ns/iter (+/- 0) = 350 MB/s
fake_simd_u32 ... bench: 18 ns/iter (+/- 0) = 388 MB/s
fake_simd_u64 ... bench: 21 ns/iter (+/- 0) = 333 MB/s
mask_mult_bool_branchy_lookup_table ... bench: 20 ns/iter (+/- 0) = 350 MB/s
mask_mult_bool_lookup_table ... bench: 19 ns/iter (+/- 0) = 368 MB/s
mask_mult_bool_match_range ... bench: 19 ns/iter (+/- 0) = 368 MB/s
mask_shifted_bool_match_range ... bench: 19 ns/iter (+/- 0) = 368 MB/s
```
|