| Age | Commit message (Collapse) | Author | Lines |
|
Signed-off-by: tison <wander4096@gmail.com>
|
|
This is the only trait specializable outside of the standard library.
Before stabilizing specialization we will probably want to remove
support for this. It was originally made specializable to allow a more
efficient ToString in libproc_macro back when this way the only way to
get any data out of a TokenStream. We now support getting individual
tokens, so proc macros no longer need to call it as often.
|
|
|
|
This allows to build custom `std::Formatter`s at runtime.
Also added some related enums and two related methods on `std::Formatter`.
|
|
|
|
optimize str.replace
Adds a fast path for str.replace for the ascii to ascii case. This allows for autovectorizing the code. Also should this instead be done with specialization? This way we could remove one branch. I think it is the kind of branch that is easy to predict though.
Benchmark for the fast path (replace all "a" with "b" in the rust wikipedia article, using criterion) :
| N | Speedup | Time New (ns) | Time Old (ns) |
|----------|---------|---------------|---------------|
| 2 | 2.03 | 13.567 | 27.576 |
| 8 | 1.73 | 17.478 | 30.259 |
| 11 | 2.46 | 18.296 | 45.055 |
| 16 | 2.71 | 17.181 | 46.526 |
| 37 | 4.43 | 18.526 | 81.997 |
| 64 | 8.54 | 18.670 | 159.470 |
| 200 | 9.82 | 29.634 | 291.010 |
| 2000 | 24.34 | 81.114 | 1974.300 |
| 20000 | 30.61 | 598.520 | 18318.000 |
| 1000000 | 29.31 | 33458.000 | 980540.000 |
|
|
liballoc: introduce String, Vec const-slicing
This change `const`-qualifies many methods on `Vec` and `String`, notably `as_slice`, `as_str`, `len`. These changes are made behind the unstable feature flag `const_vec_string_slice`.
## Motivation
This is to support simultaneous variance over ownership and constness. I have an enum type that may contain either `String` or `&str`, and I want to produce a `&str` from it in a possibly-`const` context.
```rust
enum StrOrString<'s> {
Str(&'s str),
String(String),
}
impl<'s> StrOrString<'s> {
const fn as_str(&self) -> &str {
match self {
// In a const-context, I really only expect to see this variant, but I can't switch the implementation
// in some mode like #[cfg(const)] -- there has to be a single body
Self::Str(s) => s,
// so this is a problem, since it's not `const`
Self::String(s) => s.as_str(),
}
}
}
```
Currently `String` and `Vec` don't support this, but can without functional changes. Similar logic applies for `len`, `capacity`, `is_empty`.
## Changes
The essential thing enabling this change is that `Unique::as_ptr` is `const`. This lets us convert `RawVec::ptr` -> `Vec::as_ptr` -> `Vec::as_slice` -> `String::as_str`.
I had to move the `Deref` implementations into `as_{str,slice}` because `Deref` isn't `#[const_trait]`, but I would expect this change to be invisible up to inlining. I moved the `DerefMut` implementations as well for uniformity.
|
|
This change `const`-qualifies many methods on Vec and String, notably
`as_slice`, `as_str`, `len`. These changes are made behind the unstable
feature flag `const_vec_string_slice` with the following tracking issue:
https://github.com/rust-lang/rust/issues/129041
|
|
|
|
|
|
Avoid re-validating UTF-8 in `FromUtf8Error::into_utf8_lossy`
Part of the unstable feature `string_from_utf8_lossy_owned` - #129436
Refactor `FromUtf8Error::into_utf8_lossy` to copy valid UTF-8 bytes into the buffer, avoiding double validation of bytes.
Add tests that mirror the `String::from_utf8_lossy` tests.
|
|
Refactor `into_utf8_lossy` to copy valid UTF-8 bytes into the buffer,
avoiding double validation of bytes.
Add tests that mirror the `String::from_utf8_lossy` tests
|
|
instead of path
|
|
|
|
items instead of paths
|
|
|
|
Implement feature `string_from_utf8_lossy_owned` for lossy conversion from `Vec<u8>` to `String` methods
Accepted ACP: https://github.com/rust-lang/libs-team/issues/116
Tracking issue: #129436
Implement feature for lossily converting from `Vec<u8>` to `String`
- Add `String::from_utf8_lossy_owned`
- Add `FromUtf8Error::into_utf8_lossy`
---
Related to #64727, but unsure whether to mark it "fixed" by this PR.
That issue partly asks for in-place replacement of the original allocation. We fulfill the other half of that request with these functions.
closes #64727
|
|
|
|
|
|
Implement feature for lossily converting from `Vec<u8>` to `String`
- Add `String::from_utf8_lossy_owned`
- Add `FromUtf8Error::into_utf8_lossy`
|
|
|
|
Fixes #128690
|
|
Add `#[must_use]` to some `into_raw*` functions.
cc #121287
r? ``@cuviper``
Adds `#[must_use = "losing the pointer will leak memory"]`[^1] to `Box::into_raw(_with_allocator)`, `Vec::into_raw_parts(_with_alloc)`, `String::into_raw_parts`[^2], and `rc::{Rc, Weak}::into_raw_with_allocator` (Rc's normal `into_raw` and all of `Arc`'s `into_raw*`s are already `must_use`).
Adds `#[must_use = "losing the raw <resource name may leak resources"]` to `IntoRawFd::into_raw_fd`, `IntoRawSocket::into_raw_socket`, and `IntoRawHandle::into_raw_handle`.
[^1]: "*will* leak memory" may be too-strong wording (since `Box`/`Vec`/`String`/`rc::Weak` might not have a backing allocation), but I left it as-is for simplicity and consistency.
[^2]: `String::into_raw_parts`'s `must_use` message is changed from the previous (possibly misleading) "`self` will be dropped if the result is not used".
|
|
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
|
|
Use a GAT for `Searcher` associated type because this trait is always
implemented for every lifetime anyway.
|
|
|
|
If/when `-Zmiri-disable-leak-check` is able to be used at test-granularity, it should applied to these tests instead of unleaking.
|
|
alloc: implement FromIterator for Box<str>
`Box<[T]>` implements `FromIterator<T>` using `Vec<T>` + `into_boxed_slice()`.
Add analogous `FromIterator` implementations for `Box<str>`
matching the current implementations for `String`.
Remove the `Global` allocator requirement for `FromIterator<Box<str>>` too.
ACP: https://github.com/rust-lang/libs-team/issues/196
|
|
|
|
Box<[T]> implements FromIterator<T> using Vec<T> + into_boxed_slice().
Add analogous FromIterator implementations for Box<str>
matching the current implementations for String.
Remove the Global allocator requirement for FromIterator<Box<str>> too.
|
|
|
|
"is greater or equal to". Beside common sense.
|
|
Stabilize `Utf8Chunks`
Pending FCP in https://github.com/rust-lang/rust/issues/99543.
This PR includes the proposed modification in https://github.com/rust-lang/libs-team/issues/190 as agreed in https://github.com/rust-lang/rust/issues/99543#issuecomment-2050406568.
|
|
|
|
Document overrides of `clone_from()` in core/std
As mentioned in https://github.com/rust-lang/rust/pull/96979#discussion_r1379502413
Specifically, when an override doesn't just forward to an inner type, document the behavior and that it's preferred over simply assigning a clone of source. Also, change instances where the second parameter is "other" to "source".
I reused some of the wording over and over for similar impls, but I'm not sure that the wording is actually *good*. Would appreciate feedback about that.
Also, now some of these seem to provide pretty specific guarantees about behavior (e.g. will reuse the exact same allocation iff the len is the same), but I was basing it off of the docs for [`Box::clone_from`](https://doc.rust-lang.org/1.75.0/std/boxed/struct.Box.html#method.clone_from-1) - I'm not sure if providing those strong guarantees is actually good or not.
|
|
Just a tiny addition.
Helps with #123263.
|
|
Require `DerefMut` and `DerefPure` on `deref!()` patterns when appropriate
Waiting on the deref pattern syntax pr to merge
r? nadrieril
|
|
|
|
|
|
|
|
Specifically, when an override doesn't just forward to an inner type,
document the behavior and that it's preferred over simply assigning
a clone of source. Also, change instances where the second parameter is
"other" to "source".
|
|
#91913
|
|
Have `String` use `SliceIndex` impls from `str`
This PR simplifies the implementation of `Index` and `IndexMut` on `String`, and in the process enables indexing `String` by any user types that implement `SliceIndex<str>`.
Similar to #47832
r? libs
Not sure if this warrants a crater run.
|
|
|
|
|
|
order
|
|
|
|
Help with common API confusion, like asking for `push` when the data structure really has `append`.
```
error[E0599]: no method named `size` found for struct `Vec<{integer}>` in the current scope
--> $DIR/rustc_confusables_std_cases.rs:17:7
|
LL | x.size();
| ^^^^
|
help: you might have meant to use `len`
|
LL | x.len();
| ~~~
help: there is a method with a similar name
|
LL | x.resize();
| ~~~~~~
```
#59450
|
|
Add explicit-endian String::from_utf16 variants
This adds the following APIs under `feature(str_from_utf16_endian)`:
```rust
impl String {
pub fn from_utf16le(v: &[u8]) -> Result<String, FromUtf16Error>;
pub fn from_utf16le_lossy(v: &[u8]) -> String;
pub fn from_utf16be(v: &[u8]) -> Result<String, FromUtf16Error>;
pub fn from_utf16be_lossy(v: &[u8]) -> String;
}
```
These are versions of `String::from_utf16` that explicitly take [UTF-16LE and UTF-16BE](https://unicode.org/faq/utf_bom.html#gen7). Notably, we can do better than just the obvious `decode_utf16(v.array_chunks::<2>().copied().map(u16::from_le_bytes)).collect()` in that:
- We handle the case where the byte slice is not an even number of bytes, and
- In the case that the UTF-16 is native endian and the slice is aligned, we can forward to `String::from_utf16`.
If the Unicode Consortium actively defines how to handle character replacement when decoding a UTF-16 bytestream with a trailing odd byte, I was unable to find reference. However, the behavior implemented here is fairly self-evidently correct: replace the single errant byte with the replacement character.
|
|
|