| author | Ali MJ Al-Nasrawy <alimjalnasrawy@gmail.com> | 2023-10-11 03:53:16 +0300 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-10-11 03:53:16 +0300 |
| commit | 38654ad74198757890bd4679f48b2318b0e84194 (patch) | |
| tree | 3e17f36b1be26bf18d1cc674a06e119589485696 /compiler/rustc_mir_transform/src/coverage/mod.rs | |
| parent | d627cf07ce46d230a93732a4714d16f00df9466b (diff) | |
| parent | 5facc32e22e8843a8c276305fff4ec84d718e1c0 (diff) | |
Rollup merge of #95967 - CAD97:from-utf16, r=dtolnay
Add explicit-endian String::from_utf16 variants
This adds the following APIs under `feature(str_from_utf16_endian)`:
```rust
impl String {
    pub fn from_utf16le(v: &[u8]) -> Result<String, FromUtf16Error>;
    pub fn from_utf16le_lossy(v: &[u8]) -> String;
    pub fn from_utf16be(v: &[u8]) -> Result<String, FromUtf16Error>;
    pub fn from_utf16be_lossy(v: &[u8]) -> String;
}
```
These are versions of `String::from_utf16` that explicitly take [UTF-16LE and UTF-16BE](https://unicode.org/faq/utf_bom.html#gen7). Notably, we can do better than just the obvious `decode_utf16(v.array_chunks::<2>().copied().map(u16::from_le_bytes)).collect()` in that:
- We handle the case where the byte slice is not an even number of bytes, and
- In the case that the UTF-16 is native endian and the slice is aligned, we can forward to `String::from_utf16`.
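For comparison, the "obvious" approach with the odd-length check added can be sketched on stable Rust roughly as follows. This is an illustration only, not the code from this PR: `utf16le_strict` is a hypothetical name, it returns `Option<String>` because `FromUtf16Error` cannot be constructed outside the standard library, it uses `chunks_exact` in place of the unstable `array_chunks`, and it leaves out the native-endian fast path.

```rust
/// Hypothetical stand-in for a strict UTF-16LE decoder (not the std implementation).
/// Returns `None` for odd-length input or for unpaired surrogates.
fn utf16le_strict(v: &[u8]) -> Option<String> {
    // An odd number of bytes cannot be a whole sequence of 16-bit code units.
    if v.len() % 2 != 0 {
        return None;
    }
    // Assemble little-endian code units two bytes at a time.
    let units = v.chunks_exact(2).map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
    // Collecting into `Result<String, _>` stops at the first invalid surrogate.
    char::decode_utf16(units).collect::<Result<String, _>>().ok()
}
```

With this sketch, the bytes `[0x68, 0x00, 0x69, 0x00]` decode to `"hi"`, while dropping the final byte makes the input odd-length and yields `None`.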
If the Unicode Consortium actively defines how to handle character replacement when decoding a UTF-16 byte stream with a trailing odd byte, I was unable to find a reference. However, the behavior implemented here is fairly self-evidently correct: replace the single errant byte with the replacement character.
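The trailing-byte handling described above can be sketched on stable Rust along these lines. This is a minimal illustration under stated assumptions, not the code from this PR: `utf16le_lossy` is a hypothetical name, and the sketch assumes that each unpaired surrogate and a single leftover byte each map to one U+FFFD.

```rust
/// Hypothetical stand-in for a lossy UTF-16LE decoder (not the std implementation).
fn utf16le_lossy(v: &[u8]) -> String {
    let chunks = v.chunks_exact(2);
    // `remainder()` holds the 0 or 1 bytes that did not fit a full code unit.
    let has_trailing_byte = !chunks.remainder().is_empty();
    let units = chunks.map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
    // Unpaired surrogates become U+FFFD instead of aborting the decode.
    let mut out: String = char::decode_utf16(units)
        .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect();
    // Assumption in this sketch: the single errant byte becomes one U+FFFD.
    if has_trailing_byte {
        out.push(char::REPLACEMENT_CHARACTER);
    }
    out
}
```

Under these assumptions, the three bytes `[0x68, 0x00, 0x69]` decode to `"h\u{FFFD}"`: one complete code unit followed by one replacement character for the stray byte.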