Rollup merge of #94713 - clarfonthey:is_char_surrogate, r=scottmcm - rust

author	Dylan DPC <99973273+Dylan-DPC@users.noreply.github.com>	2022-03-23 03:05:31 +0100
committer	GitHub <noreply@github.com>	2022-03-23 03:05:31 +0100
commit	25acd9331e61e773233c126a6117bd5614b945a7 (patch)
tree	bbc3a4c3ec6c88f579215e369d242cc509faca87 /compiler/rustc_codegen_gcc/example/std_example.rs
parent	0e86cabdceb4205572505b9d238f7a4e859c362b (diff)
parent	d5803678c16da47c807102635d7d6cfdac8bde82 (diff)
download	rust-25acd9331e61e773233c126a6117bd5614b945a7.tar.gz rust-25acd9331e61e773233c126a6117bd5614b945a7.zip

Rollup merge of #94713 - clarfonthey:is_char_surrogate, r=scottmcm

Add u16::is_utf16_surrogate

The standard library already has methods for encoding and decoding UTF-16, but for the moment there are no methods on `u16` itself to help work with UTF-16 data. Since the logic already exists, this doesn't really add any code; it just exposes what's already there.

This method is particularly useful for working with the data returned by `OsStrExt::encode_wide` on Windows. Initially, I was planning to also offer a `TryFrom<u16> for char`, but decided against it for now. There is plenty of code in rustc that could be rewritten to use this method, but I only replaced the existing checks within the standard library.
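As a rough illustration of the kind of check being exposed, here is a minimal sketch. The free function `is_utf16_surrogate` is a hypothetical stand-in for the new method (written as a range check so it compiles without the new API), and `is_bmp_only` is an invented helper, not part of this change. The input is built with `str::encode_utf16` so the sketch is portable; on Windows the code units would typically come from `OsStrExt::encode_wide` instead.

```rust
/// Hypothetical stand-in for the method added here: a UTF-16 code unit
/// is a surrogate when it falls in the range 0xD800..=0xDFFF.
fn is_utf16_surrogate(unit: u16) -> bool {
    matches!(unit, 0xD800..=0xDFFF)
}

/// Hypothetical helper: returns true when no code unit is a surrogate,
/// i.e. every unit stands for a complete scalar value on its own.
fn is_bmp_only(units: &[u16]) -> bool {
    !units.iter().copied().any(is_utf16_surrogate)
}
```

A caller could use `is_bmp_only` to decide whether a buffer can be handled one code unit at a time or needs full surrogate-pair decoding (for example via `char::decode_utf16`).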

I think that offering more UTF-16-related methods on `u16` would be useful, and this one is a good start. For example, one useful method might be `u16::is_pattern_whitespace`, which would check whether a code unit has the Unicode `Pattern_White_Space` property. We can get away with this because all of the `Pattern_White_Space` characters are in the Basic Multilingual Plane, and hence we don't need to check for surrogates.
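To make the idea concrete, a sketch of what such a check might look like follows. The function name `is_pattern_white_space` is hypothetical and is not added by this change; the code point list is the Unicode `Pattern_White_Space` set, all of which fits in a single `u16`.

```rust
/// Hypothetical sketch, not part of this change: checks whether a UTF-16
/// code unit is one of the Unicode `Pattern_White_Space` code points.
/// Every member of the set is in the Basic Multilingual Plane, so no
/// surrogate handling is needed.
fn is_pattern_white_space(unit: u16) -> bool {
    matches!(
        unit,
        0x0009..=0x000D // tab, line feed, vertical tab, form feed, carriage return
            | 0x0020    // space
            | 0x0085    // next line
            | 0x200E    // left-to-right mark
            | 0x200F    // right-to-left mark
            | 0x2028    // line separator
            | 0x2029    // paragraph separator
    )
}
```

Surrogate code units (0xD800..=0xDFFF) simply fall through to `false`, which is the point made above: the check never has to look at a neighbouring unit.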

Diffstat (limited to 'compiler/rustc_codegen_gcc/example/std_example.rs')

0 files changed, 0 insertions, 0 deletions
