| author | Dan Gohman <dev@sunfishcode.online> | 2022-05-07 09:34:57 -0700 |
|---|---|---|
| committer | Dan Gohman <dev@sunfishcode.online> | 2022-06-23 13:10:47 -0700 |
| commit | caf8bcceff301b4fa3414e3f21813581a8d758a3 (patch) | |
| tree | a9111e4c5d5b49b70878a7290737cae316c8532b /compiler/rustc_codegen_llvm/src | |
| parent | 10f4ce324baf7cfb7ce2b2096662b82b79204944 (diff) | |
Optimize `Wtf8Buf::into_string` for the case where it contains UTF-8.
Add an `is_known_utf8` flag to `Wtf8Buf`, which tracks whether the string is known to contain UTF-8. This is efficiently computed in many common situations, such as when a `Wtf8Buf` is constructed from a `String` or `&str`, or with `Wtf8Buf::from_wide`, which is already doing UTF-16 decoding and already checking for surrogates.

This makes `OsString::into_string` O(1) rather than O(N) on Windows in common cases. It also eliminates the need to scan through the string for surrogates in `Args::next` and `Vars::next`, because the strings are already being translated with `Wtf8Buf::from_wide`.

Many things on Windows construct `OsString`s with `Wtf8Buf::from_wide`, such as `DirEntry::file_name` and `fs::read_link`, so with this patch, users of those functions can subsequently call `.into_string()` without paying for an extra scan through the string for surrogates.
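The idea can be illustrated with a minimal sketch. This is not the code from the patch: `SketchBuf` is a hypothetical stand-in for `Wtf8Buf`, and its methods only mirror the names used in the description above. It assumes the flag is set at construction time and that lone surrogates are stored in their 3-byte WTF-8 form.

```rust
/// Hypothetical stand-in for `Wtf8Buf`; not the standard library's type.
#[derive(Debug)]
pub struct SketchBuf {
    bytes: Vec<u8>,
    /// True only when `bytes` is guaranteed to be valid UTF-8.
    /// False means "unknown", not "definitely invalid".
    is_known_utf8: bool,
}

impl SketchBuf {
    /// A `String` is valid UTF-8 by construction, so the flag is free here.
    pub fn from_string(s: String) -> Self {
        SketchBuf { bytes: s.into_bytes(), is_known_utf8: true }
    }

    /// Decodes potentially ill-formed UTF-16. The decoder already has to
    /// detect unpaired surrogates, so recording the flag costs nothing extra.
    pub fn from_wide(units: &[u16]) -> Self {
        let mut bytes = Vec::with_capacity(units.len());
        let mut is_known_utf8 = true;
        for item in char::decode_utf16(units.iter().copied()) {
            match item {
                Ok(c) => {
                    let mut tmp = [0u8; 4];
                    bytes.extend_from_slice(c.encode_utf8(&mut tmp).as_bytes());
                }
                Err(e) => {
                    // Lone surrogate: store it as a 3-byte sequence, which
                    // is valid WTF-8 but not valid UTF-8.
                    let s = e.unpaired_surrogate();
                    bytes.push(0xE0 | (s >> 12) as u8);
                    bytes.push(0x80 | ((s >> 6) & 0x3F) as u8);
                    bytes.push(0x80 | (s & 0x3F) as u8);
                    is_known_utf8 = false;
                }
            }
        }
        SketchBuf { bytes, is_known_utf8 }
    }

    /// O(1) when the flag is set; otherwise falls back to an O(N) scan.
    pub fn into_string(self) -> Result<String, SketchBuf> {
        if self.is_known_utf8 {
            // SAFETY: the flag is only set when `bytes` is valid UTF-8.
            return Ok(unsafe { String::from_utf8_unchecked(self.bytes) });
        }
        // Slow path: full validation, which rejects encoded surrogates.
        match String::from_utf8(self.bytes) {
            Ok(s) => Ok(s),
            Err(e) => Err(SketchBuf { bytes: e.into_bytes(), is_known_utf8: false }),
        }
    }
}
```

In this sketch a buffer built by `from_wide` from well-formed UTF-16 converts with a single flag check, while one containing a lone surrogate takes the slow path and is handed back to the caller as an error.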
Diffstat (limited to 'compiler/rustc_codegen_llvm/src')
0 files changed, 0 insertions, 0 deletions
