author: Dan Gohman <dev@sunfishcode.online> 2022-05-07 09:34:57 -0700
committer: Dan Gohman <dev@sunfishcode.online> 2022-06-23 13:10:47 -0700
commit: caf8bcceff301b4fa3414e3f21813581a8d758a3
tree: a9111e4c5d5b49b70878a7290737cae316c8532b /compiler/rustc_codegen_llvm/src
parent: 10f4ce324baf7cfb7ce2b2096662b82b79204944
Optimize `Wtf8Buf::into_string` for the case where it contains UTF-8.

Add an `is_known_utf8` flag to `Wtf8Buf`, which tracks whether the
string is known to contain UTF-8. This is efficiently computed in many
common situations, such as when a `Wtf8Buf` is constructed from a `String`
or `&str`, or with `Wtf8Buf::from_wide` which is already doing UTF-16
decoding and already checking for surrogates.
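
The idea can be illustrated with a minimal sketch. This is not the
actual `Wtf8Buf` code from the standard library: `SketchWtf8Buf` and
its method bodies are hypothetical stand-ins, and only the
`is_known_utf8` field name is taken from this commit message. The
sketch assumes the flag is set when building from a `String`, cleared
when UTF-16 decoding meets an unpaired surrogate, and consulted in
`into_string` to skip the validation scan.

```rust
/// Hypothetical, simplified stand-in for the internal `Wtf8Buf` type.
pub struct SketchWtf8Buf {
    bytes: Vec<u8>,
    /// True only when `bytes` is known to be valid UTF-8 (no surrogates).
    is_known_utf8: bool,
}

impl SketchWtf8Buf {
    /// A `String` is always valid UTF-8, so the flag costs nothing to set.
    pub fn from_string(s: String) -> Self {
        SketchWtf8Buf { bytes: s.into_bytes(), is_known_utf8: true }
    }

    /// Decode potentially ill-formed UTF-16. The decoder already has to
    /// detect unpaired surrogates, so tracking the flag adds no extra pass.
    pub fn from_wide(units: &[u16]) -> Self {
        let mut bytes = Vec::with_capacity(units.len());
        let mut is_known_utf8 = true;
        for item in char::decode_utf16(units.iter().copied()) {
            match item {
                Ok(c) => {
                    let mut buf = [0u8; 4];
                    bytes.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
                }
                Err(e) => {
                    // Encode the lone surrogate as three bytes (generalized
                    // UTF-8) and remember that the buffer is not UTF-8.
                    let s = e.unpaired_surrogate() as u32;
                    bytes.push(0xE0 | (s >> 12) as u8);
                    bytes.push(0x80 | ((s >> 6) & 0x3F) as u8);
                    bytes.push(0x80 | (s & 0x3F) as u8);
                    is_known_utf8 = false;
                }
            }
        }
        SketchWtf8Buf { bytes, is_known_utf8 }
    }

    /// O(1) when the flag is set; otherwise falls back to an O(N) scan.
    pub fn into_string(self) -> Result<String, SketchWtf8Buf> {
        if self.is_known_utf8 {
            // SAFETY: the flag is only ever set by constructors that
            // produced valid UTF-8, so no validation pass is needed.
            return Ok(unsafe { String::from_utf8_unchecked(self.bytes) });
        }
        // Slow path: an encoded surrogate is invalid UTF-8, so the
        // standard validator rejects it and hands the bytes back.
        String::from_utf8(self.bytes).map_err(|e| SketchWtf8Buf {
            bytes: e.into_bytes(),
            is_known_utf8: false,
        })
    }
}
```

In this sketch, `SketchWtf8Buf::from_wide(&[0x68, 0x69]).into_string()`
takes the fast path, while input containing a lone `0xD800` clears the
flag and ends up in the `Err` branch.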

This makes `OsString::into_string` O(1) rather than O(N) on Windows in
common cases.

It also eliminates the need to scan through the string for surrogates in
`Args::next` and `Vars::next`, because the strings are already being
translated with `Wtf8Buf::from_wide`.

Many things on Windows construct `OsString`s with `Wtf8Buf::from_wide`,
such as `DirEntry::file_name` and `fs::read_link`, so with this patch,
users of those functions can subsequently call `.into_string()` without
paying for an extra scan through the string for surrogates.
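
As an illustration of the caller-side pattern this benefits, here is a
small sketch using only public `std` APIs. The helper name
`utf8_file_names` is hypothetical. The code behaves the same on every
platform and with or without this patch; the change described above
only affects how much work `into_string` does on Windows.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: collect the file names in `dir` that are valid
/// Unicode, skipping any that are not.
pub fn utf8_file_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        // `file_name` returns an `OsString`; `into_string` succeeds only
        // when the name is valid Unicode and returns the `OsString` back
        // in the `Err` case.
        if let Ok(name) = entry?.file_name().into_string() {
            names.push(name);
        }
    }
    names.sort();
    Ok(names)
}
```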
Diffstat (limited to 'compiler/rustc_codegen_llvm/src')
0 files changed, 0 insertions, 0 deletions