| Age | Commit message (Collapse) | Author | Lines |
|
The minimum supported Windows version is now Windows 7. Windows XP
and Windows Vista are no longer supported; both are already broken, and
require extra steps to use.
This commit removes the delayed-binding support for Windows API
functions that are present on all supported Windows targets. This has
several benefits: Removes needless complexity. Removes a load and
dynamic call on hot paths in mutex acquire / release. This may have
performance benefits.
* "Drop official support for Windows XP"
https://github.com/rust-lang/compiler-team/issues/378
* "Firefox has ended support for Windows XP and Vista"
https://support.mozilla.org/en-US/kb/end-support-windows-xp-and-vista
|
|
|
|
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
|
|
Co-authored-by: Mara Bos <m-ou.se@m-ou.se>
|
|
|
|
|
|
Refactor and fix `parse_prefix` on Windows
This PR is an extension of #78692 as well as a general refactor of `parse_prefix`:
**Fixes**:
There are two errors in the current implementation of `parse_prefix`:
Firstly, in the current implementation only `\` is recognized as a separator character in device namespace prefixes. This behavior is only correct for verbatim paths; `"\\.\C:/foo"` should be parsed as `"C:"` instead of `"C:/foo"`.
Secondly, the current implementation only handles single separator characters. In non-verbatim paths a series of separator characters should be recognized as a single boundary, e.g. the UNC path `"\\localhost\\\\\\C$\foo"` should be parsed as `"\\localhost\\\\\\C$"` and then `UNC(server: "localhost", share: "C$")`, but currently it is not parsed at all, because it starts being parsed as `\\localhost\` and then has an invalid empty share location.
Paths like `"\\.\C:/foo"` and `"\\localhost\\\\\\C$\foo"` are valid on Windows, they are equivalent to just `"C:\foo"`.
**Refactoring**:
All uses of `&[u8]` within `parse_prefix` are extracted to helper functions and`&OsStr` is used instead. This reduces the number of places unsafe is used:
- `get_first_two_components` is adapted to the more general `parse_next_component` and used in more places
- code for parsing drive prefixes is extracted to `parse_drive`
|
|
Add fast futex-based thread parker for Windows.
This adds a fast futex-based thread parker for Windows. It either uses WaitOnAddress+WakeByAddressSingle or NT Keyed Events (NtWaitForKeyedEvent+NtReleaseKeyedEvent), depending on which is available. Together, this makes this thread parker work for Windows XP and up. Before this change, park()/unpark() did not work on Windows XP: it needs condition variables, which only exist since Windows Vista.
---
Unfortunately, NT Keyed Events are an undocumented Windows API. However:
- This API is relatively simple with obvious behaviour, and there are several (unofficial) articles documenting the details. [1]
- parking_lot has been using this API for years (on Windows versions before Windows 8). [2] Many big projects extensively use parking_lot, such as servo and the Rust compiler itself.
- It is the underlying API used by Windows SRW locks and Windows critical sections. [3] [4]
- The source code of the implementations of Wine, ReactOs, and Windows XP are available and match the expected behaviour.
- The main risk with an undocumented API is that it might change in the future. But since we only use it for older versions of Windows, that's not a problem.
- Even if these functions do not block or wake as we expect (which is unlikely, see all previous points), this implementation would still be memory safe. The NT Keyed Events API is only used to sleep/block in the right place.
[1]\: http://www.locklessinc.com/articles/keyed_events/
[2]\: https://github.com/Amanieu/parking_lot/commit/43abbc964e
[3]\: https://docs.microsoft.com/en-us/archive/msdn-magazine/2012/november/windows-with-c-the-evolution-of-synchronization-in-windows-and-c
[4]\: Windows Internals, Part 1, ISBN 9780735671300
---
The choice of fallback API is inspired by parking_lot(_core), but the implementation of this thread parker is different. While parking_lot has no use for a fast path (park() directly returning if unpark() was already called), this implementation has a fast path that returns without even checking which waiting/waking API to use, as the same atomic variable with compatible states is used in all cases.
|
|
|
|
|
|
Extend meta parameters to all generated code in compat_fn.
Fixes https://github.com/rust-lang/rust/issues/79203. This addresses a regression from 7e2032390cf34f3ffa726b7bd890141e2684ba63 for UWP targets.
|
|
Disambiguate symlink argument names
The current argument naming in the following standard library functions is horribly ambiguous.
- std::os::unix::fs::symlink: https://doc.rust-lang.org/1.47.0/std/os/unix/fs/fn.symlink.html
- std::os::windows::fs::symlink_file: https://doc.rust-lang.org/1.47.0/std/os/windows/fs/fn.symlink_file.html
- std::os::windows::fs::symlink_dir: https://doc.rust-lang.org/1.47.0/std/os/windows/fs/fn.symlink_dir.html
**Notice that Swift uses one of the same names we do (`dst`) to refer to the opposite thing.**
<br>
| | the one that exists | the one that is<br>being created | reference |
| --- | --- | --- | --- |
| Rust | `src` | `dst` | |
| Swift | `withDestinationPath`<br>`destPath` | `atPath`<br>`path` | <sub>https://developer.apple.com/documentation/foundation/filemanager/1411007-createsymboliclink</sub> |
| D | `original` | `link` | <sub>https://dlang.org/library/std/file/symlink.html</sub> |
| Go | `oldname` | `newname` | <sub>https://golang.org/pkg/os/#Symlink</sub> |
| C++| `target` | `link` | <sub>https://en.cppreference.com/w/cpp/filesystem/create_symlink</sub> |
| POSIX | `path1` | `path2` | <sub>https://pubs.opengroup.org/onlinepubs/9699919799/functions/symlink.html</sub> |
| Linux | `target` | `linkpath` | <sub>https://man7.org/linux/man-pages/man2/symlink.2.html</sub> |
Out of these I happen to like D's argument names and am proposing that we adopt them.
|
|
|
|
Remove all rustc-link-lib from the std build script. Also remove use of
feature = "restricted-std" where not necessary.
|
|
|
|
Refactor `get_first_two_components` to `get_next_component`.
Fixes the following behaviour of `parse_prefix`:
- series of separator bytes in a prefix are correctly parsed as a single separator
- device namespace prefixes correctly recognize both `\\` and `/` as separators
|
|
Avoid SeqCst or static mut in mach_timebase_info and QueryPerformanceFrequency caches
This patch went through a couple iterations but the end result is replacing a pattern where an `AtomicUsize` (updated with many SeqCst ops) guards a `static mut` with a single `AtomicU64` that is known to use 0 as a value indicating that it is not initialized.
The code in both places exists to cache values used in the conversion of Instants to Durations on macOS, iOS, and Windows.
I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), but it's much simpler, safer, and in practice we'd expect it to be faster everywhere where Relaxed operations on AtomicU64 are cheaper than SeqCst operations on AtomicUsize, which is a lot of places.
Anyway, it also removes a bunch of unsafe code and greatly simplifies the logic, so IMO that alone would be worth it unless it was a regression.
If you want to take a look at the assembly output though, see https://godbolt.org/z/rbr6vn for x86_64, https://godbolt.org/z/cqcbqv for aarch64 (Note that this just the output of the mac side, but i'd expect the windows part to be the same and don't feel like doing another godbolt for it). There are several versions of this function in the godbolt:
- `info_new`: version in the current patch
- `info_less_new`: version in initial PR
- `info_original`: version currently in the tree
- `info_orig_but_better_orderings`: a version that just tries to change the original code's orderings from SeqCst to the (probably) minimal orderings required for soundness/correctness.
The biggest concern I have here is if we can use AtomicU64, or if there are targets that dont have it that this code supports. AFAICT: no. (If that changes in the future, it's easy enough to do something different for them)
r? `@Amanieu` because he caught a couple issues last time I tried to do a patch reducing orderings 😅
---
<details>
<summary>I rewrote this whole message so the original is inside here</summary>
I happened to notice the code we use for caching the result of mach_timebase_info uses SeqCst exclusively.
However, thinking a little more, it's actually pretty easy to avoid the static mut by packing the timebase info into an AtomicU64.
This entirely avoids needing to do the compare_exchange. The AtomicU64 can be read/written using Relaxed ops, which on current macos/ios platforms (x86_64/aarch64) have no overhead compared to direct loads/stores. This simplifies the code and makes it a lot safer too.
I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), although it should do that on both targets it applies to.
That said, it also removes a bunch of unsafe code and simplifies the logic (arguably at least — there are only two states now, initialized or not), so I think it's a net win even without concrete numbers.
If you want to take a look at the assembly output though, see below. It has the new version, the original, and a version of the original with lower Orderings (which is still worse than the version in this PR)
- godbolt.org/z/obfqf9 x86_64-apple-darwin
- godbolt.org/z/Wz5cWc aarch64-unknown-linux-gnu (godbolt can't do aarch64-apple-ios but that doesn't matter here)
A different (and more efficient) option than this would be to just use the AtomicU64 and use the knowledge that after initialization the denominator should be nonzero... That felt like it's relying on too many things I'm not confident in, so I didn't want to do that.
</details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- Clarify memory ordering and spurious wakeups.
|
|
|
|
|
|
|
|
Unbox mutexes and condvars on some platforms
Both mutexes and condition variables contained a Box containing the actual os-specific object. This was done because moving these objects may cause undefined behaviour on some platforms.
However, this is not needed on Windows[1], Wasm[2], cloudabi[2], and 'unsupported'[3], were the box was only needlessly making them less efficient.
This change gets rid of the box on those platforms.
On those platforms, `Condvar` can no longer verify it is only used with one `Mutex`, as mutexes no longer have a stable address. This was addressed and considered acceptable in #76932.
[1]\: https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initializesrwlock
[2]\: These are just a single atomic integer together with futex wait/wake calls/instructions.
[3]\: The `unsupported` platform doesn't support multiple threads at all.
|
|
Add accessors to Command.
This adds some accessor methods to `Command` to provide a way to access the values set when building the `Command`. An example where this can be useful is to display the command to be executed. This is roughly based on the [`ProcessBuilder`](https://github.com/rust-lang/cargo/blob/13b73cdaf76b2d9182515c9cf26a8f68342d08ef/src/cargo/util/process_builder.rs#L105-L134) in Cargo.
Possible concerns about the API:
- Values with NULs on Unix will be returned as `"<string-with-nul>"`. I don't think it is practical to avoid this, since otherwise a whole separate copy of all the values would need to be kept in `Command`.
- Does not handle `arg0` on Unix. This can be awkward to support in `get_args` and is rarely used. I figure if someone really wants it, it can be added to `CommandExt` as a separate method.
- Does not offer a way to detect `env_clear`. I'm uncertain if it would be useful for anyone.
- Does not offer a way to get an environment variable by name (`get_env`). I figure this can be added later if anyone really wants it. I think the motivation for this is weak, though. Also, the API could be a little awkward (return a `Option<Option<&OsStr>>`?).
- `get_envs` could skip "cleared" entries and just return `&OsStr` values instead of `Option<&OsStr>`. I'm on the fence here. My use case is to display a shell command, and I only intend it to be roughly equivalent to the actual execution, and I probably won't display `None` entries. I erred on the side of providing extra information, but I suspect many situations will just filter out the `None`s.
- Could implement more iterator stuff (like `DoubleEndedIterator`).
I have not implemented new std items before, so I'm uncertain if the existing issue should be reused, or if a new tracking issue is needed.
cc #44434
|
|
Windows condition variables are movable (while not borrowed) according
to their documentation.
|
|
This commit keeps all condvars boxed on all platforms, but makes it
trivial to remove the box on some platforms later.
|
|
Windows SRW locks are movable (while not borrowed) according to their
documentation.
|
|
This commit keeps all mutexes boxed on all platforms, but makes it
trivial to remove the box on some platforms later.
|
|
|
|
|
|
|
|
- Module name can now be any string, not just an ident.
(Not all Windows api modules are valid Rust identifiers.)
- Adds c::FuncName::is_available() for checking if a function is really
available without having to do a duplicate lookup.
- Add comment explaining the lack of locking.
- Use `$_:block` to simplify the macro_rules.
- Apply allow(unused_variables) only to the fallback instead of
everything.
|
|
|
|
|
|
|
|
- Move `held` into the boxed part, since the SRW lock implementation
does not use this. This makes the Mutex 50% smaller.
- Use `Cell` instead of `UnsafeCell` for `held`, such that `.replace()`
can be used.
- Add some comments.
|
|
Adds missing descriptions for the modules std::os::linux::fs and std::os::windows::io.
Also adds punctuation for consistency with other descriptions.
|
|
|
|
Also doing fmt inplace as requested.
|
|
|
|
|
|
|
|
|
|
Call into fastfail on abort in libpanic_abort on Windows x86(_64)
This partially resolves #73215 though this is only for x86 targets. This code is directly lifted from [libstd](https://github.com/rust-lang/rust/blob/13290e83a6e20f3b408d177a9d64d8cf98fe4615/library/std/src/sys/windows/mod.rs#L315). `__fastfail` is the preferred way to abort a process on Windows as it will hook into debugger toolchains.
Other platforms expose a `_rust_abort` symbol which wraps `std::sys::abort_internal`. This would also work on Windows, but is a slightly largely change as we'd need to make sure that the symbol is properly exposed to the linker. I'm inlining the call to the `__fastfail`, but the indirection through `rust_abort` might be a cleaner approach.
A different instruction must be used on ARM architectures. I'd like to verify this works first before tackling ARM.
|
|
|