| Age | Commit message (Collapse) | Author | Lines |
|
r=Kobzol,madsmtm
Demote x86_64-apple-darwin to Tier 2 with host tools
Switch to only using aarch64 runners (implying we are now cross-compiling) and stop running tests. In the future, we could enable (some?) tests via Rosetta 2.
This implements the decision from https://github.com/rust-lang/rfcs/pull/3841.
|
|
run spellcheck as a tidy extra check in ci
This is probably how it should've been done from the start.
r? ``@Kobzol``
|
|
r=lolbinarycat,GuillaumeGomez
rustdoc-search: search backend with partitioned suffix tree
Before:
- https://notriddle.com/windows-docs-rs/doc-old/windows/
- https://doc.rust-lang.org/1.89.0/std/index.html
- https://doc.rust-lang.org/1.89.0/nightly-rustc/rustc_hir/index.html
After:
- https://notriddle.com/windows-docs-rs/doc/windows/
- https://notriddle.com/rustdoc-html-demo-12/stringdex/doc/std/index.html
- https://notriddle.com/rustdoc-html-demo-12/stringdex/compiler-doc/rustc_hir/index.html
## Summary
Rewrites the rustdoc search engine to use an indexed data structure, factored out as a crate called [stringdex](https://crates.io/crates/stringdex), that allows it to perform modified-levenshtein distance calculations, substring matches, and prefix matches in a reasonably efficient, and, more importantly, *incremental* algorithm.
## Motivation
Fixes https://github.com/rust-lang/rust/issues/131156
While the windows-rs crate is definitely the worst offender, I've noticed performance problems with the compiler crates as well. It makes no sense for rustdoc-search to have poor performance: it's basically a spell checker, and those have been usable since the 90's.
Stringdex is particularly designed to quickly return exact matches, to always report those matches first, and to never, ever [place new matches on top of old ones](https://web.dev/articles/cls). It also tries to yield to the event loop occasionally as it runs. This way, you can click the exactly-matched result before the rest of the search finishes running.
## Explanation
A longer description of how name search works can be found in stringdex's [HACKING.md](https://gitlab.com/notriddle/stringdex/-/blob/main/HACKING.md) file.
Type search is done by performing a name search on each element, then performing bitmap operations to narrow down a list of potential matches, then performing type unification on each match.
## Drawbacks
It's rather complex, and takes up more disk space than the current flat list of strings.
## Rationale and alternatives
Instead of a suffix tree, I could've used a different [approximate matching data structure](https://en.wikipedia.org/wiki/Approximate_string_matching). I didn't do that because I wanted to keep the current behavior (for example, a straightforward trigram index won't match [oepn](https://doc.rust-lang.org/nightly/std/?search=oepn) like the current system does).
## Prior art
[Sherlodoc](https://github.com/art-w/sherlodoc) is based on a similar concept, but they:
- use edit distance over a suffix tree for type-based search, instead of the binary matching that's implemented here
- use substring matching for name-based search, [but not fuzzy name matching](https://github.com/art-w/sherlodoc/issues/21)
- actually implement body text search, which is a potential-future feature, but not implemented in this PR
## Future possibilities
### Low-level optimization in stringdex
There are half a dozen low-level optimizations that I still need to explore. I haven't done them yet, because I've been working on bug fixes and rebasing on rustdoc's side, and a more solid and diverse test suite for stringdex itself.
- Stringdex decides whether to bundle two nodes into the same file based on size. To figure out a node's size, I have to run compression on it. This is probably slower than it needs to be.
- Stack compression is limited to the same 256-slot sliding windows as backref compression, and it doesn't have to be. (stack and backref compression are used to optimize the representation of the edge pointer from a parent node to its child; backref uses one byte, while stack is entirely implicit)
- The JS-side decoder is pretty naive. It performs unnecessary hash table lookups when decoding compressed nodes, and retains a list of hashes that it doesn't need. It needs to calculate the hashes in order to construct the merkle tree correctly, but it doesn't need to keep them.
- Data compression happens at the end, while emitting the node. This means it's not being counted when deciding on how to bundle, which is pretty dumb.
### Improved recall in type-driven search
Right now, type-driven search performs very strict matching. It's very precise, but misses a lot of things people would want.
What I'm not sure about is whether to focus more on edit-distance-based approaches, or to focus on type-theoretical approaches. Both gives avenues to improve, but edit distance is going to be faster while type checking is going to be more precise.
For example, a type theoretical improvement would fix [`Iterator<T>, (T -> U) -> Iterator<U>`](https://doc.rust-lang.org/nightly/std/?search=Iterator%3CT%3E%2C%20(T%20-%3E%20U)%20-%3E%20Iterator%3CU%3E) to give [`Iterator::map`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.map), because it would recognize that the Map struct implements the Iterator trait. I don't know of any clean way to get this result to work without implementing significant type checking logic in search.js, and an edit-distance-based "dirty" approach would likely give a bunch of other results on top of this one.
## Full-text search
Once you've got this fuzzy dictionary matching to work, the logical next step is to implement some kind of information retrieval-based approach to phrase matching.
Like applying edit distance to types, phrase search gets you significantly better recall, but with a few major drawbacks:
- You have to pick between index bloat and the use of stopwords. Stopwords are bad because they might actually be important (try searching "if let" in mdBook if you're feeling brave), but without them, you spend a lot of space on text that doesn't matter.
- Example code also tends to have a lot of irrelevant stuff in it. Like stop words, we'd have to pick potentially-confusing or bloat.
Neither of these problems are deal-breakers, but they're worth keeping in mind.
|
|
|
|
|
|
|
|
Switch to only using aarch64 runners (implying we are now
cross-compiling) and stop running tests. In the future, we could
enable (some?) tests via Rosetta 2.
|
|
ci: clean windows disk space in background
|
|
|
|
Enforce in bootstrap that clippy must have stage at least 1
This mostly piggybacks on the previous `x check` [rework](https://github.com/rust-lang/rust/pull/143048).
The new "rules" follow the new staging logic. So `x clippy <foo>` lints `foo` using stage0 Clippy. `x clippy --stage 2 <foo>` lints `foo` using stage1 Clippy (which is built from in-tree sources).
I had to fix some latent issues with `prepare_compiler_for_check` along the way.
Checking `rustc_private` tools should now check less compiler crates (or rather not check compiler examples/tests/etc.), potentially speeding it up slightly.
I also had to make some manual adjustments to `x clippy ci` so that it doesn't do needless work.
r? `@jieyouxu`
|
|
Update to LLVM 21.1.0 rc3
|
|
|
|
|
|
Consolidate stage directories and group logs in bootstrap
My post-stage-0-redesign bootstrap fixes aren't done yet, but I think that enough steps have been migrated to the new system that it makes sense to actually modify the directories on disk, and what gets printed when bootstrap runs, so that it actually corresponds to the new system. Before, the printed stages didn't always make sense.
This PR:
- Fixes the numbering of `stageN` directories in the build directory. It was not corresponding to the correct stages before; notice that I did not modify `bootstrap/README.md`, as it was essentially describing what happens after this PR (first commit).
- Unifies all steps that output a build group to use the `Builder::msg` method. It's probably not the final stage, and some of the test steps might not be fully accurate yet, because I didn't fix test step numbering yet, but I think that it's a clear improvement from before, and now that everything uses the same method, we can easily make changes across the board, to ensure that it stays unified (second commit).
r? `@jieyouxu`
try-job: dist-x86_64-msvc
try-job: dist-x86_64-linux
|
|
|
|
Document compiler and stdlib in stage1 in `pr-check-2` CI job
This restores the original behavior pre-https://github.com/rust-lang/rust/pull/145011 (I thought that stage 2 makes more sense here, but it made the job ~30m slower, which is bad).
Let's see what will be the "new" duration, it should be ~55 minutes.
r? ```````@jieyouxu```````
|
|
|
|
Enable RUST_BACKTRACE=1 on CI
We should really see the backtrace if something in bootstrap fails on CI. https://github.com/rust-lang/rust/pull/145011#issuecomment-3172664934 doesn't show many details.
|
|
We should really see the backtrace if something in bootstrap fails on CI.
|
|
As they were previously.
|
|
|
|
The project build for compiler-rt is deprecated.
The runtimes build will use the just-built clang. As such, we
also need to pass --gcc-toolchain to the runtimes build, so that
it can find the GCC installation.
|
|
Necessary to avoid a bolt-related crash.
|
|
Remove install Rust script from CI
Windows ARM images should contain Rust now (https://github.com/actions/partner-runner-images/issues/77#issuecomment-3082613685).
CC dpaoliello
try-job: `*aarch64-msvc*`
|
|
Reject running `compiletest` self-tests against stage 0 rustc unless explicitly allowed
Currently, in `pr-check-1`, we run `python3 ../x.py test --stage 0 src/tools/compiletest`, which is `compiletest` self-tests against stage 0 rustc. This makes it very annoying for PRs wanting to change target spec JSON format, which `compiletest` depends on for target information, as otherwise `compiletest` would have to know how to handle 2 different target spec JSON formats and know when to pick which.
Instead of doing that, we change `compiletest` self-tests to reject running against stage 0 `rustc` *unless* explicitly allowed with `build.compiletest-allow-stage0=true`. `build.compiletest-allow-stage0` is a proper bootstrap config option in favor of the ad-hoc `COMPILETEST_FORCE_STAGE0` env var. This means that:
- `./x test src/tools/compiletest --stage=0` is not allowed, unless `build.compiletest-allow-stage0=true` is set. In this scenario, `compiletest` self-tests should be expected to fail unless the stage 0 `rustc` as provided is like codegen_cranelift where it's *actually* built from in-tree `rustc` sources.
- In CI, we change `./x test src/tools/compiletest --stage=0` to `./x test src/tools/compiletest --stage=1`, and move it to `pr-check-2`. Yes, this involves building the stage 1 compiler, but `pr-check-2` already has to build stage 1 compiler to test stage 1 library crates.
- Crucially, this means that **`compiletest` is only intended to support one target spec JSON format**, namely the one corresponding to the in-tree `rustc`.
- This should preserve the `compiletest-use-stage0-libtest` UX optimization where changing `compiler/` tree should still not require rebuilding `compiletest` as long as `build.compiletest-use-stage0-libtest=true`, as that should remain orthogonal. This is completely unlike my previous attempt at https://github.com/rust-lang/rust/pull/144563 that tries to do a way more invasive change which would cause the rebuild problem.
Best reviewed commit-by-commit.
---
r? `@Kobzol`
|
|
And move `./x test compiletest --stage=1` to `pr-check-2`, where testing
library artifacts already requires building the stage 1 compiler.
|
|
Free disk space on Windows 2025 runners
I've managed to reduce the time deletion takes by:
- Using powershell, which is generally faster for filesystem operations than msys2
- Performing deletions concurrently then waiting for them all to complete
It still takes 2-10 mins but that's not too bad.
|
|
|
|
Update wasi-sdk to 27.0 in CI
This updates the wasi-sdk used in CI to build release binaries and run CI with. No major motivation beyond keeping things up-to-date and following the development of wasi-sdk.
|
|
This updates the wasi-sdk used in CI to build release binaries and run
CI with. No major motivation beyond keeping things up-to-date and
following the development of wasi-sdk.
|
|
r=Mark-Simulacrum
Move dist-apple-various from x86_64 to aarch64
`macos-13` is going away soonish.
|
|
move uefi test to run-make
Turn the `uefi` test into a more standard `run-make` test, and execute it using the `test-various` CI job like before.
This is just a straightforward translation of the python code, but using `run-make` to supply the target (hence the 3 separate calls in the docker file).
r? ```@jieyouxu```
cc ```@nicholasbishop```
try-job: test-various
|
|
|
|
This check is relatively cheap, and is a quick way to root out breaking
check flow of `bootstrap` itself.
|
|
Enforce that PR CI jobs are a subset of Auto CI jobs modulo carve-outs
### Background
Currently, it is possible for a PR with red PR-only CI to pass Auto CI, then all subsequent PR CI runs will be red until that is fixed, even in completely unrelated PRs. For instance, this happened with PR-CI-only Spellcheck (rust-lang/rust#144183).
See more discussions at [#t-infra > Spellcheck workflow now fails on all PRs (tree bad?)](https://rust-lang.zulipchat.com/#narrow/channel/242791-t-infra/topic/Spellcheck.20workflow.20now.20fails.20on.20all.20PRs.20.28tree.20bad.3F.29/with/529769404).
### CI invariant: PR CI jobs are a subset of Auto CI jobs modulo carve-outs
To prevent red PR CI in completely unrelated subsequent PRs and PR CI runs, we need to maintain an invariant that **PR CI jobs are a subset of Auto CI jobs modulo carve-outs**.
This is **not** a "strict" subset relationship: some jobs necessarily have to differ under PR CI and Auto CI environments, at least in the current setup. Still, we can try to enforce a weaker "subset modulo carve-outs" relationship between CI jobs and their corresponding Auto jobs. For instance:
- `x86_64-gnu-tools` will have `auto`-only env vars like `DEPLOY_TOOLSTATES_JSON: toolstates-linux.json`.
- `tidy` will want to `continue_on_error: true` in PR CI to allow for more "useful" compilation errors to also be reported, whereas it should be `continue_on_error: false` in Auto CI to prevent wasting Auto CI resources.
The **carve-outs** are:
1. `env` variables.
2. `continue_on_error`.
We enforce this invariant through `citool`, so only affects job definitions that are handled by `citool`. Notably, this is not sufficient *alone* to address the CI-only Spellcheck issue (rust-lang/rust#144183). To carry out this enforcement, we modify `citool` to auto-register PR jobs as Auto jobs with `continue_on_error` overridden to `false` **unless** there's an overriding Auto job for the PR job of the same name that only differs by the permitted **carve-outs**.
### Addressing the Spellcheck PR-only CI issue
Note that Spellcheck currently does not go through `citool` or `bootstrap`, and is its own GitHub Actions workflow. To actually address the PR-CI-only Spellcheck issue (rust-lang/rust#144183), and carry out the subset-modulo-carve-outs enforcement universally, this PR additionally **removes the current Spellcheck implementation** (a separate GitHub Actions Workflow). That is incompatible with Homu unless we do some hacks in the main CI workflow.
This effectively partially reverts rust-lang/rust#134006 (the separate workflow part, not the tidy extra checks component), but is not prejudice against relanding the `typos`-based spellcheck in another implementation that goes through the usual bootstrap CI workflow so that it does work with Homu. The `typos`-based spellcheck seems to have a good false-positive rate.
Closes rust-lang/rust#144183.
---
r? infra-ci
|
|
`macos-13` is going away soonish.
|
|
Rename `tests/{assembly,codegen}` into `tests/{assembly,codegen}-llvm` and ignore these testsuites if configured backend doesn't match
Follow-up of https://github.com/rust-lang/rust/pull/144125.
This PR changes `compiletest` so that `asm` tests are only run if they match the current codegen backend. To better reflect it, I renamed the `tests/ui/asm` folder into `tests/ui/asm-llvm`. Like that, we can add new asm tests for other backends if we want without needing to add extra code to `compiletest`.
Next step will be to use the new code annotations added in rust-lang/rust#144125 to ignore ui tests failing in cg_gcc until it's fixed on our side.
cc `@antoyo` `@oli-obk`
r? `@Kobzol`
|
|
|
|
|
|
tidy: move rustdoc js stuff into a tidy extra check
Most of these were factored out of CI scripts, but `eslint` in particular was previously implemented with its own special cased logic.
A new option has been added to bootstrap, `build.tidy-extra-checks`, which serves as a default value for the `--extra-checks` flag. This is mostly for the benefit of rustdoc js maintainers, but should also help bootstrap py maintainers.
Additionally, `--extra-checks=cpp` has been documented.
I'm not super happy with how long the extra check names are in comparison to the others (in particular `typecheck`), but I couldn't think of anything better (I didn't want to name it `tsc` on the off chance we want to switch to a different typechecking engine in the future).
It would be nice to convert the extra checks arg into a proper enum, both for warning on unknown values and to provide better shell completion.
r? ```@GuillaumeGomez```
Fixes: https://github.com/rust-lang/rust/issues/144093
|
|
|
|
To prevent possibility of a PR with red PR-only CI passing Auto CI, then
all subsequent PR CI runs will be red until that is fixed.
Note that this is **not** a "strict" subset relationship: some jobs
necessarily have to differ under PR CI and Auto CI environments. For
instance:
- `x86_64-gnu-tools` will have auto-only env vars like
`DEPLOY_TOOLSTATES_JSON: toolstates-linux.json`.
- `tidy` will want to `continue_on_error: true` in PR CI to allow for
more "useful" compilation errors to also be reported, whereas it needs
to `continue_on_error: false` in Auto CI to prevent wasting Auto CI
resources.
The carve-outs are:
1. `env` variables.
2. `continue_on_error`.
|
|
Initialize mingw for the runner's user
This is apparently the more proper fix to https://rust-lang.zulipchat.com/#narrow/channel/242791-t-infra/topic/Spurious.20bors.20CI.20failures/near/528915775
But let's see if it works.
|
|
|
|
this makes us less vulnerable to MITM and supply chain attacks.
it also means that the CI scripts are no longer responsible for
tracking the versions of these tools.
it should also avoid the situation where local tsc and CI
disagree on the presense of errors due to them being different versions.
|
|
|
|
|
|
|
|
Windows ARM images should contain Rust now.
|
|
Fix handling of SCRIPT_ARG in docker images
Instead of making this a build parameter, pass the SCRIPT as an environment variable.
To this purpose, normalize on always referring to a script in `/scripts`.
For i686-gnu-nopt-2 I had to create a separate script, because Docker seems to be really terrible at command line argument parsing, so it's not possible to pass an environment variable that contains whitespace.
Fixes https://github.com/rust-lang/rust/issues/143962.
try-job: `dist-x86_64-linux`
try-job: `i686-gnu-nopt-*`
try-job: `i686-gnu-*`
try-job: `x86_64-gnu-llvm-19-*`
try-job: `x86_64-gnu-llvm-20-*`
|