about summary refs log tree commit diff
path: root/src/bootstrap
diff options
context:
space:
mode:
authorbors <bors@rust-lang.org>2021-09-06 16:01:17 +0000
committerbors <bors@rust-lang.org>2021-09-06 16:01:17 +0000
commit8ceea01bb442b9746a51b062ce25abbf46d866b2 (patch)
tree4613d9ab5edfc059350cad33b1779632e7f2bab2 /src/bootstrap
parent13db8440bbbe42870bc828d4ec3e965b38670277 (diff)
parentea8b1ffe61de8030b7887e859145c951c591a7c9 (diff)
downloadrust-8ceea01bb442b9746a51b062ce25abbf46d866b2.tar.gz
rust-8ceea01bb442b9746a51b062ce25abbf46d866b2.zip
Auto merge of #88362 - pietroalbini:bump-stage0, r=Mark-Simulacrum
Pin bootstrap checksums and add a tool to update it automatically

:warning: :warning: This is just a proactive hardening we're performing on the build system, and it's not prompted by any known compromise. If you're aware of security issues being exploited please [check out our responsible disclosure page](https://www.rust-lang.org/policies/security). :warning: :warning:

---

This PR aims to improve Rust's supply chain security by pinning the checksums of the bootstrap compiler downloaded by `x.py`, preventing a compromised `static.rust-lang.org` from affecting building the compiler. The checksums are stored in `src/stage0.json`, which replaces `src/stage0.txt`. This PR also adds a tool to automatically update the bootstrap compiler.

The changes in this PR were originally discussed in [Zulip](https://zulip-archive.rust-lang.org/stream/241545-t-release/topic/pinning.20stage0.20hashes.html).

## Potential attack

Before this PR, an attacker who wanted to compromise the bootstrap compiler would "just" need to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries and corresponding `.sha256` files to `static.rust-lang.org`.

There is no signature verification in `x.py` as we don't want the build system to depend on GPG. Also, since the checksums were not pinned inside the repository, they were downloaded from `static.rust-lang.org` too: this only protected from accidental changes in `static.rust-lang.org` that didn't change the `*.sha256` files. The attack would allow the attacker to compromise past and future invocations of `x.py`.

## Mitigations introduced in this PR

This PR adds pinned checksums for all the bootstrap components in `src/stage0.json` instead of downloading the checksums from `static.rust-lang.org`. This changes the attack scenario to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries to `static.rust-lang.org`.
3. Land a (reviewed) change in the `rust-lang/rust` repository changing the pinned hashes.

Even with a successful attack, existing clones of the Rust repository won't be affected, and once the attack is detected reverting the pinned hashes changes should be enough to be protected from the attack. This also enables further mitigations to be implemented in following PRs, such as verifying signatures when pinning new checksums (removing the trust on first use aspect of this PR) and adding a check in CI making sure a PR updating the checksum has not been tampered with (see the future improvements section).

## Additional changes

There are additional changes implemented in this PR to enable the mitigation:

* The `src/stage0.txt` file has been replaced with `src/stage0.json`. The reasoning for the change is that there is existing tooling to read and manipulate JSON files compared to the custom format we were using before, and the slight challenge of manually editing JSON files (no comments, no trailing commas) are not a problem thanks to the new `bump-stage0`.

* A new tool has been added to the repository, `bump-stage0`. When invoked, the tool automatically calculates which release should be used as the bootstrap compiler given the current version and channel, gathers all the relevant checksums and updates `src/stage0.json`. The tool can be invoked by running:

  ```
  ./x.py run src/tools/bump-stage0
  ```

* Support for downloading releases from `https://dev-static.rust-lang.org` has been removed, as it's not possible to verify checksums there (it's customary to replace existing artifacts there if a rebuild is warranted). This will require a change to the release process to avoid bumping the bootstrap compiler on beta before the stable release.

## Future improvements

* Add signature verification as part of `bump-stage0`, which would require the attacker to also obtain the release signing keys in order to successfully compromise the bootstrap compiler. This would be fine to add now, as the burden of installing the tool to verify signatures would only be placed on whoever updates the bootstrap compiler, instead of everyone compiling Rust.

* Add a check on CI that ensures the checksums in `src/stage0.json` are the expected ones. If a PR changes the stage0 file CI should also run the `bump-stage0` tool and fail if the output in CI doesn't match the committed file. This prevents the PR author from tweaking the output of the tool manually, which would otherwise be close to impossible for a human to detect.

* Automate creating the PRs bumping the bootstrap compiler, by setting up a scheduled job in GitHub Actions that runs the tool and opens a PR.

* Investigate whether a similar mitigation can be done for "download from CI" components like the prebuilt LLVM.

r? `@Mark-Simulacrum`
Diffstat (limited to 'src/bootstrap')
-rw-r--r--src/bootstrap/bootstrap.py139
-rw-r--r--src/bootstrap/bootstrap_test.py29
-rw-r--r--src/bootstrap/builder.rs2
-rw-r--r--src/bootstrap/lib.rs2
-rw-r--r--src/bootstrap/run.rs21
-rw-r--r--src/bootstrap/sanity.rs12
-rw-r--r--src/bootstrap/tool.rs1
7 files changed, 99 insertions, 107 deletions
diff --git a/src/bootstrap/bootstrap.py b/src/bootstrap/bootstrap.py
index e34768bc2c9..1f1eca1c76c 100644
--- a/src/bootstrap/bootstrap.py
+++ b/src/bootstrap/bootstrap.py
@@ -4,6 +4,7 @@ import contextlib
 import datetime
 import distutils.version
 import hashlib
+import json
 import os
 import re
 import shutil
@@ -24,19 +25,17 @@ def support_xz():
     except tarfile.CompressionError:
         return False
 
-def get(url, path, verbose=False, do_verify=True):
-    suffix = '.sha256'
-    sha_url = url + suffix
+def get(base, url, path, checksums, verbose=False, do_verify=True):
     with tempfile.NamedTemporaryFile(delete=False) as temp_file:
         temp_path = temp_file.name
-    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as sha_file:
-        sha_path = sha_file.name
 
     try:
         if do_verify:
-            download(sha_path, sha_url, False, verbose)
+            if url not in checksums:
+                raise RuntimeError("src/stage0.json doesn't contain a checksum for {}".format(url))
+            sha256 = checksums[url]
             if os.path.exists(path):
-                if verify(path, sha_path, False):
+                if verify(path, sha256, False):
                     if verbose:
                         print("using already-download file", path)
                     return
@@ -45,23 +44,17 @@ def get(url, path, verbose=False, do_verify=True):
                         print("ignoring already-download file",
                             path, "due to failed verification")
                     os.unlink(path)
-        download(temp_path, url, True, verbose)
-        if do_verify and not verify(temp_path, sha_path, verbose):
+        download(temp_path, "{}/{}".format(base, url), True, verbose)
+        if do_verify and not verify(temp_path, sha256, verbose):
             raise RuntimeError("failed verification")
         if verbose:
             print("moving {} to {}".format(temp_path, path))
         shutil.move(temp_path, path)
     finally:
-        delete_if_present(sha_path, verbose)
-        delete_if_present(temp_path, verbose)
-
-
-def delete_if_present(path, verbose):
-    """Remove the given file if present"""
-    if os.path.isfile(path):
-        if verbose:
-            print("removing", path)
-        os.unlink(path)
+        if os.path.isfile(temp_path):
+            if verbose:
+                print("removing", temp_path)
+            os.unlink(temp_path)
 
 
 def download(path, url, probably_big, verbose):
@@ -98,14 +91,12 @@ def _download(path, url, probably_big, verbose, exception):
             exception=exception)
 
 
-def verify(path, sha_path, verbose):
+def verify(path, expected, verbose):
     """Check if the sha256 sum of the given path is valid"""
     if verbose:
         print("verifying", path)
     with open(path, "rb") as source:
         found = hashlib.sha256(source.read()).hexdigest()
-    with open(sha_path, "r") as sha256sum:
-        expected = sha256sum.readline().split()[0]
     verified = found == expected
     if not verified:
         print("invalid checksum:\n"
@@ -176,15 +167,6 @@ def require(cmd, exit=True):
         sys.exit(1)
 
 
-def stage0_data(rust_root):
-    """Build a dictionary from stage0.txt"""
-    nightlies = os.path.join(rust_root, "src/stage0.txt")
-    with open(nightlies, 'r') as nightlies:
-        lines = [line.rstrip() for line in nightlies
-                 if not line.startswith("#")]
-        return dict([line.split(": ", 1) for line in lines if line])
-
-
 def format_build_time(duration):
     """Return a nicer format for build time
 
@@ -372,13 +354,22 @@ def output(filepath):
     os.rename(tmp, filepath)
 
 
+class Stage0Toolchain:
+    def __init__(self, stage0_payload):
+        self.date = stage0_payload["date"]
+        self.version = stage0_payload["version"]
+
+    def channel(self):
+        return self.version + "-" + self.date
+
+
 class RustBuild(object):
     """Provide all the methods required to build Rust"""
     def __init__(self):
-        self.date = ''
+        self.checksums_sha256 = {}
+        self.stage0_compiler = None
+        self.stage0_rustfmt = None
         self._download_url = ''
-        self.rustc_channel = ''
-        self.rustfmt_channel = ''
         self.build = ''
         self.build_dir = ''
         self.clean = False
@@ -402,11 +393,10 @@ class RustBuild(object):
         will move all the content to the right place.
         """
         if rustc_channel is None:
-            rustc_channel = self.rustc_channel
-        rustfmt_channel = self.rustfmt_channel
+            rustc_channel = self.stage0_compiler.version
         bin_root = self.bin_root(stage0)
 
-        key = self.date
+        key = self.stage0_compiler.date
         if not stage0:
             key += str(self.rustc_commit)
         if self.rustc(stage0).startswith(bin_root) and \
@@ -445,19 +435,23 @@ class RustBuild(object):
 
         if self.rustfmt() and self.rustfmt().startswith(bin_root) and (
             not os.path.exists(self.rustfmt())
-            or self.program_out_of_date(self.rustfmt_stamp(), self.rustfmt_channel)
+            or self.program_out_of_date(
+                self.rustfmt_stamp(),
+                "" if self.stage0_rustfmt is None else self.stage0_rustfmt.channel()
+            )
         ):
-            if rustfmt_channel:
+            if self.stage0_rustfmt is not None:
                 tarball_suffix = '.tar.xz' if support_xz() else '.tar.gz'
-                [channel, date] = rustfmt_channel.split('-', 1)
-                filename = "rustfmt-{}-{}{}".format(channel, self.build, tarball_suffix)
+                filename = "rustfmt-{}-{}{}".format(
+                    self.stage0_rustfmt.version, self.build, tarball_suffix,
+                )
                 self._download_component_helper(
-                    filename, "rustfmt-preview", tarball_suffix, key=date
+                    filename, "rustfmt-preview", tarball_suffix, key=self.stage0_rustfmt.date
                 )
                 self.fix_bin_or_dylib("{}/bin/rustfmt".format(bin_root))
                 self.fix_bin_or_dylib("{}/bin/cargo-fmt".format(bin_root))
                 with output(self.rustfmt_stamp()) as rustfmt_stamp:
-                    rustfmt_stamp.write(self.rustfmt_channel)
+                    rustfmt_stamp.write(self.stage0_rustfmt.channel())
 
         # Avoid downloading LLVM twice (once for stage0 and once for the master rustc)
         if self.downloading_llvm() and stage0:
@@ -518,7 +512,7 @@ class RustBuild(object):
     ):
         if key is None:
             if stage0:
-                key = self.date
+                key = self.stage0_compiler.date
             else:
                 key = self.rustc_commit
         cache_dst = os.path.join(self.build_dir, "cache")
@@ -527,12 +521,21 @@ class RustBuild(object):
             os.makedirs(rustc_cache)
 
         if stage0:
-            url = "{}/dist/{}".format(self._download_url, key)
+            base = self._download_url
+            url = "dist/{}".format(key)
         else:
-            url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(self.rustc_commit)
+            base = "https://ci-artifacts.rust-lang.org"
+            url = "rustc-builds/{}".format(self.rustc_commit)
         tarball = os.path.join(rustc_cache, filename)
         if not os.path.exists(tarball):
-            get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=stage0)
+            get(
+                base,
+                "{}/{}".format(url, filename),
+                tarball,
+                self.checksums_sha256,
+                verbose=self.verbose,
+                do_verify=stage0,
+            )
         unpack(tarball, tarball_suffix, self.bin_root(stage0), match=pattern, verbose=self.verbose)
 
     def _download_ci_llvm(self, llvm_sha, llvm_assertions):
@@ -542,7 +545,8 @@ class RustBuild(object):
         if not os.path.exists(rustc_cache):
             os.makedirs(rustc_cache)
 
-        url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(llvm_sha)
+        base = "https://ci-artifacts.rust-lang.org"
+        url = "rustc-builds/{}".format(llvm_sha)
         if llvm_assertions:
             url = url.replace('rustc-builds', 'rustc-builds-alt')
         # ci-artifacts are only stored as .xz, not .gz
@@ -554,7 +558,14 @@ class RustBuild(object):
         filename = "rust-dev-nightly-" + self.build + tarball_suffix
         tarball = os.path.join(rustc_cache, filename)
         if not os.path.exists(tarball):
-            get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=False)
+            get(
+                base,
+                "{}/{}".format(url, filename),
+                tarball,
+                self.checksums_sha256,
+                verbose=self.verbose,
+                do_verify=False,
+            )
         unpack(tarball, tarball_suffix, self.llvm_root(),
                 match="rust-dev",
                 verbose=self.verbose)
@@ -816,7 +827,7 @@ class RustBuild(object):
 
     def rustfmt(self):
         """Return config path for rustfmt"""
-        if not self.rustfmt_channel:
+        if self.stage0_rustfmt is None:
             return None
         return self.program_config('rustfmt')
 
@@ -1040,19 +1051,12 @@ class RustBuild(object):
             self.update_submodule(module[0], module[1], recorded_submodules)
         print("Submodules updated in %.2f seconds" % (time() - start_time))
 
-    def set_normal_environment(self):
+    def set_dist_environment(self, url):
         """Set download URL for normal environment"""
         if 'RUSTUP_DIST_SERVER' in os.environ:
             self._download_url = os.environ['RUSTUP_DIST_SERVER']
         else:
-            self._download_url = 'https://static.rust-lang.org'
-
-    def set_dev_environment(self):
-        """Set download URL for development environment"""
-        if 'RUSTUP_DEV_DIST_SERVER' in os.environ:
-            self._download_url = os.environ['RUSTUP_DEV_DIST_SERVER']
-        else:
-            self._download_url = 'https://dev-static.rust-lang.org'
+            self._download_url = url
 
     def check_vendored_status(self):
         """Check that vendoring is configured properly"""
@@ -1161,17 +1165,14 @@ def bootstrap(help_triggered):
     build_dir = build.get_toml('build-dir', 'build') or 'build'
     build.build_dir = os.path.abspath(build_dir.replace("$ROOT", build.rust_root))
 
-    data = stage0_data(build.rust_root)
-    build.date = data['date']
-    build.rustc_channel = data['rustc']
-
-    if "rustfmt" in data:
-        build.rustfmt_channel = data['rustfmt']
+    with open(os.path.join(build.rust_root, "src", "stage0.json")) as f:
+        data = json.load(f)
+    build.checksums_sha256 = data["checksums_sha256"]
+    build.stage0_compiler = Stage0Toolchain(data["compiler"])
+    if data.get("rustfmt") is not None:
+        build.stage0_rustfmt = Stage0Toolchain(data["rustfmt"])
 
-    if 'dev' in data:
-        build.set_dev_environment()
-    else:
-        build.set_normal_environment()
+    build.set_dist_environment(data["dist_server"])
 
     build.build = args.build or build.build_triple()
     build.update_submodules()
diff --git a/src/bootstrap/bootstrap_test.py b/src/bootstrap/bootstrap_test.py
index 61507114159..7bffc1c1520 100644
--- a/src/bootstrap/bootstrap_test.py
+++ b/src/bootstrap/bootstrap_test.py
@@ -13,38 +13,18 @@ from shutil import rmtree
 import bootstrap
 
 
-class Stage0DataTestCase(unittest.TestCase):
-    """Test Case for stage0_data"""
-    def setUp(self):
-        self.rust_root = tempfile.mkdtemp()
-        os.mkdir(os.path.join(self.rust_root, "src"))
-        with open(os.path.join(self.rust_root, "src",
-                               "stage0.txt"), "w") as stage0:
-            stage0.write("#ignore\n\ndate: 2017-06-15\nrustc: beta\ncargo: beta\nrustfmt: beta")
-
-    def tearDown(self):
-        rmtree(self.rust_root)
-
-    def test_stage0_data(self):
-        """Extract data from stage0.txt"""
-        expected = {"date": "2017-06-15", "rustc": "beta", "cargo": "beta", "rustfmt": "beta"}
-        data = bootstrap.stage0_data(self.rust_root)
-        self.assertDictEqual(data, expected)
-
-
 class VerifyTestCase(unittest.TestCase):
     """Test Case for verify"""
     def setUp(self):
         self.container = tempfile.mkdtemp()
         self.src = os.path.join(self.container, "src.txt")
-        self.sums = os.path.join(self.container, "sums")
         self.bad_src = os.path.join(self.container, "bad.txt")
         content = "Hello world"
 
+        self.expected = hashlib.sha256(content.encode("utf-8")).hexdigest()
+
         with open(self.src, "w") as src:
             src.write(content)
-        with open(self.sums, "w") as sums:
-            sums.write(hashlib.sha256(content.encode("utf-8")).hexdigest())
         with open(self.bad_src, "w") as bad:
             bad.write("Hello!")
 
@@ -53,11 +33,11 @@ class VerifyTestCase(unittest.TestCase):
 
     def test_valid_file(self):
         """Check if the sha256 sum of the given file is valid"""
-        self.assertTrue(bootstrap.verify(self.src, self.sums, False))
+        self.assertTrue(bootstrap.verify(self.src, self.expected, False))
 
     def test_invalid_file(self):
         """Should verify that the file is invalid"""
-        self.assertFalse(bootstrap.verify(self.bad_src, self.sums, False))
+        self.assertFalse(bootstrap.verify(self.bad_src, self.expected, False))
 
 
 class ProgramOutOfDate(unittest.TestCase):
@@ -99,7 +79,6 @@ if __name__ == '__main__':
     TEST_LOADER = unittest.TestLoader()
     SUITE.addTest(doctest.DocTestSuite(bootstrap))
     SUITE.addTests([
-        TEST_LOADER.loadTestsFromTestCase(Stage0DataTestCase),
         TEST_LOADER.loadTestsFromTestCase(VerifyTestCase),
         TEST_LOADER.loadTestsFromTestCase(ProgramOutOfDate)])
 
diff --git a/src/bootstrap/builder.rs b/src/bootstrap/builder.rs
index 5911309a044..0a6ed2f49b7 100644
--- a/src/bootstrap/builder.rs
+++ b/src/bootstrap/builder.rs
@@ -523,7 +523,7 @@ impl<'a> Builder<'a> {
                 install::Src,
                 install::Rustc
             ),
-            Kind::Run => describe!(run::ExpandYamlAnchors, run::BuildManifest),
+            Kind::Run => describe!(run::ExpandYamlAnchors, run::BuildManifest, run::BumpStage0),
         }
     }
 
diff --git a/src/bootstrap/lib.rs b/src/bootstrap/lib.rs
index 3d56650f775..a4735d54be0 100644
--- a/src/bootstrap/lib.rs
+++ b/src/bootstrap/lib.rs
@@ -31,7 +31,7 @@
 //! When you execute `x.py build`, the steps executed are:
 //!
 //! * First, the python script is run. This will automatically download the
-//!   stage0 rustc and cargo according to `src/stage0.txt`, or use the cached
+//!   stage0 rustc and cargo according to `src/stage0.json`, or use the cached
 //!   versions if they're available. These are then used to compile rustbuild
 //!   itself (using Cargo). Finally, control is then transferred to rustbuild.
 //!
diff --git a/src/bootstrap/run.rs b/src/bootstrap/run.rs
index 7c64e5a0aad..11b393857e7 100644
--- a/src/bootstrap/run.rs
+++ b/src/bootstrap/run.rs
@@ -82,3 +82,24 @@ impl Step for BuildManifest {
         builder.run(&mut cmd);
     }
 }
+
+#[derive(Debug, PartialOrd, Ord, Copy, Clone, Hash, PartialEq, Eq)]
+pub struct BumpStage0;
+
+impl Step for BumpStage0 {
+    type Output = ();
+    const ONLY_HOSTS: bool = true;
+
+    fn should_run(run: ShouldRun<'_>) -> ShouldRun<'_> {
+        run.path("src/tools/bump-stage0")
+    }
+
+    fn make_run(run: RunConfig<'_>) {
+        run.builder.ensure(BumpStage0);
+    }
+
+    fn run(self, builder: &Builder<'_>) -> Self::Output {
+        let mut cmd = builder.tool_cmd(Tool::BumpStage0);
+        builder.run(&mut cmd);
+    }
+}
diff --git a/src/bootstrap/sanity.rs b/src/bootstrap/sanity.rs
index 74e50c60610..d7db2cef24f 100644
--- a/src/bootstrap/sanity.rs
+++ b/src/bootstrap/sanity.rs
@@ -15,7 +15,7 @@ use std::fs;
 use std::path::PathBuf;
 use std::process::Command;
 
-use build_helper::{output, t};
+use build_helper::output;
 
 use crate::cache::INTERNER;
 use crate::config::Target;
@@ -227,14 +227,4 @@ $ pacman -R cmake && pacman -S mingw-w64-x86_64-cmake
     if let Some(ref s) = build.config.ccache {
         cmd_finder.must_have(s);
     }
-
-    if build.config.channel == "stable" {
-        let stage0 = t!(fs::read_to_string(build.src.join("src/stage0.txt")));
-        if stage0.contains("\ndev:") {
-            panic!(
-                "bootstrapping from a dev compiler in a stable release, but \
-                    should only be bootstrapping from a released compiler!"
-            );
-        }
-    }
 }
diff --git a/src/bootstrap/tool.rs b/src/bootstrap/tool.rs
index f5e3f61dcc8..c0358946385 100644
--- a/src/bootstrap/tool.rs
+++ b/src/bootstrap/tool.rs
@@ -377,6 +377,7 @@ bootstrap_tool!(
     LintDocs, "src/tools/lint-docs", "lint-docs";
     JsonDocCk, "src/tools/jsondocck", "jsondocck";
     HtmlChecker, "src/tools/html-checker", "html-checker";
+    BumpStage0, "src/tools/bump-stage0", "bump-stage0";
 );
 
 #[derive(Debug, Copy, Clone, Hash, PartialEq, Eq, Ord, PartialOrd)]