about summary refs log tree commit diff
path: root/src/bootstrap/bootstrap.py
diff options
context:
space:
mode:
authorbors <bors@rust-lang.org>2021-09-06 16:01:17 +0000
committerbors <bors@rust-lang.org>2021-09-06 16:01:17 +0000
commit8ceea01bb442b9746a51b062ce25abbf46d866b2 (patch)
tree4613d9ab5edfc059350cad33b1779632e7f2bab2 /src/bootstrap/bootstrap.py
parent13db8440bbbe42870bc828d4ec3e965b38670277 (diff)
parentea8b1ffe61de8030b7887e859145c951c591a7c9 (diff)
downloadrust-8ceea01bb442b9746a51b062ce25abbf46d866b2.tar.gz
rust-8ceea01bb442b9746a51b062ce25abbf46d866b2.zip
Auto merge of #88362 - pietroalbini:bump-stage0, r=Mark-Simulacrum
Pin bootstrap checksums and add a tool to update it automatically

:warning: :warning: This is just a proactive hardening we're performing on the build system, and it's not prompted by any known compromise. If you're aware of security issues being exploited please [check out our responsible disclosure page](https://www.rust-lang.org/policies/security). :warning: :warning:

---

This PR aims to improve Rust's supply chain security by pinning the checksums of the bootstrap compiler downloaded by `x.py`, preventing a compromised `static.rust-lang.org` from affecting building the compiler. The checksums are stored in `src/stage0.json`, which replaces `src/stage0.txt`. This PR also adds a tool to automatically update the bootstrap compiler.

The changes in this PR were originally discussed in [Zulip](https://zulip-archive.rust-lang.org/stream/241545-t-release/topic/pinning.20stage0.20hashes.html).

## Potential attack

Before this PR, an attacker who wanted to compromise the bootstrap compiler would "just" need to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries and corresponding `.sha256` files to `static.rust-lang.org`.

There is no signature verification in `x.py` as we don't want the build system to depend on GPG. Also, since the checksums were not pinned inside the repository, they were downloaded from `static.rust-lang.org` too: this only protected from accidental changes in `static.rust-lang.org` that didn't change the `*.sha256` files. The attack would allow the attacker to compromise past and future invocations of `x.py`.

## Mitigations introduced in this PR

This PR adds pinned checksums for all the bootstrap components in `src/stage0.json` instead of downloading the checksums from `static.rust-lang.org`. This changes the attack scenario to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries to `static.rust-lang.org`.
3. Land a (reviewed) change in the `rust-lang/rust` repository changing the pinned hashes.

Even with a successful attack, existing clones of the Rust repository won't be affected, and once the attack is detected reverting the pinned hashes changes should be enough to be protected from the attack. This also enables further mitigations to be implemented in following PRs, such as verifying signatures when pinning new checksums (removing the trust on first use aspect of this PR) and adding a check in CI making sure a PR updating the checksum has not been tampered with (see the future improvements section).

## Additional changes

There are additional changes implemented in this PR to enable the mitigation:

* The `src/stage0.txt` file has been replaced with `src/stage0.json`. The reasoning for the change is that there is existing tooling to read and manipulate JSON files compared to the custom format we were using before, and the slight challenge of manually editing JSON files (no comments, no trailing commas) are not a problem thanks to the new `bump-stage0`.

* A new tool has been added to the repository, `bump-stage0`. When invoked, the tool automatically calculates which release should be used as the bootstrap compiler given the current version and channel, gathers all the relevant checksums and updates `src/stage0.json`. The tool can be invoked by running:

  ```
  ./x.py run src/tools/bump-stage0
  ```

* Support for downloading releases from `https://dev-static.rust-lang.org` has been removed, as it's not possible to verify checksums there (it's customary to replace existing artifacts there if a rebuild is warranted). This will require a change to the release process to avoid bumping the bootstrap compiler on beta before the stable release.

## Future improvements

* Add signature verification as part of `bump-stage0`, which would require the attacker to also obtain the release signing keys in order to successfully compromise the bootstrap compiler. This would be fine to add now, as the burden of installing the tool to verify signatures would only be placed on whoever updates the bootstrap compiler, instead of everyone compiling Rust.

* Add a check on CI that ensures the checksums in `src/stage0.json` are the expected ones. If a PR changes the stage0 file CI should also run the `bump-stage0` tool and fail if the output in CI doesn't match the committed file. This prevents the PR author from tweaking the output of the tool manually, which would otherwise be close to impossible for a human to detect.

* Automate creating the PRs bumping the bootstrap compiler, by setting up a scheduled job in GitHub Actions that runs the tool and opens a PR.

* Investigate whether a similar mitigation can be done for "download from CI" components like the prebuilt LLVM.

r? `@Mark-Simulacrum`
Diffstat (limited to 'src/bootstrap/bootstrap.py')
-rw-r--r--src/bootstrap/bootstrap.py139
1 files changed, 70 insertions, 69 deletions
diff --git a/src/bootstrap/bootstrap.py b/src/bootstrap/bootstrap.py
index e34768bc2c9..1f1eca1c76c 100644
--- a/src/bootstrap/bootstrap.py
+++ b/src/bootstrap/bootstrap.py
@@ -4,6 +4,7 @@ import contextlib
 import datetime
 import distutils.version
 import hashlib
+import json
 import os
 import re
 import shutil
@@ -24,19 +25,17 @@ def support_xz():
     except tarfile.CompressionError:
         return False
 
-def get(url, path, verbose=False, do_verify=True):
-    suffix = '.sha256'
-    sha_url = url + suffix
+def get(base, url, path, checksums, verbose=False, do_verify=True):
     with tempfile.NamedTemporaryFile(delete=False) as temp_file:
         temp_path = temp_file.name
-    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as sha_file:
-        sha_path = sha_file.name
 
     try:
         if do_verify:
-            download(sha_path, sha_url, False, verbose)
+            if url not in checksums:
+                raise RuntimeError("src/stage0.json doesn't contain a checksum for {}".format(url))
+            sha256 = checksums[url]
             if os.path.exists(path):
-                if verify(path, sha_path, False):
+                if verify(path, sha256, False):
                     if verbose:
                         print("using already-download file", path)
                     return
@@ -45,23 +44,17 @@ def get(url, path, verbose=False, do_verify=True):
                         print("ignoring already-download file",
                             path, "due to failed verification")
                     os.unlink(path)
-        download(temp_path, url, True, verbose)
-        if do_verify and not verify(temp_path, sha_path, verbose):
+        download(temp_path, "{}/{}".format(base, url), True, verbose)
+        if do_verify and not verify(temp_path, sha256, verbose):
             raise RuntimeError("failed verification")
         if verbose:
             print("moving {} to {}".format(temp_path, path))
         shutil.move(temp_path, path)
     finally:
-        delete_if_present(sha_path, verbose)
-        delete_if_present(temp_path, verbose)
-
-
-def delete_if_present(path, verbose):
-    """Remove the given file if present"""
-    if os.path.isfile(path):
-        if verbose:
-            print("removing", path)
-        os.unlink(path)
+        if os.path.isfile(temp_path):
+            if verbose:
+                print("removing", temp_path)
+            os.unlink(temp_path)
 
 
 def download(path, url, probably_big, verbose):
@@ -98,14 +91,12 @@ def _download(path, url, probably_big, verbose, exception):
             exception=exception)
 
 
-def verify(path, sha_path, verbose):
+def verify(path, expected, verbose):
     """Check if the sha256 sum of the given path is valid"""
     if verbose:
         print("verifying", path)
     with open(path, "rb") as source:
         found = hashlib.sha256(source.read()).hexdigest()
-    with open(sha_path, "r") as sha256sum:
-        expected = sha256sum.readline().split()[0]
     verified = found == expected
     if not verified:
         print("invalid checksum:\n"
@@ -176,15 +167,6 @@ def require(cmd, exit=True):
         sys.exit(1)
 
 
-def stage0_data(rust_root):
-    """Build a dictionary from stage0.txt"""
-    nightlies = os.path.join(rust_root, "src/stage0.txt")
-    with open(nightlies, 'r') as nightlies:
-        lines = [line.rstrip() for line in nightlies
-                 if not line.startswith("#")]
-        return dict([line.split(": ", 1) for line in lines if line])
-
-
 def format_build_time(duration):
     """Return a nicer format for build time
 
@@ -372,13 +354,22 @@ def output(filepath):
     os.rename(tmp, filepath)
 
 
+class Stage0Toolchain:
+    def __init__(self, stage0_payload):
+        self.date = stage0_payload["date"]
+        self.version = stage0_payload["version"]
+
+    def channel(self):
+        return self.version + "-" + self.date
+
+
 class RustBuild(object):
     """Provide all the methods required to build Rust"""
     def __init__(self):
-        self.date = ''
+        self.checksums_sha256 = {}
+        self.stage0_compiler = None
+        self.stage0_rustfmt = None
         self._download_url = ''
-        self.rustc_channel = ''
-        self.rustfmt_channel = ''
         self.build = ''
         self.build_dir = ''
         self.clean = False
@@ -402,11 +393,10 @@ class RustBuild(object):
         will move all the content to the right place.
         """
         if rustc_channel is None:
-            rustc_channel = self.rustc_channel
-        rustfmt_channel = self.rustfmt_channel
+            rustc_channel = self.stage0_compiler.version
         bin_root = self.bin_root(stage0)
 
-        key = self.date
+        key = self.stage0_compiler.date
         if not stage0:
             key += str(self.rustc_commit)
         if self.rustc(stage0).startswith(bin_root) and \
@@ -445,19 +435,23 @@ class RustBuild(object):
 
         if self.rustfmt() and self.rustfmt().startswith(bin_root) and (
             not os.path.exists(self.rustfmt())
-            or self.program_out_of_date(self.rustfmt_stamp(), self.rustfmt_channel)
+            or self.program_out_of_date(
+                self.rustfmt_stamp(),
+                "" if self.stage0_rustfmt is None else self.stage0_rustfmt.channel()
+            )
         ):
-            if rustfmt_channel:
+            if self.stage0_rustfmt is not None:
                 tarball_suffix = '.tar.xz' if support_xz() else '.tar.gz'
-                [channel, date] = rustfmt_channel.split('-', 1)
-                filename = "rustfmt-{}-{}{}".format(channel, self.build, tarball_suffix)
+                filename = "rustfmt-{}-{}{}".format(
+                    self.stage0_rustfmt.version, self.build, tarball_suffix,
+                )
                 self._download_component_helper(
-                    filename, "rustfmt-preview", tarball_suffix, key=date
+                    filename, "rustfmt-preview", tarball_suffix, key=self.stage0_rustfmt.date
                 )
                 self.fix_bin_or_dylib("{}/bin/rustfmt".format(bin_root))
                 self.fix_bin_or_dylib("{}/bin/cargo-fmt".format(bin_root))
                 with output(self.rustfmt_stamp()) as rustfmt_stamp:
-                    rustfmt_stamp.write(self.rustfmt_channel)
+                    rustfmt_stamp.write(self.stage0_rustfmt.channel())
 
         # Avoid downloading LLVM twice (once for stage0 and once for the master rustc)
         if self.downloading_llvm() and stage0:
@@ -518,7 +512,7 @@ class RustBuild(object):
     ):
         if key is None:
             if stage0:
-                key = self.date
+                key = self.stage0_compiler.date
             else:
                 key = self.rustc_commit
         cache_dst = os.path.join(self.build_dir, "cache")
@@ -527,12 +521,21 @@ class RustBuild(object):
             os.makedirs(rustc_cache)
 
         if stage0:
-            url = "{}/dist/{}".format(self._download_url, key)
+            base = self._download_url
+            url = "dist/{}".format(key)
         else:
-            url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(self.rustc_commit)
+            base = "https://ci-artifacts.rust-lang.org"
+            url = "rustc-builds/{}".format(self.rustc_commit)
         tarball = os.path.join(rustc_cache, filename)
         if not os.path.exists(tarball):
-            get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=stage0)
+            get(
+                base,
+                "{}/{}".format(url, filename),
+                tarball,
+                self.checksums_sha256,
+                verbose=self.verbose,
+                do_verify=stage0,
+            )
         unpack(tarball, tarball_suffix, self.bin_root(stage0), match=pattern, verbose=self.verbose)
 
     def _download_ci_llvm(self, llvm_sha, llvm_assertions):
@@ -542,7 +545,8 @@ class RustBuild(object):
         if not os.path.exists(rustc_cache):
             os.makedirs(rustc_cache)
 
-        url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(llvm_sha)
+        base = "https://ci-artifacts.rust-lang.org"
+        url = "rustc-builds/{}".format(llvm_sha)
         if llvm_assertions:
             url = url.replace('rustc-builds', 'rustc-builds-alt')
         # ci-artifacts are only stored as .xz, not .gz
@@ -554,7 +558,14 @@ class RustBuild(object):
         filename = "rust-dev-nightly-" + self.build + tarball_suffix
         tarball = os.path.join(rustc_cache, filename)
         if not os.path.exists(tarball):
-            get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=False)
+            get(
+                base,
+                "{}/{}".format(url, filename),
+                tarball,
+                self.checksums_sha256,
+                verbose=self.verbose,
+                do_verify=False,
+            )
         unpack(tarball, tarball_suffix, self.llvm_root(),
                 match="rust-dev",
                 verbose=self.verbose)
@@ -816,7 +827,7 @@ class RustBuild(object):
 
     def rustfmt(self):
         """Return config path for rustfmt"""
-        if not self.rustfmt_channel:
+        if self.stage0_rustfmt is None:
             return None
         return self.program_config('rustfmt')
 
@@ -1040,19 +1051,12 @@ class RustBuild(object):
             self.update_submodule(module[0], module[1], recorded_submodules)
         print("Submodules updated in %.2f seconds" % (time() - start_time))
 
-    def set_normal_environment(self):
+    def set_dist_environment(self, url):
         """Set download URL for normal environment"""
         if 'RUSTUP_DIST_SERVER' in os.environ:
             self._download_url = os.environ['RUSTUP_DIST_SERVER']
         else:
-            self._download_url = 'https://static.rust-lang.org'
-
-    def set_dev_environment(self):
-        """Set download URL for development environment"""
-        if 'RUSTUP_DEV_DIST_SERVER' in os.environ:
-            self._download_url = os.environ['RUSTUP_DEV_DIST_SERVER']
-        else:
-            self._download_url = 'https://dev-static.rust-lang.org'
+            self._download_url = url
 
     def check_vendored_status(self):
         """Check that vendoring is configured properly"""
@@ -1161,17 +1165,14 @@ def bootstrap(help_triggered):
     build_dir = build.get_toml('build-dir', 'build') or 'build'
     build.build_dir = os.path.abspath(build_dir.replace("$ROOT", build.rust_root))
 
-    data = stage0_data(build.rust_root)
-    build.date = data['date']
-    build.rustc_channel = data['rustc']
-
-    if "rustfmt" in data:
-        build.rustfmt_channel = data['rustfmt']
+    with open(os.path.join(build.rust_root, "src", "stage0.json")) as f:
+        data = json.load(f)
+    build.checksums_sha256 = data["checksums_sha256"]
+    build.stage0_compiler = Stage0Toolchain(data["compiler"])
+    if data.get("rustfmt") is not None:
+        build.stage0_rustfmt = Stage0Toolchain(data["rustfmt"])
 
-    if 'dev' in data:
-        build.set_dev_environment()
-    else:
-        build.set_normal_environment()
+    build.set_dist_environment(data["dist_server"])
 
     build.build = args.build or build.build_triple()
     build.update_submodules()