From 6dbac3f09e67c853f343df9d75a7eb213f16c959 Mon Sep 17 00:00:00 2001 From: Jed Brown Date: Wed, 26 Feb 2025 20:06:25 -0700 Subject: add nvptx_target_feature Add target features for sm_* and ptx*, both of which form a partial order, but cannot be combined to a single partial order. These mirror the LLVM target features, but we do not provide LLVM target processors (which imply both an sm_* and ptx* feature). Add some documentation for the nvptx target. --- .../src/platform-support/nvptx64-nvidia-cuda.md | 40 ++++++++++++++++++++++ 1 file changed, 40 insertions(+) (limited to 'src') diff --git a/src/doc/rustc/src/platform-support/nvptx64-nvidia-cuda.md b/src/doc/rustc/src/platform-support/nvptx64-nvidia-cuda.md index 106ec562bfc..36598982481 100644 --- a/src/doc/rustc/src/platform-support/nvptx64-nvidia-cuda.md +++ b/src/doc/rustc/src/platform-support/nvptx64-nvidia-cuda.md @@ -10,6 +10,46 @@ platform. [@RDambrosio016](https://github.com/RDambrosio016) [@kjetilkjeka](https://github.com/kjetilkjeka) +## Requirements + +This target is `no_std` and will typically be built with crate-type `cdylib` and `-C linker-flavor=llbc`, which generates PTX. +The necessary components for this workflow are: + +- `rustup toolchain add nightly` +- `rustup component add llvm-tools --toolchain nightly` +- `rustup component add llvm-bitcode-linker --toolchain nightly` + +There are two options for using the core library: + +- `rustup component add rust-src --toolchain nightly` and build using `-Z build-std=core`. +- `rustup target add nvptx64-nvidia-cuda --toolchain nightly` + +### Target and features + +It is generally necessary to specify the target, such as `-C target-cpu=sm_89`, because the default is very old. This implies two target features: `sm_89` and `ptx78` (and all preceding features within `sm_*` and `ptx*`). Rust will default to using the oldest PTX version that supports the target processor (see [this table](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes-ptx-release-history)), which maximizes driver compatibility. +One can use `-C target-feature=+ptx80` to choose a later PTX version without changing the target (the default in this case, `ptx78`, requires CUDA driver version 11.8, while `ptx80` would require driver version 12.0). +Later PTX versions may allow more efficient code generation. + +Although Rust follows LLVM in representing `ptx*` and `sm_*` as target features, they should be thought of as having crate granularity, set via (either via `-Ctarget-cpu` and optionally `-Ctarget-feature`). +While the compiler accepts `#[target_feature(enable = "ptx80", enable = "sm_89")]`, it is not supported, may not behave as intended, and may become erroneous in the future. + +## Building Rust kernels + +A `no_std` crate containing one or more functions with `extern "ptx-kernel"` can be compiled to PTX using a command like the following. + +```console +$ RUSTFLAGS='-Ctarget-cpu=sm_89' cargo +nightly rustc --target=nvptx64-nvidia-cuda -Zbuild-std=core --crate-type=cdylib -- -Clinker-flavor=llbc -Zunstable-options +``` + +Intrinsics in `core::arch::nvptx` may use `#[cfg(target_feature = "...")]`, thus it's necessary to use `-Zbuild-std=core` with appropriate `RUSTFLAGS`. The following components are needed for this workflow: + +```console +$ rustup component add rust-src --toolchain nightly +$ rustup component add llvm-tools --toolchain nightly +$ rustup component add llvm-bitcode-linker --toolchain nightly +``` + +