diff options
| author | Jed Brown <jed@jedbrown.org> | 2025-05-21 22:08:51 -0600 |
|---|---|---|
| committer | Jed Brown <jed@jedbrown.org> | 2025-06-21 19:32:47 -0600 |
| commit | 35a485ddd86229101c4c17d9167f23cf75cae644 (patch) | |
| tree | 1bb5c2b878a2e97360910d03c6d191bc73cc350b /compiler/rustc_codegen_llvm/src/llvm_util.rs | |
| parent | 6dbac3f09e67c853f343df9d75a7eb213f16c959 (diff) | |
| download | rust-35a485ddd86229101c4c17d9167f23cf75cae644.tar.gz rust-35a485ddd86229101c4c17d9167f23cf75cae644.zip | |
target-feature: enable rust target features implied by target-cpu
Normally LLVM and rustc agree about what features are implied by
target-cpu, but for NVPTX, LLVM considers sm_* and ptx* features to be
exclusive, which makes sense for codegen purposes. But in Rust, we want
to think of them as:
sm_{sver} means that the target supports the hardware features of sver
ptx{pver} means the driver supports PTX ISA pver
Intrinsics usually require a minimum sm_{sver} and ptx{pver}.
Prior to this commit, -Ctarget-cpu=sm_70 would activate only sm_70 and
ptx60 (the minimum PTX version that supports sm_70, which maximizes
driver compatibility). With this commit, it also activates all the
implied target features (sm_20, ..., sm_62; ptx32, ..., ptx50).
Diffstat (limited to 'compiler/rustc_codegen_llvm/src/llvm_util.rs')
| -rw-r--r-- | compiler/rustc_codegen_llvm/src/llvm_util.rs | 9 |
1 files changed, 3 insertions, 6 deletions
diff --git a/compiler/rustc_codegen_llvm/src/llvm_util.rs b/compiler/rustc_codegen_llvm/src/llvm_util.rs index 202b9641e56..2c1882e24be 100644 --- a/compiler/rustc_codegen_llvm/src/llvm_util.rs +++ b/compiler/rustc_codegen_llvm/src/llvm_util.rs @@ -333,15 +333,12 @@ pub(crate) fn to_llvm_features<'a>(sess: &Session, s: &'a str) -> Option<LLVMFea /// /// We do not have to worry about RUSTC_SPECIFIC_FEATURES here, those are handled outside codegen. pub(crate) fn target_config(sess: &Session) -> TargetConfig { - // Add base features for the target. - // We do *not* add the -Ctarget-features there, and instead duplicate the logic for that below. - // The reason is that if LLVM considers a feature implied but we do not, we don't want that to - // show up in `cfg`. That way, `cfg` is entirely under our control -- except for the handling of - // the target CPU, that is still expanded to target features (with all their implied features) - // by LLVM. let target_machine = create_informational_target_machine(sess, true); let (unstable_target_features, target_features) = cfg_target_feature(sess, |feature| { + // This closure determines whether the target CPU has the feature according to LLVM. We do + // *not* consider the `-Ctarget-feature`s here, as that will be handled later in + // `cfg_target_feature`. if let Some(feat) = to_llvm_features(sess, feature) { // All the LLVM features this expands to must be enabled. for llvm_feature in feat { |
