| author | bors <bors@rust-lang.org> | 2025-02-13 15:27:30 +0000 |
|---|---|---|
| committer | bors <bors@rust-lang.org> | 2025-02-13 15:27:30 +0000 |
| commit | ef7aa51f1cf8d262ff5b29f3d6730c6f83fbad42 (patch) | |
| tree | fe6c6c54ceb04a23066bd26250362618dfa5bb81 | |
| parent | 7cd3b8c8395c48346444885a20ac91a07bdb0f35 (diff) | |
| parent | a75cc61f99c029c22fd100af524de7e2ba600d5e (diff) | |
Auto merge of #136593 - lukas-code:ty-value-perf, r=oli-obk
valtree performance tuning

Summary: This PR makes type checking of code with many type-level constants faster.

After https://github.com/rust-lang/rust/pull/136180 was merged, we observed a small perf regression (https://github.com/rust-lang/rust/pull/136318#issuecomment-2635562821). This happened because that PR introduced additional copies in the fast reject code path for consts, which is very hot for certain crates: https://github.com/rust-lang/rust/blob/6c1d960d88dd3755548b3818630acb63fa98187e/compiler/rustc_type_ir/src/fast_reject.rs#L486-L487

This PR improves the performance again by properly interning the valtrees so that copying and comparing them becomes faster. This will become especially useful with `feature(adt_const_params)`, because the fast reject code then doesn't have to do a deep compare of the valtrees.

Note that we can't just compare the interned consts themselves in the fast reject, because sometimes `'static` lifetimes in the type are replaced with inference variables (due to canonicalization) on one side but not the other.

A less invasive alternative that I considered is simply avoiding the copies introduced by https://github.com/rust-lang/rust/pull/136180 and comparing the valtrees in place (see commit: https://github.com/rust-lang/rust/commit/9e91e50ac5920f0b9b4a3b1e0880c85336ba5c64 / perf results: https://github.com/rust-lang/rust/pull/136593#issuecomment-2642303245); however, that was still measurably slower than interning.

There are some minor regressions in secondary benchmarks. These happen due to changes in memory allocations and seem acceptable to me. The crates that make heavy use of valtrees show no significant changes in memory usage.
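To illustrate why interning makes copying and comparing cheap, here is a minimal hash-consing sketch. It is not rustc's implementation: `Node`, `Interned`, and `Interner` are hypothetical names invented for this example, and `Rc` plus a `HashSet` stand in for the compiler's arena-based interner. The point it shows is that once every distinct tree is stored exactly once, equality of two handles reduces to a pointer comparison and a copy is a reference-count bump rather than a deep clone.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

/// Hypothetical stand-in for a constant value tree: a scalar leaf or a branch of children.
#[derive(Debug, PartialEq, Eq, Hash)]
enum Node {
    Leaf(u64),
    Branch(Vec<Interned>),
}

/// Handle to an interned node. Equality and hashing use pointer identity only.
#[derive(Debug, Clone)]
struct Interned(Rc<Node>);

impl PartialEq for Interned {
    fn eq(&self, other: &Self) -> bool {
        // No deep comparison: two handles are equal iff they point at the same allocation.
        Rc::ptr_eq(&self.0, &other.0)
    }
}

impl Eq for Interned {}

impl Hash for Interned {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the address, consistent with the pointer-based `eq` above.
        (Rc::as_ptr(&self.0) as usize).hash(state);
    }
}

/// Hypothetical interner: stores each structurally distinct node once.
#[derive(Default)]
struct Interner {
    set: HashSet<Rc<Node>>,
}

impl Interner {
    fn intern(&mut self, node: Node) -> Interned {
        // The lookup compares nodes shallowly: leaves by value, branches by the
        // pointer identity of their already-interned children.
        if let Some(existing) = self.set.get(&node) {
            return Interned(Rc::clone(existing));
        }
        let rc = Rc::new(node);
        self.set.insert(Rc::clone(&rc));
        Interned(rc)
    }
}
```

Because children are interned before their parent, two branches built from equal parts resolve to the same handle, so comparing them never walks the tree.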
| -rw-r--r-- | clippy_lints/src/non_copy_const.rs | 7 |
1 files changed, 4 insertions, 3 deletions
diff --git a/clippy_lints/src/non_copy_const.rs b/clippy_lints/src/non_copy_const.rs
index 405bbfc9c6f..f965ab90da2 100644
--- a/clippy_lints/src/non_copy_const.rs
+++ b/clippy_lints/src/non_copy_const.rs
@@ -179,8 +179,8 @@ impl<'tcx> NonCopyConst<'tcx> {
     }
 
     fn is_value_unfrozen_raw_inner(cx: &LateContext<'tcx>, val: ty::ValTree<'tcx>, ty: Ty<'tcx>) -> bool {
-        // No branch that we check (yet) should continue if val isn't a ValTree::Branch
-        let ty::ValTree::Branch(val) = val else { return false };
+        // No branch that we check (yet) should continue if val isn't a branch
+        let Some(val) = val.try_to_branch() else { return false };
         match *ty.kind() {
             // the fact that we have to dig into every structs to search enums
             // leads us to the point checking `UnsafeCell` directly is the only option.
@@ -192,9 +192,10 @@ impl<'tcx> NonCopyConst<'tcx> {
                 .iter()
                 .any(|field| Self::is_value_unfrozen_raw_inner(cx, *field, ty)),
             ty::Adt(def, args) if def.is_enum() => {
-                let Some((&ty::ValTree::Leaf(variant_index), fields)) = val.split_first() else {
+                let Some((&variant_valtree, fields)) = val.split_first() else {
                     return false;
                 };
+                let variant_index = variant_valtree.unwrap_leaf();
                 let variant_index = VariantIdx::from_u32(variant_index.to_u32());
                 fields
                     .iter()
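
The diff above replaces direct pattern matching on enum variants with accessor methods, which is the usual consequence of hiding a value behind an interned handle. The following standalone sketch mimics that shape under stated assumptions: `Tree`, `Kind`, and `split_enum_value` are hypothetical, and the method names `try_to_branch` and `unwrap_leaf` are borrowed from the diff, but their signatures here are guesses and not rustc's actual API.

```rust
use std::rc::Rc;

/// Hypothetical payload hidden behind the handle; callers no longer match on it directly.
#[derive(Debug)]
enum Kind {
    Leaf(u32),
    Branch(Vec<Tree>),
}

/// Hypothetical cheap-to-copy handle, standing in for an interned value tree.
#[derive(Debug, Clone)]
struct Tree(Rc<Kind>);

impl Tree {
    fn leaf(value: u32) -> Tree {
        Tree(Rc::new(Kind::Leaf(value)))
    }

    fn branch(children: Vec<Tree>) -> Tree {
        Tree(Rc::new(Kind::Branch(children)))
    }

    /// Returns the children if this is a branch, `None` for a leaf.
    fn try_to_branch(&self) -> Option<&[Tree]> {
        match &*self.0 {
            Kind::Branch(children) => Some(children.as_slice()),
            Kind::Leaf(_) => None,
        }
    }

    /// Returns the scalar of a leaf; in this sketch it panics on a branch.
    fn unwrap_leaf(&self) -> u32 {
        match &*self.0 {
            Kind::Leaf(value) => *value,
            Kind::Branch(_) => panic!("expected a leaf"),
        }
    }
}

/// Splits an enum-like value into (variant index, fields), mirroring the
/// `split_first` step in the diff. The first child is assumed to be a leaf.
fn split_enum_value(val: &Tree) -> Option<(u32, &[Tree])> {
    let children = val.try_to_branch()?;
    let (variant, fields) = children.split_first()?;
    Some((variant.unwrap_leaf(), fields))
}
```

One design consequence worth noting in this sketch: a `let ... else` pattern on a leaf variant fails softly, whereas an `unwrap`-style accessor panics, so the caller has to be sure the first child really is a leaf.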
