path: root/compiler/rustc_data_structures/src/lib.rs
Age  Commit message  Author  Lines
2022-05-06  Auto merge of #94598 - scottmcm:prefix-free-hasher-methods, r=Amanieu  bors  -0/+1
Add a dedicated length-prefixing method to `Hasher`

This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths

This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.

Fixes #94026

r? rust-lang/libs

---

The core of this change is the following two new methods on `Hasher`:

```rust
pub trait Hasher {
    /// Writes a length prefix into this hasher, as part of being prefix-free.
    ///
    /// If you're implementing [`Hash`] for a custom collection, call this before
    /// writing its contents to this `Hasher`.  That way
    /// `(collection![1, 2, 3], collection![4, 5])` and
    /// `(collection![1, 2], collection![3, 4, 5])` will provide different
    /// sequences of values to the `Hasher`
    ///
    /// The `impl<T> Hash for [T]` includes a call to this method, so if you're
    /// hashing a slice (or array or vector) via its `Hash::hash` method,
    /// you should **not** call this yourself.
    ///
    /// This method is only for providing domain separation.  If you want to
    /// hash a `usize` that represents part of the *data*, then it's important
    /// that you pass it to [`Hasher::write_usize`] instead of to this method.
    ///
    /// # Examples
    ///
    /// ```
    /// #![feature(hasher_prefixfree_extras)]
    /// # // Stubs to make the `impl` below pass the compiler
    /// # struct MyCollection<T>(Option<T>);
    /// # impl<T> MyCollection<T> {
    /// #     fn len(&self) -> usize { todo!() }
    /// # }
    /// # impl<'a, T> IntoIterator for &'a MyCollection<T> {
    /// #     type Item = T;
    /// #     type IntoIter = std::iter::Empty<T>;
    /// #     fn into_iter(self) -> Self::IntoIter { todo!() }
    /// # }
    ///
    /// use std::hash::{Hash, Hasher};
    /// impl<T: Hash> Hash for MyCollection<T> {
    ///     fn hash<H: Hasher>(&self, state: &mut H) {
    ///         state.write_length_prefix(self.len());
    ///         for elt in self {
    ///             elt.hash(state);
    ///         }
    ///     }
    /// }
    /// ```
    ///
    /// # Note to Implementers
    ///
    /// If you've decided that your `Hasher` is willing to be susceptible to
    /// Hash-DoS attacks, then you might consider skipping hashing some or all
    /// of the `len` provided in the name of increased performance.
    #[inline]
    #[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
    fn write_length_prefix(&mut self, len: usize) {
        self.write_usize(len);
    }

    /// Writes a single `str` into this hasher.
    ///
    /// If you're implementing [`Hash`], you generally do not need to call this,
    /// as the `impl Hash for str` does, so you can just use that.
    ///
    /// This includes the domain separator for prefix-freedom, so you should
    /// **not** call `Self::write_length_prefix` before calling this.
    ///
    /// # Note to Implementers
    ///
    /// The default implementation of this method includes a call to
    /// [`Self::write_length_prefix`], so if your implementation of `Hasher`
    /// doesn't care about prefix-freedom and you've thus overridden
    /// that method to do nothing, there's no need to override this one.
    ///
    /// This method is available to be overridden separately from the others
    /// as `str` being UTF-8 means that it never contains `0xFF` bytes, which
    /// can be used to provide prefix-freedom cheaper than hashing a length.
    ///
    /// For example, if your `Hasher` works byte-by-byte (perhaps by accumulating
    /// them into a buffer), then you can hash the bytes of the `str` followed
    /// by a single `0xFF` byte.
    ///
    /// If your `Hasher` works in chunks, you can also do this by being careful
    /// about how you pad partial chunks.  If the chunks are padded with `0x00`
    /// bytes then just hashing an extra `0xFF` byte doesn't necessarily
    /// provide prefix-freedom, as `"ab"` and `"ab\u{0}"` would likely hash
    /// the same sequence of chunks.  But if you pad with `0xFF` bytes instead,
    /// ensuring at least one padding byte, then it can often provide
    /// prefix-freedom cheaper than hashing the length would.
    #[inline]
    #[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
    fn write_str(&mut self, s: &str) {
        self.write_length_prefix(s.len());
        self.write(s.as_bytes());
    }
}
```

With updates to the `Hash` implementations for slices and containers to call `write_length_prefix` instead of `write_usize`.

`write_str` defaults to using `write_length_prefix` since, as was pointed out in the issue, the `write_u8(0xFF)` approach is insufficient for hashers that work in chunks, as those would hash `"a\u{0}"` and `"a"` to the same thing. But since `SipHash` works byte-wise (there's an internal buffer to accumulate bytes until a full chunk is available) it overrides `write_str` to continue to use the add-non-UTF-8-byte approach.

---

Compatibility:

Because the default implementation of `write_length_prefix` calls `write_usize`, the changed hash implementation for slices will do the same thing the old one did on existing `Hasher`s.
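To see why the length prefix matters, here is a minimal sketch that runs on stable Rust. `RecordingHasher`, `recorded`, and `raw_concat` are hypothetical helpers invented for this illustration; they are not part of the PR. The hasher only records the bytes it is fed, so the effect of the prefix written by the slice `Hash` impl is directly visible.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical hasher that records every byte it is fed instead of mixing it.
#[derive(Default)]
struct RecordingHasher {
    bytes: Vec<u8>,
}

impl Hasher for RecordingHasher {
    fn write(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    fn finish(&self) -> u64 {
        // Not a real hash; only the recorded stream matters in this sketch.
        self.bytes.len() as u64
    }
}

/// Returns the byte stream that `value`'s `Hash` impl feeds to the hasher.
fn recorded<T: Hash + ?Sized>(value: &T) -> Vec<u8> {
    let mut h = RecordingHasher::default();
    value.hash(&mut h);
    h.bytes
}

/// Feeds two byte slices to the hasher with no length prefix at all.
fn raw_concat(first: &[u8], second: &[u8]) -> Vec<u8> {
    let mut h = RecordingHasher::default();
    h.write(first);
    h.write(second);
    h.bytes
}

fn main() {
    let a = (&[1u8, 2, 3][..], &[4u8, 5][..]);
    let b = (&[1u8, 2][..], &[3u8, 4, 5][..]);

    // The slice `Hash` impl writes each length first, so the streams differ.
    assert_ne!(recorded(&a), recorded(&b));

    // Without any prefix, both pairs collapse to the same stream 1,2,3,4,5.
    assert_eq!(raw_concat(a.0, a.1), raw_concat(b.0, b.1));
    println!("prefixed streams differ; unprefixed streams collide");
}
```

A `Hasher` that does not care about Hash-DoS resistance could, per the note to implementers above, override the length-prefix method to skip that work and accept collisions like the second case.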
2022-05-06  Add a dedicated length-prefixing method to `Hasher`  Scott McMurray  -0/+1
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths

This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
2022-05-04  Stabilize `bool::then_some`  Josh Triplett  -1/+0
2022-04-16  Auto merge of #95899 - petrochenkov:modchild2, r=cjgillot  bors  -0/+25
rustc_metadata: Do not encode unnecessary module children

This should remove the syntax context shift and the special case for `ExternCrate` in decoder in https://github.com/rust-lang/rust/pull/95880.

This PR also shifts some work from decoding to encoding, which is typically useful for performance (but probably not much in this case).

r? `@cjgillot`
2022-04-14  make unaligned_references lint deny-by-default  Ralf Jung  -1/+0
2022-04-13  rustc_metadata: Do not encode unnecessary module children  Vadim Petrochenkov  -0/+25
2022-03-04  Add SmallStr  Tomasz Miąsko  -0/+1
2022-02-25  Switch bootstrap cfgs  Mark Rousskov  -1/+1
2022-02-23  Introduce `ChunkedBitSet` and use it for some dataflow analyses.  Nicholas Nethercote  -0/+2
This reduces peak memory usage significantly for some programs with very large functions, such as:
- `keccak`, `unicode_normalization`, and `match-stress-enum`, from the `rustc-perf` benchmark suite;
- `http-0.2.6` from crates.io.

The new type is used in the analyses where the bitsets can get huge (e.g. 10s of thousands of bits): `MaybeInitializedPlaces`, `MaybeUninitializedPlaces`, and `EverInitializedPlaces`.

Some refactoring was required in `rustc_mir_dataflow`. All existing analysis domains are either `BitSet` or a trivial wrapper around `BitSet`, and access in a few places is done via `Borrow<BitSet>` or `BorrowMut<BitSet>`. Now that some of these domains are `ChunkedBitSet`, that no longer works. So this commit replaces the `Borrow`/`BorrowMut` usage with a new trait `BitSetExt` containing the needed bitset operations. The impls just forward these to the underlying bitset type. This required fiddling with trait bounds in a few places.

The commit also:
- Moves `static_assert_size` from `rustc_data_structures` to `rustc_index` so it can be used in the latter; the former now re-exports it so existing users are unaffected.
- Factors out some common "clear excess bits in the final word" functionality in `bit_set.rs`.
- Uses `fill` in a few places instead of loops.
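The "clear excess bits in the final word" step mentioned above can be sketched as follows. This is only an illustration of the idea, not the code from `bit_set.rs`: the function name `clear_excess_bits`, the 64-bit word size, and the assumption that `words` holds exactly enough words for `domain_size` bits are all choices made for this example.

```rust
/// Number of bits per storage word (an assumption of this sketch).
const WORD_BITS: usize = u64::BITS as usize;

/// Zeroes the bits of the final word that lie at or beyond `domain_size`.
///
/// Assumes `words.len()` is exactly the number of words needed to hold
/// `domain_size` bits.
fn clear_excess_bits(domain_size: usize, words: &mut [u64]) {
    let used_in_last = domain_size % WORD_BITS;
    // When the domain fills the last word completely there is nothing to clear.
    if used_in_last > 0 {
        if let Some(last) = words.last_mut() {
            // Keep only the low `used_in_last` bits.
            *last &= (1u64 << used_in_last) - 1;
        }
    }
}

fn main() {
    // A 70-bit domain needs two words; only 6 bits of the second are meaningful.
    let mut words = [0u64; 2];
    // `fill` sets every word at once, in place of a manual loop.
    words.fill(u64::MAX);
    clear_excess_bits(70, &mut words);
    assert_eq!(words, [u64::MAX, 0b11_1111]);
    println!("{:#x?}", words);
}
```

Keeping the unused high bits at zero means operations such as counting set bits or comparing two sets can work on whole words without masking each time.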
2022-02-19  Adopt let else in more places  est31  -0/+1
2022-02-15  Rename `PtrKey` as `Interned` and improve it.  Nicholas Nethercote  -1/+2
In particular, there's now more protection against incorrect usage, because you can only create one via `Interned::new_unchecked`, which makes it more obvious that you must be careful. There are also some tests.
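As a rough sketch of the idea, assuming the wrapper compares and hashes by address (the commit message does not spell this out), a type like this might look as follows. The module layout, the `get` accessor, and the trait impls are illustrative assumptions, not the actual `rustc_data_structures` code; only the name `Interned::new_unchecked` comes from the commit message.

```rust
mod interned {
    use std::hash::{Hash, Hasher};
    use std::ptr;

    /// Wrapper around a reference whose identity is its address.
    /// The field is private, so the only way to build one from outside
    /// this module is `new_unchecked`.
    pub struct Interned<'a, T>(&'a T);

    impl<'a, T> Interned<'a, T> {
        /// "Unchecked" because the caller must ensure `t` really is the
        /// unique interned copy of its value; nothing here verifies that.
        pub fn new_unchecked(t: &'a T) -> Self {
            Interned(t)
        }

        /// Hypothetical accessor for the wrapped reference.
        pub fn get(&self) -> &'a T {
            self.0
        }
    }

    impl<'a, T> PartialEq for Interned<'a, T> {
        fn eq(&self, other: &Self) -> bool {
            // Pointer comparison: cheap, but only valid for interned values.
            ptr::eq(self.0, other.0)
        }
    }

    impl<'a, T> Eq for Interned<'a, T> {}

    impl<'a, T> Hash for Interned<'a, T> {
        fn hash<H: Hasher>(&self, state: &mut H) {
            // Hash the address, consistent with `eq` above.
            ptr::hash(self.0, state)
        }
    }
}

use interned::Interned;

fn main() {
    let a = String::from("x");
    let b = String::from("x");

    let a1 = Interned::new_unchecked(&a);
    let a2 = Interned::new_unchecked(&a);
    let b1 = Interned::new_unchecked(&b);

    assert!(a1 == a2); // same address
    assert!(a1 != b1); // equal contents, different address
    println!("{} {}", a1.get(), b1.get());
}
```

The last assertion shows why care is needed: two equal values that were never deduplicated compare as different, which is the kind of misuse the explicit `new_unchecked` name is meant to flag.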
2022-02-01  add a rustc::query_stability lint  lcnr  -0/+1
2021-12-07  Make IdFunctor::try_map_id panic-safe  Alan Egerton  -0/+1
2021-12-05  Stop enabling `in_band_lifetimes` in rustc_data_structures  Scott McMurray  -2/+0
There's a conversation in the tracking issue about possibly unaccepting `in_band_lifetimes`, but it's used heavily in the compiler, and thus there'd need to be a bunch of PRs like this if that were to happen. So here's one to see how much of an impact it has. (Oh, and I removed `nll` while I was here too, since it didn't seem needed. Let me know if I should put that back.)
2021-12-02  Remove no-longer used `IdFunctor::map_id`  Alan Egerton  -1/+0
2021-11-27  Delegate from `map_id` to `try_map_id`  Alan Egerton  -0/+1
2021-10-28  Revert "Add rustc lint, warning when iterating over hashmaps"  Mark Rousskov  -1/+0
2021-10-25  Auto merge of #90042 - pietroalbini:1.56-master, r=Mark-Simulacrum  bors  -1/+0
Bump bootstrap compiler to 1.57

Fixes https://github.com/rust-lang/rust/issues/90152

r? `@Mark-Simulacrum`
2021-10-23  update cfg(bootstrap)  Pietro Albini  -1/+0
2021-10-15  allow `potential_query_instability` everywhere  lcnr  -0/+1
2021-10-04  Rollup merge of #89508 - jhpratt:stabilize-const_panic, r=joshtriplett  Jubilee  -1/+1
Stabilize `const_panic`

Closes #51999

FCP completed in #89006

`@rustbot` label +A-const-eval +A-const-fn +T-lang

cc `@oli-obk` for review (not `r?`'ing as not on lang team)
2021-10-04  Stabilize `const_panic`  Jacob Pratt  -1/+1
2021-10-02  Remove various unused feature gates  bjorn3  -1/+0
2021-09-17  Stabilize `Iterator::map_while`  Maybe Waffle  -1/+0
2021-09-10  rustc: Remove local variable IDs from `Export`s  Vadim Petrochenkov  -0/+1
Local variables can never be exported.
2021-09-08  Bump stage0 compiler to 1.56  Mark Rousskov  -2/+1
2021-07-27  Use type_alias_impl_trait instead of min in compiler and lib  Santiago Pastorino  -1/+2
2021-07-23  Sort features alphabetically  Yuki Okushi  -13/+13
2021-07-23  Use `map_while` instead of `take_while` + `map`  Yuki Okushi  -0/+2
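For readers unfamiliar with the pattern, the following sketch contrasts the two forms on a made-up example. The function names and the token-parsing scenario are invented for illustration and are not taken from the compiler change.

```rust
/// Parses the leading run of numeric tokens using `map_while`:
/// the closure both tests and converts, stopping at the first `None`.
fn leading_numbers(tokens: &[&str]) -> Vec<u32> {
    tokens
        .iter()
        .map_while(|t| t.parse::<u32>().ok())
        .collect()
}

/// The older `take_while` + `map` form does the same work twice:
/// once to test each token and once more to convert it.
fn leading_numbers_old(tokens: &[&str]) -> Vec<u32> {
    tokens
        .iter()
        .take_while(|t| t.parse::<u32>().is_ok())
        .map(|t| t.parse::<u32>().unwrap())
        .collect()
}

fn main() {
    let tokens = ["1", "2", "x", "3"];
    // Both stop at "x", so the trailing "3" is never reached.
    assert_eq!(leading_numbers(&tokens), vec![1, 2]);
    assert_eq!(leading_numbers(&tokens), leading_numbers_old(&tokens));
    println!("{:?}", leading_numbers(&tokens));
}
```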
2021-07-22  Fix VecMap::iter_mut  Oli Scherer  -0/+1
It used to allow you to mutate the key, even though that can invalidate the map by creating duplicate keys.
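A minimal sketch of the kind of signature that prevents this, assuming a `VecMap` backed by a `Vec<(K, V)>`: the `insert` and `get` methods here are simplified stand-ins written for this example, and the real type may differ in its details.

```rust
/// Simplified map that stores its entries in a vector of pairs.
struct VecMap<K, V>(Vec<(K, V)>);

impl<K: PartialEq, V> VecMap<K, V> {
    fn new() -> Self {
        VecMap(Vec::new())
    }

    /// Inserts or overwrites, keeping keys unique.
    fn insert(&mut self, k: K, v: V) -> Option<V> {
        match self.0.iter().position(|(key, _)| *key == k) {
            Some(i) => Some(std::mem::replace(&mut self.0[i].1, v)),
            None => {
                self.0.push((k, v));
                None
            }
        }
    }

    fn get(&self, k: &K) -> Option<&V> {
        self.0.iter().find(|(key, _)| key == k).map(|(_, v)| v)
    }

    /// Yields `(&K, &mut V)`: values can be changed, keys cannot.
    /// Handing out `&mut (K, V)` instead would let a caller rewrite a key
    /// and silently create duplicates.
    fn iter_mut(&mut self) -> impl Iterator<Item = (&K, &mut V)> {
        self.0.iter_mut().map(|(k, v)| (&*k, v))
    }
}

fn main() {
    let mut m = VecMap::new();
    m.insert("a", 1);
    m.insert("b", 2);

    for (_key, value) in m.iter_mut() {
        *value *= 10; // fine: values are mutable
        // *_key = "b"; // would not compile: the key is behind `&K`
    }

    assert_eq!(m.get(&"a"), Some(&10));
    assert_eq!(m.get(&"b"), Some(&20));
}
```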
2021-06-11  Auto merge of #85885 - bjorn3:remove_box_region, r=cjgillot  bors  -2/+0
Don't use a generator for BoxedResolver

The generator is non-trivial and requires unsafe code anyway. Using regular unsafe code without a generator is much easier to follow.

Based on #85810 as it touches rustc_interface too.
2021-06-08  Inline the rest of box_region  bjorn3  -2/+0
2021-06-07  Add VecMap to rustc_data_structures  Santiago Pastorino  -0/+1
2021-05-31  Remove unused feature gates  bjorn3  -1/+0
2021-05-31  Remove unnecessary unboxed_closures feature usage  bjorn3  -2/+0
It has been possible to clone closures for a while now
2021-03-30  Add an Mmap wrapper to rustc_data_structures  bjorn3  -0/+1
This wrapper implements StableAddress and falls back to directly reading the file on wasm32
2021-03-24  Revert "Revert stabilizing integer::BITS."  Mara Bos  -1/+0
2021-02-27  Rollup merge of #82057 - upsuper-forks:cstr, r=davidtwco,wesleywiser  Dylan DPC  -1/+0
Replace const_cstr with cstr crate

This PR replaces the `const_cstr` macro inside `rustc_data_structures` with `cstr` macro from [cstr](https://crates.io/crates/cstr) crate.

The two macros basically serve the same purpose, which is to generate `&'static CStr` from a string literal.

`cstr` is better because it validates the literal at compile time, while the existing `const_cstr` does it at runtime when `debug_assertions` is enabled. In addition, the value `cstr` generates can be used in constant context (which is seemingly not needed anywhere currently, though).
2021-02-20  Update the bootstrap compiler  Joshua Nelson  -1/+0
Note this does not change `core::derive` since it was merged after the beta bump.
2021-02-14  Replace const_cstr with cstr crate  Xidorn Quan  -1/+0
2021-02-03  Revert stabilizing integer::BITS.  Mara Bos  -0/+1
2021-01-31  stabilize int_bits_const  Ashley Mannix  -1/+0
2020-12-30  Bump bootstrap compiler to 1.50 beta  Mark Rousskov  -2/+1
2020-12-26  stabilize min_const_generics  Bastian Kauschke  -1/+1
2020-11-23  Rename `optin_builtin_traits` to `auto_traits`  Camelid  -1/+2
They were originally called "opt-in, built-in traits" (OIBITs), but people realized that the name was too confusing and a mouthful, and so they were renamed to just "auto traits". The feature flag's name wasn't updated, though, so that's what this PR does. There are some other spots in the compiler that still refer to OIBITs, but I don't think changing those now is worth it since they are internal and not particularly relevant to this PR. Also see <https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/opt-in.2C.20built-in.20traits.20(auto.20traits).20feature.20name>.
2020-11-20  Set unaligned_references lint to deny in rustc_data_structures  Tyson Nottingham  -0/+1
To detect misuse of private packed field in `PackedFingerprint`.
2020-11-16  compiler: fold by value  Bastian Kauschke  -0/+2
2020-11-15  Rollup merge of #79058 - dtolnay:likelymacro, r=Mark-Simulacrum  Jonas Schievink  -6/+6
Move likely/unlikely argument outside of invisible unsafe block

The previous `likely!`/`unlikely!` macros were unsound because they permitted the caller's expr to contain arbitrary unsafe code.

```rust
pub fn huh() -> bool {
    likely!(std::ptr::read(&() as *const () as *const bool))
}
```

**Before:** compiles cleanly.
**After:**

```console
error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
   |
70 |     likely!(std::ptr::read(&() as *const () as *const bool))
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
   |
   = note: consult the function's documentation for information on how to avoid undefined behavior
```
2020-11-14  Move likely/unlikely argument outside of invisible unsafe block  David Tolnay  -6/+6
The previous `likely!`/`unlikely!` macros were unsound because they permitted the caller's expr to contain arbitrary unsafe code.

    pub fn huh() -> bool {
        likely!(std::ptr::read(&() as *const () as *const bool))
    }

Before: compiles cleanly.
After:

    error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
       |
    70 |     likely!(std::ptr::read(&() as *const () as *const bool))
       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
       |
       = note: consult the function's documentation for information on how to avoid undefined behavior
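The shape of such a fix can be sketched on stable Rust as below. `likely_hint` is a hypothetical stand-in for whatever unsafe hint function the real macro calls, and the macro body is an illustration of the technique rather than a copy of the compiler's definition.

```rust
/// Hypothetical stand-in for an unsafe branch-prediction hint.
/// It is declared `unsafe` only so that calling it requires an `unsafe` block.
unsafe fn likely_hint(b: bool) -> bool {
    b
}

macro_rules! likely {
    ($e:expr) => {
        // `$e` is evaluated in the `match` scrutinee, which is outside the
        // `unsafe` block. Only the already-computed `bool` crosses into it,
        // so unsafe operations inside the caller's expression are rejected.
        match $e {
            e => unsafe { likely_hint(e) },
        }
    };
}

fn main() {
    let v = vec![1, 2, 3];
    if likely!(!v.is_empty()) {
        println!("non-empty");
    }

    // The unsound form would have been `unsafe { likely_hint($e) }`, which
    // silently wraps the caller's expression in `unsafe` as well.
    assert!(likely!(1 + 1 == 2));
}
```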
2020-11-14  Move Steal to rustc_data_structures.  Camille GILLOT  -0/+1