diff options
| author | bors <bors@rust-lang.org> | 2014-10-11 18:37:13 +0000 |
|---|---|---|
| committer | bors <bors@rust-lang.org> | 2014-10-11 18:37:13 +0000 |
| commit | cd1fa91d2bf97a6331e1d0265eec0f3324191f89 (patch) | |
| tree | ee50ddbe24299b1a263167a9b00a0075561c08f6 /src | |
| parent | 4d031d7f86a66270d3cfc28a2ecf75706a67054b (diff) | |
| parent | f91c680e95bdb855c4086d9db0477a057b9f49f6 (diff) | |
| download | rust-cd1fa91d2bf97a6331e1d0265eec0f3324191f89.tar.gz rust-cd1fa91d2bf97a6331e1d0265eec0f3324191f89.zip | |
auto merge of #17801 : Gankro/rust/collections-stuff, r=sfackler
I previously avoided `#[inline]`ing anything assuming someone would come in and explain to me where this would be appropriate. Apparently no one *really* knows, so I'll just go the opposite way an inline everything assuming someone will come in and yell at me that such-and-such shouldn't be `#[inline]`. ================== For posterity, iteration comparisons: ``` test btree::map::bench::iter_20 ... bench: 971 ns/iter (+/- 30) test btree::map::bench::iter_1000 ... bench: 29445 ns/iter (+/- 480) test btree::map::bench::iter_100000 ... bench: 2929035 ns/iter (+/- 21551) test treemap::bench::iter_20 ... bench: 530 ns/iter (+/- 66) test treemap::bench::iter_1000 ... bench: 26287 ns/iter (+/- 825) test treemap::bench::iter_100000 ... bench: 7650084 ns/iter (+/- 356711) test trie::bench_map::iter_20 ... bench: 646 ns/iter (+/- 265) test trie::bench_map::iter_1000 ... bench: 43556 ns/iter (+/- 5014) test trie::bench_map::iter_100000 ... bench: 12988002 ns/iter (+/- 139676) ``` As you can see `btree` "scales" much better than `treemap`. `triemap` scales quite poorly. Note that *completely* different results are given if the elements are inserted in order from the range [0, size]. In particular, TrieMap *completely* dominates in the sorted case. This suggests adding benches for both might be worthwhile. However unsorted is *probably* the more "normal" case, so I consider this "good enough" for now.
Diffstat (limited to 'src')
| -rw-r--r-- | src/libcollections/btree/map.rs | 78 | ||||
| -rw-r--r-- | src/libcollections/btree/set.rs | 5 | ||||
| -rw-r--r-- | src/libcollections/treemap.rs | 35 | ||||
| -rw-r--r-- | src/libcollections/trie.rs | 40 |
4 files changed, 140 insertions, 18 deletions
diff --git a/src/libcollections/btree/map.rs b/src/libcollections/btree/map.rs index ab160595433..dbbff61b8dd 100644 --- a/src/libcollections/btree/map.rs +++ b/src/libcollections/btree/map.rs @@ -29,6 +29,47 @@ use ringbuf::RingBuf; /// A map based on a B-Tree. +/// +/// B-Trees represent a fundamental compromise between cache-efficiency and actually minimizing +/// the amount of work performed in a search. In theory, a binary search tree (BST) is the optimal +/// choice for a sorted map, as a perfectly balanced BST performs the theoretical minimum amount of +/// comparisons necessary to find an element (log<sub>2</sub>n). However, in practice the way this +/// is done is *very* inefficient for modern computer architectures. In particular, every element +/// is stored in its own individually heap-allocated node. This means that every single insertion +/// triggers a heap-allocation, and every single comparison should be a cache-miss. Since these +/// are both notably expensive things to do in practice, we are forced to at very least reconsider +/// the BST strategy. +/// +/// A B-Tree instead makes each node contain B-1 to 2B-1 elements in a contiguous array. By doing +/// this, we reduce the number of allocations by a factor of B, and improve cache effeciency in +/// searches. However, this does mean that searches will have to do *more* comparisons on average. +/// The precise number of comparisons depends on the node search strategy used. For optimal cache +/// effeciency, one could search the nodes linearly. For optimal comparisons, one could search +/// the node using binary search. As a compromise, one could also perform a linear search +/// that initially only checks every i<sup>th</sup> element for some choice of i. +/// +/// Currently, our implementation simply performs naive linear search. This provides excellent +/// performance on *small* nodes of elements which are cheap to compare. However in the future we +/// would like to further explore choosing the optimal search strategy based on the choice of B, +/// and possibly other factors. Using linear search, searching for a random element is expected +/// to take O(B log<sub>B</sub>n) comparisons, which is generally worse than a BST. In practice, +/// however, performance is excellent. `BTreeMap` is able to readily outperform `TreeMap` under +/// many workloads, and is competetive where it doesn't. BTreeMap also generally *scales* better +/// than TreeMap, making it more appropriate for large datasets. +/// +/// However, `TreeMap` may still be more appropriate to use in many contexts. If elements are very +/// large or expensive to compare, `TreeMap` may be more appropriate. It won't allocate any +/// more space than is needed, and will perform the minimal number of comparisons necessary. +/// `TreeMap` also provides much better performance stability guarantees. Generally, very few +/// changes need to be made to update a BST, and two updates are expected to take about the same +/// amount of time on roughly equal sized BSTs. However a B-Tree's performance is much more +/// amortized. If a node is overfull, it must be split into two nodes. If a node is underfull, it +/// may be merged with another. Both of these operations are relatively expensive to perform, and +/// it's possible to force one to occur at every single level of the tree in a single insertion or +/// deletion. In fact, a malicious or otherwise unlucky sequence of insertions and deletions can +/// force this degenerate behaviour to occur on every operation. While the total amount of work +/// done on each operation isn't *catastrophic*, and *is* still bounded by O(B log<sub>B</sub>n), +/// it is certainly much slower when it does. #[deriving(Clone)] pub struct BTreeMap<K, V> { root: Node<K, V>, @@ -93,6 +134,8 @@ impl<K: Ord, V> BTreeMap<K, V> { } /// Makes a new empty BTreeMap with the given B. + /// + /// B cannot be less than 2. pub fn with_b(b: uint) -> BTreeMap<K, V> { assert!(b > 1, "B must be greater than 1"); BTreeMap { @@ -1145,9 +1188,12 @@ mod test { #[cfg(test)] mod bench { - use test::Bencher; + use std::prelude::*; + use std::rand::{weak_rng, Rng}; + use test::{Bencher, black_box}; use super::BTreeMap; + use MutableMap; use deque::bench::{insert_rand_n, insert_seq_n, find_rand_n, find_seq_n}; #[bench] @@ -1200,4 +1246,34 @@ mod bench { let mut m : BTreeMap<uint,uint> = BTreeMap::new(); find_seq_n(10_000, &mut m, b); } + + fn bench_iter(b: &mut Bencher, size: uint) { + let mut map = BTreeMap::<uint, uint>::new(); + let mut rng = weak_rng(); + + for _ in range(0, size) { + map.swap(rng.gen(), rng.gen()); + } + + b.iter(|| { + for entry in map.iter() { + black_box(entry); + } + }); + } + + #[bench] + pub fn iter_20(b: &mut Bencher) { + bench_iter(b, 20); + } + + #[bench] + pub fn iter_1000(b: &mut Bencher) { + bench_iter(b, 1000); + } + + #[bench] + pub fn iter_100000(b: &mut Bencher) { + bench_iter(b, 100000); + } } diff --git a/src/libcollections/btree/set.rs b/src/libcollections/btree/set.rs index b21af89742c..8958f0ef5be 100644 --- a/src/libcollections/btree/set.rs +++ b/src/libcollections/btree/set.rs @@ -23,6 +23,9 @@ use core::fmt::Show; use {Mutable, Set, MutableSet, MutableMap, Map}; /// A set based on a B-Tree. +/// +/// See BTreeMap's documentation for a detailed discussion of this collection's performance +/// benefits and drawbacks. #[deriving(Clone, Hash, PartialEq, Eq, Ord, PartialOrd)] pub struct BTreeSet<T>{ map: BTreeMap<T, ()>, @@ -65,6 +68,8 @@ impl<T: Ord> BTreeSet<T> { } /// Makes a new BTreeSet with the given B. + /// + /// B cannot be less than 2. pub fn with_b(b: uint) -> BTreeSet<T> { BTreeSet { map: BTreeMap::with_b(b) } } diff --git a/src/libcollections/treemap.rs b/src/libcollections/treemap.rs index 73e2d846654..a86e7386edd 100644 --- a/src/libcollections/treemap.rs +++ b/src/libcollections/treemap.rs @@ -2232,9 +2232,12 @@ mod test_treemap { #[cfg(test)] mod bench { - use test::Bencher; + use std::prelude::*; + use std::rand::{weak_rng, Rng}; + use test::{Bencher, black_box}; use super::TreeMap; + use MutableMap; use deque::bench::{insert_rand_n, insert_seq_n, find_rand_n, find_seq_n}; // Find seq @@ -2288,6 +2291,36 @@ mod bench { let mut m : TreeMap<uint,uint> = TreeMap::new(); find_seq_n(10_000, &mut m, b); } + + fn bench_iter(b: &mut Bencher, size: uint) { + let mut map = TreeMap::<uint, uint>::new(); + let mut rng = weak_rng(); + + for _ in range(0, size) { + map.swap(rng.gen(), rng.gen()); + } + + b.iter(|| { + for entry in map.iter() { + black_box(entry); + } + }); + } + + #[bench] + pub fn iter_20(b: &mut Bencher) { + bench_iter(b, 20); + } + + #[bench] + pub fn iter_1000(b: &mut Bencher) { + bench_iter(b, 1000); + } + + #[bench] + pub fn iter_100000(b: &mut Bencher) { + bench_iter(b, 100000); + } } #[cfg(test)] diff --git a/src/libcollections/trie.rs b/src/libcollections/trie.rs index a022a1bc890..2e388a92a59 100644 --- a/src/libcollections/trie.rs +++ b/src/libcollections/trie.rs @@ -949,8 +949,8 @@ macro_rules! iterator_impl { // rules, and are just manipulating raw pointers like there's no // such thing as invalid pointers and memory unsafety. The // reason is performance, without doing this we can get the - // bench_iter_large microbenchmark down to about 30000 ns/iter - // (using .unsafe_get to index self.stack directly, 38000 + // (now replaced) bench_iter_large microbenchmark down to about + // 30000 ns/iter (using .unsafe_get to index self.stack directly, 38000 // ns/iter with [] checked indexing), but this smashes that down // to 13500 ns/iter. // @@ -1459,31 +1459,39 @@ mod test_map { mod bench_map { use std::prelude::*; use std::rand::{weak_rng, Rng}; - use test::Bencher; + use test::{Bencher, black_box}; use MutableMap; use super::TrieMap; - #[bench] - fn bench_iter_small(b: &mut Bencher) { - let mut m = TrieMap::<uint>::new(); + fn bench_iter(b: &mut Bencher, size: uint) { + let mut map = TrieMap::<uint>::new(); let mut rng = weak_rng(); - for _ in range(0u, 20) { - m.insert(rng.gen(), rng.gen()); + + for _ in range(0, size) { + map.swap(rng.gen(), rng.gen()); } - b.iter(|| for _ in m.iter() {}) + b.iter(|| { + for entry in map.iter() { + black_box(entry); + } + }); } #[bench] - fn bench_iter_large(b: &mut Bencher) { - let mut m = TrieMap::<uint>::new(); - let mut rng = weak_rng(); - for _ in range(0u, 1000) { - m.insert(rng.gen(), rng.gen()); - } + pub fn iter_20(b: &mut Bencher) { + bench_iter(b, 20); + } - b.iter(|| for _ in m.iter() {}) + #[bench] + pub fn iter_1000(b: &mut Bencher) { + bench_iter(b, 1000); + } + + #[bench] + pub fn iter_100000(b: &mut Bencher) { + bench_iter(b, 100000); } #[bench] |
