| author | Stepan Koltsov <stepan.koltsov@gmail.com> | 2014-05-14 10:23:42 +0000 |
|---|---|---|
| committer | Stepan Koltsov <stepan.koltsov@gmail.com> | 2014-05-14 10:23:42 +0000 |
| commit | f853cf79b568fdffe83729aad4a43cb3c9ff3c92 (patch) | |
| tree | 5c91350f4520f16ec4f358b588c5403d9b8b39db /src/libsync | |
| parent | db5ca23118f1d96286a25d8627be3fe05ae51c5d (diff) | |
| download | rust-f853cf79b568fdffe83729aad4a43cb3c9ff3c92.tar.gz rust-f853cf79b568fdffe83729aad4a43cb3c9ff3c92.zip | |
Optimize common path of Once::doit
Optimize `Once::doit`: perform an optimistic check that initialization has
already completed. A `load` is much cheaper than a `fetch_add`, at least
on x86_64.
Verified with this test:
```
static mut o: one::Once = one::ONCE_INIT;

unsafe {
    loop {
        let start = time::precise_time_ns();
        let iters = 50000000u64;
        for _ in range(0, iters) {
            o.doit(|| { println!("once!"); });
        }
        let end = time::precise_time_ns();
        let ps_per_iter = 1000 * (end - start) / iters;
        println!("{} ps per iter", ps_per_iter);

        // confuse the optimizer
        o.doit(|| { println!("once!"); });
    }
}
```
Test executed on a Mac with an Intel Core i7 at 2GHz. Results:
* 20ns per iteration without the patch
* 4ns per iteration with the patch applied
`Once::doit` could be even faster (800ps per iteration) if `doit` were split
into a pair of `doit`/`doit_slow` functions, with `doit` marked
`#[inline(always)]` like this:
```
#[inline(always)]
pub fn doit(&self, f: ||) {
    if self.cnt.load(atomics::SeqCst) < 0 {
        return
    }

    self.doit_slow(f);
}

fn doit_slow(&self, f: ||) { ... }
```
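The `doit`/`doit_slow` split can be sketched as a self-contained toy in modern Rust. The `SimpleOnce` type below is hypothetical, not the libsync implementation: it signals completion by flipping `cnt` negative and guards the slow path with a `Mutex`, which is enough to show why the cheap fast-path load pays off.

```rust
use std::sync::atomic::{AtomicIsize, Ordering};
use std::sync::Mutex;

// Hypothetical type for illustration: `cnt` is flipped negative once the
// closure has run, so the fast path is a single atomic load.
struct SimpleOnce {
    cnt: AtomicIsize,
    lock: Mutex<()>,
}

impl SimpleOnce {
    const fn new() -> Self {
        SimpleOnce { cnt: AtomicIsize::new(0), lock: Mutex::new(()) }
    }

    #[inline(always)]
    fn doit<F: FnOnce()>(&self, f: F) {
        // Fast path: a plain load, no read-modify-write traffic.
        if self.cnt.load(Ordering::SeqCst) < 0 {
            return;
        }
        self.doit_slow(f);
    }

    fn doit_slow<F: FnOnce()>(&self, f: F) {
        let _guard = self.lock.lock().unwrap();
        // Re-check under the lock: another thread may have won the race.
        if self.cnt.load(Ordering::SeqCst) < 0 {
            return;
        }
        f();
        // Flip the counter negative to signal "done" to the fast path.
        self.cnt.store(isize::MIN, Ordering::SeqCst);
    }
}

static CALLS: AtomicIsize = AtomicIsize::new(0);
static ONCE: SimpleOnce = SimpleOnce::new();

// Invoke `doit` many times; the closure should run exactly once.
fn run_demo() -> isize {
    for _ in 0..1_000 {
        ONCE.doit(|| { CALLS.fetch_add(1, Ordering::SeqCst); });
    }
    CALLS.load(Ordering::SeqCst)
}

fn main() {
    println!("closure ran {} time(s)", run_demo());
}
```

After the first call completes, every subsequent `doit` returns from the inlined load without touching the lock or performing any atomic write, which is the hot loop the benchmark above measures.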
Diffstat (limited to 'src/libsync')
| -rw-r--r-- | src/libsync/one.rs | 5 |
1 file changed, 5 insertions, 0 deletions
```
diff --git a/src/libsync/one.rs b/src/libsync/one.rs
index 7da6f39b840..eb919198708 100644
--- a/src/libsync/one.rs
+++ b/src/libsync/one.rs
@@ -64,6 +64,11 @@ impl Once {
     /// When this function returns, it is guaranteed that some initialization
     /// has run and completed (it may not be the closure specified).
     pub fn doit(&self, f: ||) {
+        // Optimize common path: load is much cheaper than fetch_add.
+        if self.cnt.load(atomics::SeqCst) < 0 {
+            return
+        }
+
         // Implementation-wise, this would seem like a fairly trivial primitive.
         // The stickler part is where our mutexes currently require an
         // allocation, and usage of a `Once` should't leak this allocation.
```
