about summary refs log tree commit diff
path: root/src/rt/rust_run_program.cpp
diff options
context:
space:
mode:
authorbors <bors@rust-lang.org>2013-04-20 11:57:50 -0700
committerbors <bors@rust-lang.org>2013-04-20 11:57:50 -0700
commitae3b8690c1e9c4debf20b4455ad50a79d5859ee9 (patch)
tree58406cf3c0ff22fc9ada505a40109c6acc779a2d /src/rt/rust_run_program.cpp
parent2b09267b762398a3c851ecfd55d5d01aee906352 (diff)
parentc5baeb1db3d84e1ab0d14a8055db3a7d3cba638d (diff)
downloadrust-ae3b8690c1e9c4debf20b4455ad50a79d5859ee9.tar.gz
rust-ae3b8690c1e9c4debf20b4455ad50a79d5859ee9.zip
auto merge of #5975 : huonw/rust/rustc-intrinsics-fixed-stack, r=pcwalton
This implements the fixed_stack_segment for items with the rust-intrinsic abi, and then uses it to make f32 and f64 use intrinsics where appropriate, but without overflowing stacks and killing canaries (cf. #5686 and #5697). Hopefully.

@pcwalton, the fixed_stack_segment implementation involved mirroring its implementation in `base.rs` in `trans_closure`, but without adding the `set_no_inline` (reasoning: that would defeat the purpose of intrinsics), which is possibly incorrect.

I'm a little hazy about how the underlying structure works, so I've annotated the 4 that have caused problems so far, but there's no guarantee that the other intrinsics are entirely well-behaved.

Anyway, it has good results (the following are just summing the result of each function for 1 up to 100 million):

```
$ ./intrinsics-perf.sh f32
func   new   old   speedup
sin    0.80  2.75  3.44
cos    0.80  2.76  3.45
sqrt   0.56  2.73  4.88
ln     1.01  2.94  2.91
log10  0.97  2.90  2.99
log2   1.01  2.95  2.92
exp    0.90  2.85  3.17
exp2   0.92  2.87  3.12
pow    6.95  8.57  1.23

   geometric mean: 2.97

$ ./intrinsics-perf.sh f64
func   new   old   speedup
sin    12.08  14.06  1.16
cos    12.04  13.67  1.14
sqrt   0.49  2.73  5.57
ln     4.11  5.59  1.36
log10  5.09  6.54  1.28
log2   2.78  5.10  1.83
exp    2.00  3.97  1.99
exp2   1.71  3.71  2.17
pow    5.90  7.51  1.27

   geometric mean: 1.72
```

So about 3x faster on average for f32, and 1.7x for f64. This isn't exactly apples to apples though, since this patch also adds #[inline(always)] to all the function definitions too, which possibly gives a speedup.

(fwiw, GitHub is showing 93c0888 after d9c54f8 (since I cherry-picked the latter from #5697), but git's order is the other way.)
Diffstat (limited to 'src/rt/rust_run_program.cpp')
0 files changed, 0 insertions, 0 deletions