about summary refs log tree commit diff
path: root/src/libstd/sys/unix/stack_overflow.rs
diff options
context:
space:
mode:
authorbors <bors@rust-lang.org>2016-01-16 01:18:48 +0000
committerbors <bors@rust-lang.org>2016-01-16 01:18:48 +0000
commite7e4ecc52267f35ae5f846e9e65e5fe83fc6b7be (patch)
treeff27a26becab647a909343046ac1b2d66d3e8c29 /src/libstd/sys/unix/stack_overflow.rs
parentcee9463d24ee3e5682888aa4ddfbcfc01a1f5e83 (diff)
parentcadcd70775cf42b2add2526026a0a06c1ced411c (diff)
downloadrust-e7e4ecc52267f35ae5f846e9e65e5fe83fc6b7be.tar.gz
rust-e7e4ecc52267f35ae5f846e9e65e5fe83fc6b7be.zip
Auto merge of #30740 - bluss:ascii-is-the-best, r=brson
Add fast path for ASCII in UTF-8 validation

This speeds up the ASCII case (and long stretches of ASCII in otherwise
mixed UTF-8 data) when checking UTF-8 validity.

Benchmark results suggest that on purely ASCII input, we can improve
throughput (megabytes verified / second) by a factor of 13 to 14 (smallish input).
On XML and mostly English language input (en.wikipedia XML dump),
throughput improves by a factor 7 (large input).

On mostly non-ASCII input, performance increases slightly or is the
same.

The UTF-8 validation is rewritten to use indexed access; since all
access is preceded by a (mandatory for validation) length check, bounds
checks are statically elided by LLVM and this formulation is in fact the best
for performance. A previous version had losses due to slice to iterator
conversions.

A large credit to Björn Steinbrink who improved this patch immensely,
writing this second version.

Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3.

Old code is `regular`, this PR is called `fast`.

Datasets:

- `ascii` is just ASCII (2.5 kB)
- `cyr` is cyrillic script with ascii spaces (5 kB)
- `dewik10` is 10MB of a de.wikipedia XML dump
- `enwik8` is 100MB of an en.wikipedia XML dump
- `jawik10` is 10MB of a ja.wikipedia XML dump

```
test from_utf8_ascii_fast        ... bench:         140 ns/iter (+/- 4) = 18221 MB/s
test from_utf8_ascii_regular     ... bench:       1,932 ns/iter (+/- 19) = 1320 MB/s
test from_utf8_cyr_fast          ... bench:      10,025 ns/iter (+/- 245) = 511 MB/s
test from_utf8_cyr_regular       ... bench:      10,944 ns/iter (+/- 795) = 468 MB/s
test from_utf8_dewik10_fast      ... bench:   6,017,909 ns/iter (+/- 105,755) = 1740 MB/s
test from_utf8_dewik10_regular   ... bench:  11,669,493 ns/iter (+/- 264,045) = 891 MB/s
test from_utf8_enwik8_fast       ... bench:  14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s
test from_utf8_enwik8_regular    ... bench:  93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s
test from_utf8_jawik10_fast      ... bench:  29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s
test from_utf8_jawik10_regular   ... bench:  29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s
```

Co-authored-by: Björn Steinbrink <bsteinbr@gmail.com>
Diffstat (limited to 'src/libstd/sys/unix/stack_overflow.rs')
0 files changed, 0 insertions, 0 deletions