path: root/src/rt/rust_stack.cpp
author blake2-ppc <blake2-ppc> 2013-08-02 18:34:00 +0200
committer blake2-ppc <blake2-ppc> 2013-08-02 23:20:57 +0200
commit 0504d7e57bf536dabbb738b5b0d268a266d30659
tree b5beb1a13b3840166b1167fc8699aa7d55dd614a /src/rt/rust_stack.cpp
parent 2460170e6aea8dd7aa3e316456047baf18f2f680
std: Speed up str::is_utf8
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.

Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.

No changes to what we accept. A comment notes that surrogate halves are
accepted.
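
The byte-range approach described above can be illustrated with a minimal sketch. This is not the code from this commit: it is written in present-day Rust, the names `is_cont` and `is_utf8_sketch` are hypothetical, and it uses ordinary bounds-checked indexing rather than the unchecked indexing the commit mentions. The ranges follow the UTF-8 syntax in RFC 3629, with one deliberate deviation assumed from the note above: surrogate halves (lead byte 0xED followed by 0xA0..0xBF) are accepted.

```rust
/// Hypothetical helper: true if `b` is a continuation byte (10xxxxxx).
fn is_cont(b: u8) -> bool {
    (b & 0xC0) == 0x80
}

/// Sketch of a UTF-8 validity check that inspects byte ranges directly
/// instead of decoding each code point. Assumption: surrogate halves are
/// accepted, so this is looser than RFC 3629.
fn is_utf8_sketch(v: &[u8]) -> bool {
    let n = v.len();
    let mut i = 0;
    while i < n {
        let b0 = v[i];
        // The lead byte determines the sequence width (1 to 4 bytes).
        // 0x80..=0xC1 and 0xF5..=0xFF are never valid lead bytes.
        let w = match b0 {
            0x00..=0x7F => 1,
            0xC2..=0xDF => 2,
            0xE0..=0xEF => 3,
            0xF0..=0xF4 => 4,
            _ => return false,
        };
        // Reject a sequence that is cut off by the end of the input.
        if i + w > n {
            return false;
        }
        if w >= 2 {
            // The second byte has a narrower range for some lead bytes,
            // which rules out overlong forms and values above U+10FFFF.
            let b1 = v[i + 1];
            let second_ok = match b0 {
                0xE0 => (0xA0..=0xBF).contains(&b1),
                0xF0 => (0x90..=0xBF).contains(&b1),
                0xF4 => (0x80..=0x8F).contains(&b1),
                _ => is_cont(b1),
            };
            if !second_ok {
                return false;
            }
        }
        if w >= 3 && !is_cont(v[i + 2]) {
            return false;
        }
        if w == 4 && !is_cont(v[i + 3]) {
            return false;
        }
        // Skip the whole sequence, which is why a plain byte iterator
        // is awkward here.
        i += w;
    }
    true
}
```

Under these assumptions, `is_utf8_sketch(&[0xED, 0xA0, 0x80])` returns true, whereas the modern `std::str::from_utf8` rejects that input because it encodes a surrogate half.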

Before:

	test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
	test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)

After:

	test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
	test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
Diffstat (limited to 'src/rt/rust_stack.cpp')
0 files changed, 0 insertions, 0 deletions