auto merge of #8237 : blake2-ppc/rust/faster-utf8, r=brson - rust

author	bors <bors@rust-lang.org>	2013-08-04 07:10:56 -0700
committer	bors <bors@rust-lang.org>	2013-08-04 07:10:56 -0700
commit	f7c4359a2c350521c312bd5dce3ce515878d922a (patch)
tree	e42481e6e166a20dd151cb5fdfe3086be959def9 /src/rt/rust_stack.cpp
parent	5cf69d5bf8fa6424eca5b1589e90db5a19f16330 (diff)
parent	0504d7e57bf536dabbb738b5b0d268a266d30659 (diff)

auto merge of #8237 : blake2-ppc/rust/faster-utf8, r=brson

Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
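The loop shape described above can be sketched as follows, in present-day Rust rather than the 2013 dialect of the commit. The names `utf8_width` and `count_sequences` are hypothetical and not taken from the commit. This sketch only looks at lead bytes and sequence lengths; it does not validate continuation bytes.

```rust
/// Hypothetical helper: sequence length implied by a lead byte,
/// or 0 if the byte cannot start a UTF-8 sequence.
fn utf8_width(lead: u8) -> usize {
    match lead {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        _ => 0,
    }
}

/// Walks the buffer with a manual index, advancing 1-4 bytes per lap.
/// Returns the number of sequences, or None on a bad lead byte or a
/// sequence that runs past the end of the buffer.
fn count_sequences(v: &[u8]) -> Option<usize> {
    let mut i = 0;
    let mut n = 0;
    while i < v.len() {
        // SAFETY: the loop condition guarantees i < v.len(), so the
        // bounds check that v[i] would perform is redundant here.
        let lead = unsafe { *v.get_unchecked(i) };
        let w = utf8_width(lead);
        if w == 0 || w > v.len() - i {
            return None;
        }
        i += w; // variable stride: this is what makes iterators awkward
        n += 1;
    }
    Some(n)
}
```

For pure ASCII input every lap takes the one-byte branch, so the per-byte bounds check is the main cost that unchecked indexing removes.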

Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.

No changes to what we accept. A comment notes that surrogate halves are
accepted.
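As an illustration of checking byte ranges directly, here is a minimal sketch in present-day Rust. It is not the code from this commit. It follows the UTF-8 byte-range syntax of RFC 3629 with one deliberate deviation that mirrors the note above: the lead byte 0xED is allowed the full 0x80..=0xBF second-byte range, so surrogate halves are accepted. Rejecting overlong forms is an assumption based on the RFC syntax.

```rust
/// Sketch of a validator that matches byte ranges instead of decoding
/// each code point. Surrogate halves (ED A0..BF xx) are accepted.
fn is_utf8(v: &[u8]) -> bool {
    let mut i = 0;
    while i < v.len() {
        let width = match &v[i..] {
            // UTF8-1
            [0x00..=0x7F, ..] => 1,
            // UTF8-2 (C0 and C1 would be overlong)
            [0xC2..=0xDF, 0x80..=0xBF, ..] => 2,
            // UTF8-3: E0 needs A0..BF to avoid overlong forms
            [0xE0, 0xA0..=0xBF, 0x80..=0xBF, ..] => 3,
            // E1..EF, including ED with surrogates (the RFC limits ED to 80..9F)
            [0xE1..=0xEF, 0x80..=0xBF, 0x80..=0xBF, ..] => 3,
            // UTF8-4: F0 needs 90..BF, F4 needs 80..8F (max U+10FFFF)
            [0xF0, 0x90..=0xBF, 0x80..=0xBF, 0x80..=0xBF, ..] => 4,
            [0xF1..=0xF3, 0x80..=0xBF, 0x80..=0xBF, 0x80..=0xBF, ..] => 4,
            [0xF4, 0x80..=0x8F, 0x80..=0xBF, 0x80..=0xBF, ..] => 4,
            // Bad lead byte, bad continuation byte, or truncated sequence
            _ => return false,
        };
        i += width;
    }
    true
}
```

Each match arm corresponds to one alternative of the RFC grammar, so the accepted set can be read off the patterns without any arithmetic on code points.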

Before:

	test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
	test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)

After:

	test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
	test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)

An improvement upon the previous pull request, #8133.

Diffstat (limited to 'src/rt/rust_stack.cpp')

0 files changed, 0 insertions, 0 deletions

