diff options
| author | bors <bors@rust-lang.org> | 2015-08-13 13:29:38 +0000 |
|---|---|---|
| committer | bors <bors@rust-lang.org> | 2015-08-13 13:29:38 +0000 |
| commit | bb954cfa75ec63b2cf28229ecc4aee7024381d0b (patch) | |
| tree | b59325f7d0162247a2a348932f9e167f66573c2e /src/etc | |
| parent | e9205a20a8a22ab066b6596502cafc036fd4557c (diff) | |
| parent | 15518a9c0c1c5f1774503c72072c73e64411d130 (diff) | |
| download | rust-bb954cfa75ec63b2cf28229ecc4aee7024381d0b.tar.gz rust-bb954cfa75ec63b2cf28229ecc4aee7024381d0b.zip | |
Auto merge of #27307 - rkruppe:dec2flt, r=pnkfelix
Completely rewrite the conversion of decimal strings to `f64` and `f32`. The code is intended to be absolutely positively completely 100% accurate (when it doesn't give up). To the best of my knowledge, it achieves that goal. Any input that is not rejected is converted to the floating point number that is closest to the true value of the input. This includes overflow, subnormal numbers, and underflow to zero. In other words, the rounding error is less than or equal to 0.5 units in the last place. Half-way cases (exactly 0.5 ULP error) are handled with half-to-even rounding, also known as banker's rounding.
This code implements the algorithms from the paper [How to Read Floating Point Numbers Accurately][paper] by William D. Clinger, with extensions to handle underflow, overflow and subnormals, as well as some algorithmic optimizations.
# Correctness
With such a large amount of tricky code, many bugs are to be expected. Indeed tracking down the obscure causes of various rounding errors accounts for the bulk of the development time. Extensive tests (taking in the order of hours to run through to completion) are included in `src/etc/test-float-parse`: Though exhaustively testing all possible inputs is impossible, I've had good success with generating millions of instances from various "classes" of inputs. These tests take far too long to be run by @bors so contributors who touch this code need the discipline to run them. There are `#[test]`s, but they don't even cover every stupid mistake I made in course of writing this.
Another aspect is *integer* overflow. Extreme (or malicious) inputs could cause overflow both in the machine-sized integers used for bookkeeping throughout the algorithms (e.g., the decimal exponent) as well as the arbitrary-precision arithmetic. There is input validation to reject all such cases I know of, and I am quite sure nobody will *accidentally* cause this code to go out of range. Still, no guarantees.
# Limitations
Noticed the weasel words "(when it doesn't give up)" at the beginning? Some otherwise well-formed decimal strings are rejected because spelling out the value of the input requires too many digits, i.e., `digits * 10^abs(exp)` can't be stored in a bignum. This only applies if the value is not "obviously" zero or infinite, i.e., if you take a near-infinity or near-zero value and add many pointless fractional digits. At least with the algorithm used here, computing the precise value would require computing the full value as a fraction, which would overflow. The precise limit is `number_of_digits + abs(exp) > 375` but could be raised almost arbitrarily. In the future, another algorithm might lift this restriction entirely.
This should not be an issue for any realistic inputs. Still, the code does reject inputs that would result in a finite float when evaluated with unlimited precision. Some of these inputs are even regressions that the old code (mostly) handled, such as `0.333...333` with 400+ `3`s. Thus this might qualify as [breaking-change].
# Performance
Benchmarks results are... tolerable. Short numbers that hit the fast paths (`f64` multiplication or shortcuts to zero/inf) have performance in the same order of magnitude as the old code tens of nanoseconds. Numbers that are delegated to Algorithm Bellerophon (using floats with 64 bit significand, implemented in software) are slower, but not drastically so (couple hundred nanoseconds).
Numbers that need the AlgorithmM fallback (for `f64`, roughly everything below 1e-305 and above 1e305) take far, far longer, hundreds of microseconds. Note that my implementation is not quite as naive as the expository version in the paper (it needs one to four division instead of ~1000), but division is fundamentally pretty expensive and my implementation of it is extremely simple and slow.
All benchmarks run on a mediocre laptop with a i5-4200U CPU under light load.
# Binary size
Unfortunately the implementation needs to duplicate almost all code: Once for `f32` and once for `f64`. Before you ask, no, this cannot be avoided, at least not completely (but see the Future Work section). There's also a precomputed table of powers of ten, weighing in at about six kilobytes.
Running a stage1 `rustc` over a stand-alone program that simply parses pi to `f32` and `f64` and outputs both results reveals that the overhead vs. the old parsing code is about 44 KiB normally and about 28 KiB with LTO. It's presumably half of that + 3 KiB when only one of the two code paths is exercised.
| rustc options | old | new | delta |
|--------------------------- |--------- |--------- |----------- |
| [nothing] | 2588375 | 2633828 | 44.39 KiB |
| -O | 2585211 | 2630688 | 44.41 KiB |
| -O -C lto | 1026353 | 1054981 | 27.96 KiB |
| -O -C lto -C link-args=-s | 414208 | 442368 | 27.5 KiB |
# Future Work
## Directory layout
The `dec2flt` code uses some types embedded deeply in the `flt2dec` module hierarchy, even though nothing about them it formatting-specific. They should be moved to a more conversion-direction-agnostic location at some point.
## Performance
It could be much better, especially for large inputs. Some low-hanging fruit has been picked but much more work could be done. Some specific ideas are jotted down in `FIXME`s all over the code.
## Binary size
One could try to compress the table further, though I am skeptical. Another avenue would be reducing the code duplication from basically everything being generic over `T: RawFloat`. Perhaps one can reduce the magnitude of the duplication by pushing the parts that don't need to know the target type into separate functions, but this is finicky and probably makes some code read less naturally.
## Other bases
This PR leaves `f{32,64}::from_str_radix` alone. It only replaces `FromStr` (and thus `.parse()`). I am convinced that `from_str_radix` should not exist, and have proposed its [deprecation and speedy removal][deprecate-radix]. Whatever the outcome of that discussion, it is independent from, and out of scope for, this PR.
Fixes #24557
Fixes #14353
r? @pnkfelix
cc @lifthrasiir @huonw
[paper]: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.4152
[deprecate-radix]: https://internals.rust-lang.org/t/deprecate-f-32-64-from-str-radix/2405
Diffstat (limited to 'src/etc')
| -rw-r--r-- | src/etc/dec2flt_table.py | 134 | ||||
| -rw-r--r-- | src/etc/test-float-parse/_common.rs | 26 | ||||
| -rw-r--r-- | src/etc/test-float-parse/few-ones.rs | 27 | ||||
| -rw-r--r-- | src/etc/test-float-parse/huge-pow10.rs | 21 | ||||
| -rw-r--r-- | src/etc/test-float-parse/many-digits.rs | 39 | ||||
| -rw-r--r-- | src/etc/test-float-parse/rand-f64.rs | 32 | ||||
| -rw-r--r-- | src/etc/test-float-parse/runtests.py | 399 | ||||
| -rw-r--r-- | src/etc/test-float-parse/short-decimals.rs | 29 | ||||
| -rw-r--r-- | src/etc/test-float-parse/subnorm.rs | 23 | ||||
| -rw-r--r-- | src/etc/test-float-parse/tiny-pow10.rs | 21 | ||||
| -rw-r--r-- | src/etc/test-float-parse/u32-small.rs | 19 | ||||
| -rw-r--r-- | src/etc/test-float-parse/u64-pow2.rs | 28 |
12 files changed, 798 insertions, 0 deletions
diff --git a/src/etc/dec2flt_table.py b/src/etc/dec2flt_table.py new file mode 100644 index 00000000000..b0140fb2455 --- /dev/null +++ b/src/etc/dec2flt_table.py @@ -0,0 +1,134 @@ +#!/usr/bin/env python2.7 +# +# Copyright 2015 The Rust Project Developers. See the COPYRIGHT +# file at the top-level directory of this distribution and at +# http://rust-lang.org/COPYRIGHT. +# +# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +# http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +# <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +# option. This file may not be copied, modified, or distributed +# except according to those terms. + +""" +Generate powers of ten using William Clinger's ``AlgorithmM`` for use in +decimal to floating point conversions. + +Specifically, computes and outputs (as Rust code) a table of 10^e for some +range of exponents e. The output is one array of 64 bit significands and +another array of corresponding base two exponents. The approximations are +normalized and rounded perfectly, i.e., within 0.5 ULP of the true value. + +The representation ([u64], [i16]) instead of the more natural [(u64, i16)] +is used because (u64, i16) has a ton of padding which would make the table +even larger, and it's already uncomfortably large (6 KiB). +""" +from __future__ import print_function +import sys +from fractions import Fraction +from collections import namedtuple + + +N = 64 # Size of the significand field in bits +MIN_SIG = 2 ** (N - 1) +MAX_SIG = (2 ** N) - 1 + + +# Hand-rolled fp representation without arithmetic or any other operations. +# The significand is normalized and always N bit, but the exponent is +# unrestricted in range. +Fp = namedtuple('Fp', 'sig exp') + + +def algorithm_m(f, e): + assert f > 0 + if e < 0: + u = f + v = 10 ** abs(e) + else: + u = f * 10 ** e + v = 1 + k = 0 + x = u // v + while True: + if x < MIN_SIG: + u <<= 1 + k -= 1 + elif x >= MAX_SIG: + v <<= 1 + k += 1 + else: + break + x = u // v + return ratio_to_float(u, v, k) + + +def ratio_to_float(u, v, k): + q, r = divmod(u, v) + v_r = v - r + z = Fp(q, k) + if r < v_r: + return z + elif r > v_r: + return next_float(z) + elif q % 2 == 0: + return z + else: + return next_float(z) + + +def next_float(z): + if z.sig == MAX_SIG: + return Fp(MIN_SIG, z.exp + 1) + else: + return Fp(z.sig + 1, z.exp) + + +def error(f, e, z): + decimal = f * Fraction(10) ** e + binary = z.sig * Fraction(2) ** z.exp + abs_err = abs(decimal - binary) + # The unit in the last place has value z.exp + ulp_err = abs_err / Fraction(2) ** z.exp + return float(ulp_err) + +LICENSE = """ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. +""" + +def main(): + MIN_E = -305 + MAX_E = 305 + e_range = range(MIN_E, MAX_E+1) + powers = [] + for e in e_range: + z = algorithm_m(1, e) + err = error(1, e, z) + assert err < 0.5 + powers.append(z) + typ = "([u64; {0}], [i16; {0}])".format(len(e_range)) + print(LICENSE.strip()) + print("// Table of approximations of powers of ten.") + print("// DO NOT MODIFY: Generated by a src/etc/dec2flt_table.py") + print("pub const MIN_E: i16 = {};".format(MIN_E)) + print("pub const MAX_E: i16 = {};".format(MAX_E)) + print() + print("pub const POWERS: ", typ, " = ([", sep='') + for z in powers: + print(" 0x{:x},".format(z.sig)) + print("], [") + for z in powers: + print(" {},".format(z.exp)) + print("]);") + + +if __name__ == '__main__': + main() diff --git a/src/etc/test-float-parse/_common.rs b/src/etc/test-float-parse/_common.rs new file mode 100644 index 00000000000..b4a2a593e01 --- /dev/null +++ b/src/etc/test-float-parse/_common.rs @@ -0,0 +1,26 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +use std::io; +use std::io::prelude::*; +use std::mem::transmute; + +// Nothing up my sleeve: Just (PI - 3) in base 16. +#[allow(dead_code)] +pub const SEED: [u32; 3] = [0x243f_6a88, 0x85a3_08d3, 0x1319_8a2e]; + +pub fn validate(text: String) { + let mut out = io::stdout(); + let x: f64 = text.parse().unwrap(); + let f64_bytes: u64 = unsafe { transmute(x) }; + let x: f32 = text.parse().unwrap(); + let f32_bytes: u32 = unsafe { transmute(x) }; + writeln!(&mut out, "{:016x} {:08x} {}", f64_bytes, f32_bytes, text).unwrap(); +} diff --git a/src/etc/test-float-parse/few-ones.rs b/src/etc/test-float-parse/few-ones.rs new file mode 100644 index 00000000000..390f4da6f63 --- /dev/null +++ b/src/etc/test-float-parse/few-ones.rs @@ -0,0 +1,27 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +mod _common; + +use _common::validate; + +fn main() { + let mut pow = vec![]; + for i in 0..63 { + pow.push(1u64 << i); + } + for a in &pow { + for b in &pow { + for c in &pow { + validate((a | b | c).to_string()); + } + } + } +} diff --git a/src/etc/test-float-parse/huge-pow10.rs b/src/etc/test-float-parse/huge-pow10.rs new file mode 100644 index 00000000000..e62afc78515 --- /dev/null +++ b/src/etc/test-float-parse/huge-pow10.rs @@ -0,0 +1,21 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +mod _common; + +use _common::validate; + +fn main() { + for e in 300..310 { + for i in 0..100000 { + validate(format!("{}e{}", i, e)); + } + } +} diff --git a/src/etc/test-float-parse/many-digits.rs b/src/etc/test-float-parse/many-digits.rs new file mode 100644 index 00000000000..0cbf57183df --- /dev/null +++ b/src/etc/test-float-parse/many-digits.rs @@ -0,0 +1,39 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +#![feature(rand)] + +extern crate rand; + +mod _common; + +use std::char; +use rand::{IsaacRng, Rng, SeedableRng}; +use rand::distributions::{Range, Sample}; +use _common::{validate, SEED}; + +fn main() { + let mut rnd = IsaacRng::from_seed(&SEED); + let mut range = Range::new(0, 10); + for _ in 0..5_000_000u64 { + let num_digits = rnd.gen_range(100, 300); + let digits = gen_digits(num_digits, &mut range, &mut rnd); + validate(digits); + } +} + +fn gen_digits<R: Rng>(n: u32, range: &mut Range<u32>, rnd: &mut R) -> String { + let mut s = String::new(); + for _ in 0..n { + let digit = char::from_digit(range.sample(rnd), 10).unwrap(); + s.push(digit); + } + s +} diff --git a/src/etc/test-float-parse/rand-f64.rs b/src/etc/test-float-parse/rand-f64.rs new file mode 100644 index 00000000000..762c3d95ec6 --- /dev/null +++ b/src/etc/test-float-parse/rand-f64.rs @@ -0,0 +1,32 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +#![feature(rand)] + +extern crate rand; + +mod _common; + +use _common::{validate, SEED}; +use rand::{IsaacRng, Rng, SeedableRng}; +use std::mem::transmute; + +fn main() { + let mut rnd = IsaacRng::from_seed(&SEED); + let mut i = 0; + while i < 10_000_000 { + let bits = rnd.next_u64(); + let x: f64 = unsafe { transmute(bits) }; + if x.is_finite() { + validate(format!("{:e}", x)); + i += 1; + } + } +} diff --git a/src/etc/test-float-parse/runtests.py b/src/etc/test-float-parse/runtests.py new file mode 100644 index 00000000000..17a1b769bd6 --- /dev/null +++ b/src/etc/test-float-parse/runtests.py @@ -0,0 +1,399 @@ +#!/usr/bin/python2.7 +# +# Copyright 2015 The Rust Project Developers. See the COPYRIGHT +# file at the top-level directory of this distribution and at +# http://rust-lang.org/COPYRIGHT. +# +# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +# http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +# <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +# option. This file may not be copied, modified, or distributed +# except according to those terms. + +""" +Testing dec2flt +=============== +These are *really* extensive tests. Expect them to run for hours. Due to the +nature of the problem (the input is a string of arbitrary length), exhaustive +testing is not really possible. Instead, there are exhaustive tests for some +classes of inputs for which that is feasible and a bunch of deterministic and +random non-exhaustive tests for covering everything else. + +The actual tests (generating decimal strings and feeding them to dec2flt) is +performed by a set of stand-along rust programs. This script compiles, runs, +and supervises them. In particular, the programs report the strings they +generate and the floating point numbers they converted those strings to. + +You can run specific tests rather than all of them by giving their names +(without .rs extension) as command line parameters. + +Verification +------------ +The tricky part is not generating those inputs but verifying the outputs. +Comparing with the result of Python's float() does not cut it because +(and this is apparently undocumented) although Python includes a version of +Martin Gay's code including the decimal-to-float part, it doesn't actually use +it for float() (only for round()) instead relying on the system scanf() which +is not necessarily completely accurate. + +Instead, we take the input and compute the true value with bignum arithmetic +(as a fraction, using the ``fractions`` module). + +Given an input string and the corresponding float computed via Rust, simply +decode the float into f * 2^k (for intergers f, k) and the ULP. +We can now easily compute the error and check if it is within 0.5 ULP as it +should be. Zero and infinites are handled similarly: + +- If the approximation is 0.0, the exact value should be *less or equal* + half the smallest denormal float: the smallest denormal floating point + number has an odd mantissa (00...001) and thus half of that is rounded + to 00...00, i.e., zero. +- If the approximation is Inf, the exact value should be *greater or equal* + to the largest finite float + 0.5 ULP: the largest finite float has an odd + mantissa (11...11), so that plus half an ULP is rounded up to the nearest + even number, which overflows. + +Implementation details +---------------------- +This directory contains a set of single-file Rust programs that perform +tests with a particular class of inputs. Each is compiled and run without +parameters, outputs (f64, f32, decimal) pairs to verify externally, and +in any case either exits gracefully or with a panic. + +If a test binary writes *anything at all* to stderr or exits with an +exit code that's not 0, the test fails. +The output on stdout is treated as (f64, f32, decimal) record, encoded thusly: + +- The first eight bytes are a binary64 (native endianness). +- The following four bytes are a binary32 (native endianness). +- Then the corresponding string input follows, in ASCII (no newline). +- The record is terminated with a newline. + +Incomplete records are an error. Not-a-Number bit patterns are invalid too. + +The tests run serially but the validaition for a a single test is parallelized +with ``multiprocessing``. Each test is launched as a subprocess. +One thread supervises it: Accepts and enqueues records to validate, observe +stderr, and waits for the process to exit. A set of worker processes perform +the validation work for the outputs enqueued there. Another thread listens +for progress updates from the workers. + +Known issues +------------ +Some errors (e.g., NaN outputs) aren't handled very gracefully. +Also, if there is an exception or the process is interrupted (at least on +Windows) the worker processes are leaked and stick around forever. +They're only a few megabytes each, but still, this script should not be run +if you aren't prepared to manually kill a lot of orphaned processes. +""" +from __future__ import print_function +import sys +import os.path +import time +import struct +from fractions import Fraction +from collections import namedtuple +from subprocess import Popen, check_call, PIPE +from glob import glob +import multiprocessing +import Queue +import threading +import ctypes +import binascii + +NUM_WORKERS = 2 +UPDATE_EVERY_N = 50000 +INF = namedtuple('INF', '')() +NEG_INF = namedtuple('NEG_INF', '')() +ZERO = namedtuple('ZERO', '')() +MAILBOX = None # The queue for reporting errors to the main process. +STDOUT_LOCK = threading.Lock() +test_name = None +child_processes = [] +exit_status = 0 + +def msg(*args): + with STDOUT_LOCK: + print("[" + test_name + "]", *args) + sys.stdout.flush() + + +def write_errors(): + global exit_status + f = open("errors.txt", 'w') + have_seen_error = False + while True: + args = MAILBOX.get() + if args is None: + f.close() + break + print(*args, file=f) + f.flush() + if not have_seen_error: + have_seen_error = True + msg("Something is broken:", *args) + msg("Future errors logged to errors.txt") + exit_status = 101 + + +def rustc(test): + rs = test + '.rs' + exe = test + '.exe' # hopefully this makes it work on *nix + print("compiling", test) + sys.stdout.flush() + check_call(['rustc', rs, '-o', exe]) + + +def run(test): + global test_name + test_name = test + + t0 = time.clock() + msg("setting up supervisor") + exe = test + '.exe' + proc = Popen(exe, bufsize=1<<20 , stdin=PIPE, stdout=PIPE, stderr=PIPE) + done = multiprocessing.Value(ctypes.c_bool) + queue = multiprocessing.Queue(maxsize=5)#(maxsize=1024) + workers = [] + for n in range(NUM_WORKERS): + worker = multiprocessing.Process(name='Worker-' + str(n + 1), + target=init_worker, + args=[test, MAILBOX, queue, done]) + workers.append(worker) + child_processes.append(worker) + for worker in workers: + worker.start() + msg("running test") + interact(proc, queue) + with done.get_lock(): + done.value = True + for worker in workers: + worker.join() + msg("python is done") + assert queue.empty(), "did not validate everything" + dt = time.clock() - t0 + msg("took", round(dt, 3), "seconds") + + +def interact(proc, queue): + line = "" + n = 0 + while proc.poll() is None: + line = proc.stdout.readline() + if not line: + continue + assert line.endswith('\n'), "incomplete line: " + repr(line) + queue.put(line) + line = "" + n += 1 + if n % UPDATE_EVERY_N == 0: + msg("got", str(n // 1000) + "k", "records") + msg("rust is done. exit code:", proc.returncode) + rest, stderr = proc.communicate() + if stderr: + msg("rust stderr output:", stderr) + for line in rest.split('\n'): + if not line: + continue + queue.put(line) + + +def main(): + global MAILBOX + tests = [os.path.splitext(f)[0] for f in glob('*.rs') + if not f.startswith('_')] + whitelist = sys.argv[1:] + if whitelist: + tests = [test for test in tests if test in whitelist] + if not tests: + print("Error: No tests to run") + sys.exit(1) + # Compile first for quicker feedback + for test in tests: + rustc(test) + # Set up mailbox once for all tests + MAILBOX = multiprocessing.Queue() + mailman = threading.Thread(target=write_errors) + mailman.daemon = True + mailman.start() + for test in tests: + if whitelist and test not in whitelist: + continue + run(test) + MAILBOX.put(None) + mailman.join() + + +# ---- Worker thread code ---- + + +POW2 = { e: Fraction(2) ** e for e in range(-1100, 1100) } +HALF_ULP = { e: (Fraction(2) ** e)/2 for e in range(-1100, 1100) } +DONE_FLAG = None + + +def send_error_to_supervisor(*args): + MAILBOX.put(args) + + +def init_worker(test, mailbox, queue, done): + global test_name, MAILBOX, DONE_FLAG + test_name = test + MAILBOX = mailbox + DONE_FLAG = done + do_work(queue) + + +def is_done(): + with DONE_FLAG.get_lock(): + return DONE_FLAG.value + + +def do_work(queue): + while True: + try: + line = queue.get(timeout=0.01) + except Queue.Empty: + if queue.empty() and is_done(): + return + else: + continue + bin64, bin32, text = line.rstrip().split() + validate(bin64, bin32, text) + + +def decode_binary64(x): + """ + Turn a IEEE 754 binary64 into (mantissa, exponent), except 0.0 and + infinity (positive and negative), which return ZERO, INF, and NEG_INF + respectively. + """ + x = binascii.unhexlify(x) + assert len(x) == 8, repr(x) + [bits] = struct.unpack(b'>Q', x) + if bits == 0: + return ZERO + exponent = (bits >> 52) & 0x7FF + negative = bits >> 63 + low_bits = bits & 0xFFFFFFFFFFFFF + if exponent == 0: + mantissa = low_bits + exponent += 1 + if mantissa == 0: + return ZERO + elif exponent == 0x7FF: + assert low_bits == 0, "NaN" + if negative: + return NEG_INF + else: + return INF + else: + mantissa = low_bits | (1 << 52) + exponent -= 1023 + 52 + if negative: + mantissa = -mantissa + return (mantissa, exponent) + + +def decode_binary32(x): + """ + Turn a IEEE 754 binary32 into (mantissa, exponent), except 0.0 and + infinity (positive and negative), which return ZERO, INF, and NEG_INF + respectively. + """ + x = binascii.unhexlify(x) + assert len(x) == 4, repr(x) + [bits] = struct.unpack(b'>I', x) + if bits == 0: + return ZERO + exponent = (bits >> 23) & 0xFF + negative = bits >> 31 + low_bits = bits & 0x7FFFFF + if exponent == 0: + mantissa = low_bits + exponent += 1 + if mantissa == 0: + return ZERO + elif exponent == 0xFF: + if negative: + return NEG_INF + else: + return INF + else: + mantissa = low_bits | (1 << 23) + exponent -= 127 + 23 + if negative: + mantissa = -mantissa + return (mantissa, exponent) + + +MIN_SUBNORMAL_DOUBLE = Fraction(2) ** -1074 +MIN_SUBNORMAL_SINGLE = Fraction(2) ** -149 # XXX unsure +MAX_DOUBLE = (2 - Fraction(2) ** -52) * (2 ** 1023) +MAX_SINGLE = (2 - Fraction(2) ** -23) * (2 ** 127) +MAX_ULP_DOUBLE = 1023 - 52 +MAX_ULP_SINGLE = 127 - 23 +DOUBLE_ZERO_CUTOFF = MIN_SUBNORMAL_DOUBLE / 2 +DOUBLE_INF_CUTOFF = MAX_DOUBLE + 2 ** (MAX_ULP_DOUBLE - 1) +SINGLE_ZERO_CUTOFF = MIN_SUBNORMAL_SINGLE / 2 +SINGLE_INF_CUTOFF = MAX_SINGLE + 2 ** (MAX_ULP_SINGLE - 1) + +def validate(bin64, bin32, text): + double = decode_binary64(bin64) + single = decode_binary32(bin32) + real = Fraction(text) + + if double is ZERO: + if real > DOUBLE_ZERO_CUTOFF: + record_special_error(text, "f64 zero") + elif double is INF: + if real < DOUBLE_INF_CUTOFF: + record_special_error(text, "f64 inf") + elif double is NEG_INF: + if -real < DOUBLE_INF_CUTOFF: + record_special_error(text, "f64 -inf") + elif len(double) == 2: + sig, k = double + validate_normal(text, real, sig, k, "f64") + else: + assert 0, "didn't handle binary64" + if single is ZERO: + if real > SINGLE_ZERO_CUTOFF: + record_special_error(text, "f32 zero") + elif single is INF: + if real < SINGLE_INF_CUTOFF: + record_special_error(text, "f32 inf") + elif single is NEG_INF: + if -real < SINGLE_INF_CUTOFF: + record_special_error(text, "f32 -inf") + elif len(single) == 2: + sig, k = single + validate_normal(text, real, sig, k, "f32") + else: + assert 0, "didn't handle binary32" + +def record_special_error(text, descr): + send_error_to_supervisor(text.strip(), "wrongly rounded to", descr) + + +def validate_normal(text, real, sig, k, kind): + approx = sig * POW2[k] + error = abs(approx - real) + if error > HALF_ULP[k]: + record_normal_error(text, error, k, kind) + + +def record_normal_error(text, error, k, kind): + one_ulp = HALF_ULP[k + 1] + assert one_ulp == 2 * HALF_ULP[k] + relative_error = error / one_ulp + text = text.strip() + try: + err_repr = float(relative_error) + except ValueError: + err_repr = str(err_repr).replace('/', ' / ') + send_error_to_supervisor(err_repr, "ULP error on", text, "(" + kind + ")") + + +if __name__ == '__main__': + main() diff --git a/src/etc/test-float-parse/short-decimals.rs b/src/etc/test-float-parse/short-decimals.rs new file mode 100644 index 00000000000..baefb9c9305 --- /dev/null +++ b/src/etc/test-float-parse/short-decimals.rs @@ -0,0 +1,29 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +mod _common; + +use _common::validate; + +fn main() { + // Skip e = 0 because small-u32 already does those. + for e in 1..301 { + for i in 0..10000 { + // If it ends in zeros, the parser will strip those (and adjust the exponent), + // which almost always (except for exponents near +/- 300) result in an input + // equivalent to something we already generate in a different way. + if i % 10 == 0 { + continue; + } + validate(format!("{}e{}", i, e)); + validate(format!("{}e-{}", i, e)); + } + } +} diff --git a/src/etc/test-float-parse/subnorm.rs b/src/etc/test-float-parse/subnorm.rs new file mode 100644 index 00000000000..70682c9b218 --- /dev/null +++ b/src/etc/test-float-parse/subnorm.rs @@ -0,0 +1,23 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +mod _common; + +use std::mem::transmute; +use _common::validate; + +fn main() { + for bits in 0u32..(1 << 21) { + let single: f32 = unsafe { transmute(bits) }; + validate(format!("{:e}", single)); + let double: f64 = unsafe { transmute(bits as u64) }; + validate(format!("{:e}", double)); + } +} diff --git a/src/etc/test-float-parse/tiny-pow10.rs b/src/etc/test-float-parse/tiny-pow10.rs new file mode 100644 index 00000000000..a01c6d5a078 --- /dev/null +++ b/src/etc/test-float-parse/tiny-pow10.rs @@ -0,0 +1,21 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +mod _common; + +use _common::validate; + +fn main() { + for e in 301..327 { + for i in 0..100000 { + validate(format!("{}e-{}", i, e)); + } + } +} diff --git a/src/etc/test-float-parse/u32-small.rs b/src/etc/test-float-parse/u32-small.rs new file mode 100644 index 00000000000..a4e8488e745 --- /dev/null +++ b/src/etc/test-float-parse/u32-small.rs @@ -0,0 +1,19 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +mod _common; + +use _common::validate; + +fn main() { + for i in 0..(1 << 19) { + validate(i.to_string()); + } +} diff --git a/src/etc/test-float-parse/u64-pow2.rs b/src/etc/test-float-parse/u64-pow2.rs new file mode 100644 index 00000000000..a31304d3f68 --- /dev/null +++ b/src/etc/test-float-parse/u64-pow2.rs @@ -0,0 +1,28 @@ +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +mod _common; + +use _common::validate; +use std::u64; + +fn main() { + for exp in 19..64 { + let power: u64 = 1 << exp; + validate(power.to_string()); + for offset in 1..123 { + validate((power + offset).to_string()); + validate((power - offset).to_string()); + } + } + for offset in 0..123 { + validate((u64::MAX - offset).to_string()); + } +} |
