Make float::from_bits transmute (and update the documentation to reflect this).

The current implementation/documentation was made to avoid sNaN because of potential safety issues implied by old/bad LLVM documentation. These issues aren't real, so we can just make the implementation transmute (as permitted by the existing documentation of this method). Also the documentation didn't actually match the behaviour: it said we may change sNaNs, but in fact we canonicalized *all* NaNs. Also an example in the documentation was wrong: it said we *always* change sNaNs, when the documentation was explicitly written to indicate it was implementation-defined. This makes to_bits and from_bits perfectly roundtrip cross-platform, except for one caveat: although the 2008 edition of IEEE-754 specifies how to interpet the signaling bit, earlier editions didn't. This lead to some platforms picking the opposite interpretation, so all signaling NaNs on x86/ARM are quiet on MIPS, and vice-versa. NaN-boxing is a fairly important optimization, while we don't even guarantee that float operations properly preserve signalingness. As such, this seems like the more natural strategy to take (as opposed to trying to mangle the signaling bit on a per-platform basis). This implementation is also, of course, faster.
author: Alexis Beingessner <a.beingessner@gmail.com> 2017-11-15 15:05:47 -0500
committer: Alexis Beingessner <a.beingessner@gmail.com> 2017-11-23 16:55:52 -0500
commit: 439576fd7bedf741db5fb6a21c902e858d51f2a0 (patch)
tree: 20e757796ce44c0ff3af226a2388e7f11e3123b6
parent: 88a28ff6028cf197ed6b4185d8cd4887f05e3e07 (diff)
download: rust-439576fd7bedf741db5fb6a21c902e858d51f2a0.tar.gz
rust-439576fd7bedf741db5fb6a21c902e858d51f2a0.zip
2 files changed, 86 insertions, 76 deletions
diff --git a/src/libstd/f32.rs b/src/libstd/f32.rs
index 7ec6124dfa4..42182092ab1 100644
--- a/src/libstd/f32.rs
+++ b/src/libstd/f32.rs
@@ -998,10 +998,13 @@ impl f32 {
 
     /// Raw transmutation to `u32`.
     ///
-    /// Converts the `f32` into its raw memory representation,
-    /// similar to the `transmute` function.
+    /// This is currently identical to `transmute::<f32, u32>(self)` on all platforms.
     ///
-    /// Note that this function is distinct from casting.
+    /// See `from_bits` for some discussion of the portability of this operation
+    /// (there are almost no issues).
+    ///
+    /// Note that this function is distinct from `as` casting, which attempts to
+    /// preserve the *numeric* value, and not the bitwise value.
     ///
     /// # Examples
     ///
@@ -1018,17 +1021,33 @@ impl f32 {
 
     /// Raw transmutation from `u32`.
     ///
-    /// Converts the given `u32` containing the float's raw memory
-    /// representation into the `f32` type, similar to the
-    /// `transmute` function.
+    /// This is currently identical to `transmute::<u32, f32>(v)` on all platforms.
+    /// It turns out this is incredibly portable, for two reasons:
+    ///
+    /// * Floats and Ints have the same endianess on all supported platforms.
+    /// * IEEE-754 very precisely specifies the bit layout of floats.
+    ///
+    /// However there is one caveat: prior to the 2008 version of IEEE-754, how
+    /// to interpret the NaN signaling bit wasn't actually specified. Most platforms
+    /// (notably x86 and ARM) picked the interpretation that was ultimately
+    /// standardized in 2008, but some didn't (notably MIPS). As a result, all
+    /// signaling NaNs on MIPS are quiet NaNs on x86, and vice-versa.
+    ///
+    /// Rather than trying to preserve signaling-ness cross-platform, this
+    /// implementation favours preserving the exact bits. This means that
+    /// any payloads encoded in NaNs will be preserved even if the result of
+    /// this method is sent over the network from an x86 machine to a MIPS one.
+    ///
+    /// If the results of this method are only manipulated by the same
+    /// architecture that produced them, then there is no portability concern.
+    ///
+    /// If the input isn't NaN, then there is no portability concern.
     ///
-    /// There is only one difference to a bare `transmute`:
-    /// Due to the implications onto Rust's safety promises being
-    /// uncertain, if the representation of a signaling NaN "sNaN" float
-    /// is passed to the function, the implementation is allowed to
-    /// return a quiet NaN instead.
+    /// If you don't care about signalingness (very likely), then there is no
+    /// portability concern.
     ///
-    /// Note that this function is distinct from casting.
+    /// Note that this function is distinct from `as` casting, which attempts to
+    /// preserve the *numeric* value, and not the bitwise value.
     ///
     /// # Examples
     ///
@@ -1037,25 +1056,11 @@ impl f32 {
     /// let v = f32::from_bits(0x41480000);
     /// let difference = (v - 12.5).abs();
     /// assert!(difference <= 1e-5);
-    /// // Example for a signaling NaN value:
-    /// let snan = 0x7F800001;
-    /// assert_ne!(f32::from_bits(snan).to_bits(), snan);
     /// ```
     #[stable(feature = "float_bits_conv", since = "1.20.0")]
     #[inline]
-    pub fn from_bits(mut v: u32) -> Self {
-        const EXP_MASK: u32   = 0x7F800000;
-        const FRACT_MASK: u32 = 0x007FFFFF;
-        if v & EXP_MASK == EXP_MASK && v & FRACT_MASK != 0 {
-            // While IEEE 754-2008 specifies encodings for quiet NaNs
-            // and signaling ones, certain MIPS and PA-RISC
-            // CPUs treat signaling NaNs differently.
-            // Therefore to be safe, we pass a known quiet NaN
-            // if v is any kind of NaN.
-            // The check above only assumes IEEE 754-1985 to be
-            // valid.
-            v = unsafe { ::mem::transmute(NAN) };
-        }
+    pub fn from_bits(v: u32) -> Self {
+        // It turns out the safety issues with sNaN were overblown! Hooray!
         unsafe { ::mem::transmute(v) }
     }
 }
@@ -1646,25 +1651,15 @@ mod tests {
         assert_approx_eq!(f32::from_bits(0x41480000), 12.5);
         assert_approx_eq!(f32::from_bits(0x44a72000), 1337.0);
         assert_approx_eq!(f32::from_bits(0xc1640000), -14.25);
-    }
-    #[test]
-    fn test_snan_masking() {
-        // NOTE: this test assumes that our current platform
-        // implements IEEE 754-2008 that specifies the difference
-        // in encoding of quiet and signaling NaNs.
-        // If you are porting Rust to a platform that does not
-        // implement IEEE 754-2008 (but e.g. IEEE 754-1985, which
-        // only says that "Signaling NaNs shall be reserved operands"
-        // but doesn't specify the actual setup), feel free to
-        // cfg out this test.
-        let snan: u32 = 0x7F801337;
-        const QNAN_MASK: u32  = 0x00400000;
-        let nan_masked_fl = f32::from_bits(snan);
-        let nan_masked = nan_masked_fl.to_bits();
-        // Ensure that signaling NaNs don't stay the same
-        assert_ne!(nan_masked, snan);
-        // Ensure that we have a quiet NaN
-        assert_ne!(nan_masked & QNAN_MASK, 0);
-        assert!(nan_masked_fl.is_nan());
+
+        // Check that NaNs roundtrip their bits regardless of signalingness
+        // 0xA is 0b1010; 0x5 is 0b0101 -- so these two together clobbers all the mantissa bits
+        let masked_nan1 = f32::NAN.to_bits() ^ 0x002A_AAAA;
+        let masked_nan2 = f32::NAN.to_bits() ^ 0x0055_5555;
+        assert!(f32::from_bits(masked_nan1).is_nan());
+        assert!(f32::from_bits(masked_nan2).is_nan());
+
+        assert_eq!(f32::from_bits(masked_nan1).to_bits(), masked_nan1);
+        assert_eq!(f32::from_bits(masked_nan2).to_bits(), masked_nan2);
     }
 }
diff --git a/src/libstd/f64.rs b/src/libstd/f64.rs
index e0f0acc6ac4..d3a6ed60788 100644
--- a/src/libstd/f64.rs
+++ b/src/libstd/f64.rs
@@ -952,10 +952,13 @@ impl f64 {
 
     /// Raw transmutation to `u64`.
     ///
-    /// Converts the `f64` into its raw memory representation,
-    /// similar to the `transmute` function.
+    /// This is currently identical to `transmute::<f64, u64>(self)` on all platforms.
     ///
-    /// Note that this function is distinct from casting.
+    /// See `from_bits` for some discussion of the portability of this operation
+    /// (there are almost no issues).
+    ///
+    /// Note that this function is distinct from `as` casting, which attempts to
+    /// preserve the *numeric* value, and not the bitwise value.
     ///
     /// # Examples
     ///
@@ -972,17 +975,33 @@ impl f64 {
 
     /// Raw transmutation from `u64`.
     ///
-    /// Converts the given `u64` containing the float's raw memory
-    /// representation into the `f64` type, similar to the
-    /// `transmute` function.
+    /// This is currently identical to `transmute::<u64, f64>(v)` on all platforms.
+    /// It turns out this is incredibly portable, for two reasons:
+    ///
+    /// * Floats and Ints have the same endianess on all supported platforms.
+    /// * IEEE-754 very precisely specifies the bit layout of floats.
+    ///
+    /// However there is one caveat: prior to the 2008 version of IEEE-754, how
+    /// to interpret the NaN signaling bit wasn't actually specified. Most platforms
+    /// (notably x86 and ARM) picked the interpretation that was ultimately
+    /// standardized in 2008, but some didn't (notably MIPS). As a result, all
+    /// signaling NaNs on MIPS are quiet NaNs on x86, and vice-versa.
+    ///
+    /// Rather than trying to preserve signaling-ness cross-platform, this
+    /// implementation favours preserving the exact bits. This means that
+    /// any payloads encoded in NaNs will be preserved even if the result of
+    /// this method is sent over the network from an x86 machine to a MIPS one.
+    ///
+    /// If the results of this method are only manipulated by the same
+    /// architecture that produced them, then there is no portability concern.
     ///
-    /// There is only one difference to a bare `transmute`:
-    /// Due to the implications onto Rust's safety promises being
-    /// uncertain, if the representation of a signaling NaN "sNaN" float
-    /// is passed to the function, the implementation is allowed to
-    /// return a quiet NaN instead.
+    /// If the input isn't NaN, then there is no portability concern.
     ///
-    /// Note that this function is distinct from casting.
+    /// If you don't care about signalingness (very likely), then there is no
+    /// portability concern.
+    ///
+    /// Note that this function is distinct from `as` casting, which attempts to
+    /// preserve the *numeric* value, and not the bitwise value.
     ///
     /// # Examples
     ///
@@ -991,25 +1010,11 @@ impl f64 {
     /// let v = f64::from_bits(0x4029000000000000);
     /// let difference = (v - 12.5).abs();
     /// assert!(difference <= 1e-5);
-    /// // Example for a signaling NaN value:
-    /// let snan = 0x7FF0000000000001;
-    /// assert_ne!(f64::from_bits(snan).to_bits(), snan);
     /// ```
     #[stable(feature = "float_bits_conv", since = "1.20.0")]
     #[inline]
-    pub fn from_bits(mut v: u64) -> Self {
-        const EXP_MASK: u64   = 0x7FF0000000000000;
-        const FRACT_MASK: u64 = 0x000FFFFFFFFFFFFF;
-        if v & EXP_MASK == EXP_MASK && v & FRACT_MASK != 0 {
-            // While IEEE 754-2008 specifies encodings for quiet NaNs
-            // and signaling ones, certain MIPS and PA-RISC
-            // CPUs treat signaling NaNs differently.
-            // Therefore to be safe, we pass a known quiet NaN
-            // if v is any kind of NaN.
-            // The check above only assumes IEEE 754-1985 to be
-            // valid.
-            v = unsafe { ::mem::transmute(NAN) };
-        }
+    pub fn from_bits(v: u64) -> Self {
+        // It turns out the safety issues with sNaN were overblown! Hooray!
         unsafe { ::mem::transmute(v) }
     }
 }
@@ -1596,5 +1601,15 @@ mod tests {
         assert_approx_eq!(f64::from_bits(0x4029000000000000), 12.5);
         assert_approx_eq!(f64::from_bits(0x4094e40000000000), 1337.0);
         assert_approx_eq!(f64::from_bits(0xc02c800000000000), -14.25);
+
+        // Check that NaNs roundtrip their bits regardless of signalingness
+        // 0xA is 0b1010; 0x5 is 0b0101 -- so these two together clobbers all the mantissa bits
+        let masked_nan1 = f64::NAN.to_bits() ^ 0x000A_AAAA_AAAA_AAAA;
+        let masked_nan2 = f64::NAN.to_bits() ^ 0x0005_5555_5555_5555;
+        assert!(f64::from_bits(masked_nan1).is_nan());
+        assert!(f64::from_bits(masked_nan2).is_nan());
+
+        assert_eq!(f64::from_bits(masked_nan1).to_bits(), masked_nan1);
+        assert_eq!(f64::from_bits(masked_nan2).to_bits(), masked_nan2);
     }
 }
author	Alexis Beingessner <a.beingessner@gmail.com>	2017-11-15 15:05:47 -0500
committer	Alexis Beingessner <a.beingessner@gmail.com>	2017-11-23 16:55:52 -0500
commit	439576fd7bedf741db5fb6a21c902e858d51f2a0 (patch)
tree	20e757796ce44c0ff3af226a2388e7f11e3123b6
parent	88a28ff6028cf197ed6b4185d8cd4887f05e3e07 (diff)
download	rust-439576fd7bedf741db5fb6a21c902e858d51f2a0.tar.gz rust-439576fd7bedf741db5fb6a21c902e858d51f2a0.zip