Currently, LLVM lowers a cttz8 on x86_64 to these instructions:

```asm
movzbl  %dil, %eax
bsfl    %eax, %eax
movl    $32, %ecx
cmovnel %eax, %ecx
cmpl    $32, %ecx
movl    $8, %eax
cmovnel %ecx, %eax
```

which has some unnecessary overhead, having two conditional moves.

To improve the codegen, we can zero extend the 8 bit integer, then set bit 8 and perform a cttz operation on the extended value. That way there's no conditional operation involved at all.
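As a rough illustration of why the widening trick gives the same answer, the sketch below compares it against the plain 8-bit result for every possible input. The helper name `cttz8_widened` is hypothetical and not part of this patch, and the sketch uses the stable `trailing_zeros` method rather than the `cttz16` intrinsic the patch calls. Bit 8 acts as a sentinel: for a zero input it is the lowest set bit, so the count is 8, and for any nonzero input a lower bit is found first.

```rust
/// Hypothetical helper (not part of the patch) that mirrors the
/// widening trick using the stable `trailing_zeros` method.
fn cttz8_widened(x: u8) -> u32 {
    // Zero-extend to 16 bits and set bit 8 as a sentinel, so the
    // widened value is never zero and the count never exceeds 8.
    ((x as u16) | 0x100).trailing_zeros()
}

fn main() {
    // Exhaustively compare against the plain 8-bit count.
    for x in 0..=u8::MAX {
        assert_eq!(cttz8_widened(x), x.trailing_zeros());
    }
    println!("widened cttz matches u8::trailing_zeros for all 256 inputs");
}
```

Because the widened value always has at least one bit set, the zero-input special case that required the conditional moves no longer arises.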
author: Björn Steinbrink <bsteinbr@gmail.com> 2015-04-27 00:18:02 +0200
committer: Björn Steinbrink <bsteinbr@gmail.com> 2015-04-29 14:45:23 +0200
commit: 36dccec2f39c7e1da7f056ea421ad5256df3fb0b (patch)
tree: 7a3cf65a4e09ba2c61f5425126e254faa2e1060f /src/libcore/num
parent: 8f991d1fc27a176254ebfe99ed7e5a339cb9c7e2 (diff)
1 files changed, 14 insertions, 1 deletions
diff --git a/src/libcore/num/mod.rs b/src/libcore/num/mod.rs
index 44d5333ce1f..b8638c5b09b 100644
--- a/src/libcore/num/mod.rs
+++ b/src/libcore/num/mod.rs
@@ -745,7 +745,20 @@ macro_rules! uint_impl {
         #[stable(feature = "rust1", since = "1.0.0")]
         #[inline]
         pub fn trailing_zeros(self) -> u32 {
-            unsafe { $cttz(self as $ActualT) as u32 }
+            // As of LLVM 3.6 the codegen for the zero-safe cttz8 intrinsic
+            // emits two conditional moves on x86_64. By promoting the value to
+            // u16 and setting bit 8, we get better code without any conditional
+            // operations.
+            // FIXME: There's a LLVM patch (http://reviews.llvm.org/D9284)
+            // pending, remove this workaround once LLVM generates better code
+            // for cttz8.
+            unsafe {
+                if $BITS == 8 {
+                    intrinsics::cttz16(self as u16 | 0x100) as u32
+                } else {
+                    $cttz(self as $ActualT) as u32
+                }
+            }
         }
 
         /// Shifts the bits to the left by a specified amount, `n`,