| author | Björn Steinbrink <bsteinbr@gmail.com> | 2015-04-27 00:18:02 +0200 |
|---|---|---|
| committer | Björn Steinbrink <bsteinbr@gmail.com> | 2015-04-29 14:45:23 +0200 |
| commit | 36dccec2f39c7e1da7f056ea421ad5256df3fb0b | |
| tree | 7a3cf65a4e09ba2c61f5425126e254faa2e1060f /src/libcore/num | |
| parent | 8f991d1fc27a176254ebfe99ed7e5a339cb9c7e2 | |
Currently, LLVM lowers a cttz8 on x86_64 to these instructions:
```asm
movzbl %dil, %eax
bsfl %eax, %eax
movl $32, %ecx
cmovnel %eax, %ecx
cmpl $32, %ecx
movl $8, %eax
cmovnel %ecx, %eax
```
which carries unnecessary overhead: two conditional moves, needed only to
handle the zero-input case.
To improve the codegen, we can zero-extend the 8-bit integer to 16 bits,
set bit 8, and perform the cttz on the extended value. The input to the
16-bit cttz is then guaranteed to be nonzero (and a zero input correctly
yields 8), so no conditional operations are involved at all.
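The trick can be sketched in plain Rust (this is an illustration of the technique, not the patch itself; the patch applies it inside the `uint_impl!` macro using the unstable `cttz16` intrinsic, whereas this sketch uses the stable `trailing_zeros` on `u16`):

```rust
// Emulate trailing_zeros for u8 via promotion to u16.
// Setting bit 8 guarantees the 16-bit value is nonzero, so the
// count never hits the zero-input special case; for x == 0 the
// first set bit is bit 8, giving the correct answer of 8.
fn trailing_zeros_u8(x: u8) -> u32 {
    (x as u16 | 0x100).trailing_zeros()
}

fn main() {
    assert_eq!(trailing_zeros_u8(0), 8);            // all-zero input
    assert_eq!(trailing_zeros_u8(1), 0);            // lowest bit set
    assert_eq!(trailing_zeros_u8(0b1010_0000), 5);  // first set bit at 5
}
```

Because the ORed value is never zero, LLVM can lower the count to a single `bsf`-style sequence with no `cmov`s.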
Diffstat (limited to 'src/libcore/num')
| -rw-r--r-- | src/libcore/num/mod.rs | 15 |
1 file changed, 14 insertions(+), 1 deletion(-)
```diff
diff --git a/src/libcore/num/mod.rs b/src/libcore/num/mod.rs
index 44d5333ce1f..b8638c5b09b 100644
--- a/src/libcore/num/mod.rs
+++ b/src/libcore/num/mod.rs
@@ -745,7 +745,20 @@ macro_rules! uint_impl {
         #[stable(feature = "rust1", since = "1.0.0")]
         #[inline]
         pub fn trailing_zeros(self) -> u32 {
-            unsafe { $cttz(self as $ActualT) as u32 }
+            // As of LLVM 3.6 the codegen for the zero-safe cttz8 intrinsic
+            // emits two conditional moves on x86_64. By promoting the value to
+            // u16 and setting bit 8, we get better code without any conditional
+            // operations.
+            // FIXME: There's a LLVM patch (http://reviews.llvm.org/D9284)
+            // pending, remove this workaround once LLVM generates better code
+            // for cttz8.
+            unsafe {
+                if $BITS == 8 {
+                    intrinsics::cttz16(self as u16 | 0x100) as u32
+                } else {
+                    $cttz(self as $ActualT) as u32
+                }
+            }
         }

         /// Shifts the bits to the left by a specified amount, `n`,
```
