Author: Gerd Isenberg
Date: 06:37:21 07/11/03
On July 11, 2003 at 09:03:06, Walter Faxon wrote:

>On July 11, 2003 at 03:30:33, Gerd Isenberg wrote:
>
>>On July 10, 2003 at 21:05:09, Russell Reagan wrote:
>>
>>>Thanks Gerd. This method seems approximately equal to the bsf assembler method
>>>in speed on my Athlon (maybe a hair slower). Maybe this will be the fastest on a
>>>64-bit cpu?
>>
>>On x86-32 Walter's routine should be faster - due to the mul64 call with three 32
>>bit muls (vector path on athlon). On x86-64 there is only one mul rax, and that
>>is double direct path.
>>
>>Gerd
>
>
>Matt does it better by doing one fold prior to just a 32-bit multiply. This email
>is from him to me, I think quoting from an email he wrote to
>http://www.hackersdelight.org/ , based on the book and a good source for bit
>hackers.
>
<snip>

Thanks, Walter

I see, wonder why the compiler is not able to optimize the const multiplication,
if only one 32bit part of the 128-bit result is relevant ;-)

Regards,
Gerd
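
For readers without the snipped email, here is a rough sketch of what "one fold, then just a 32-bit multiply" could look like in C. It is an illustration under assumptions, not a quote from Matt's message. The multiplier 0x78291ACF is the constant commonly attributed to his folded bitscan. The names fold_hash, init_lsb_table and bit_scan_forward are made up for this example. The lookup table is filled at startup rather than hardcoded, so the code itself checks that the hash is collision-free.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: multiplier commonly attributed to Matt Taylor's folded bitscan. */
#define FOLD_MAGIC 0x78291ACFu

static int lsb_table[64];   /* hash index -> bit index, filled by init_lsb_table */

/* Map a non-zero bitboard to a 6-bit index using a single 32-bit multiply. */
static unsigned fold_hash(uint64_t bb) {
    bb ^= bb - 1;                                  /* ones up to and including the lowest set bit */
    uint32_t folded = (uint32_t)bb ^ (uint32_t)(bb >> 32);   /* fold 64 bits into 32 */
    return (uint32_t)(folded * FOLD_MAGIC) >> 26;  /* keep the top 6 bits of the low 32 */
}

/* Build the table by hashing each single-bit board; assert there is no collision. */
static void init_lsb_table(void) {
    for (int i = 0; i < 64; i++)
        lsb_table[i] = -1;
    for (int i = 0; i < 64; i++) {
        unsigned h = fold_hash((uint64_t)1 << i);
        assert(lsb_table[h] == -1);
        lsb_table[h] = i;
    }
}

/* Index of the least significant set bit; bb must be non-zero. */
static int bit_scan_forward(uint64_t bb) {
    return lsb_table[fold_hash(bb)];
}
```

Call init_lsb_table() once before the first bit_scan_forward(). Since the fold happens before the multiply, a 32-bit target needs only one 32-bit mul here, instead of the three that a full 64-bit constant multiplication costs via the mul64 helper.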
Last modified: Thu, 15 Apr 21 08:11:13 -0700