Author: Roberto Nerici
Date: 04:42:58 02/20/04
On February 19, 2004 at 19:24:10, Russell Reagan wrote:

>On February 19, 2004 at 18:16:17, Dann Corbit wrote:
>
>>You might want to examine here and below for an alternative pure C model that
>>seems to be just about the same speed:
>>http://www.talkchess.com/forums/1/message.html?349781
>
>I agree. The performance of the bsf/bsr instructions seems to depend greatly on
>what brand of CPU you are using. For instance, on a PIII, it is about ten times
>faster than it is on an Athlon. On the Athlon, the C versions are usually
>faster. I don't know about the P4.
>
>This is the one I usually use. Eugene Nalimov wrote it for the Itanium, but it
>runs about as fast or faster than most of the other routines I've tried on
>32-bit hardware. It requires one table lookup, but the table is only 256
>elements so it is cache friendly. The table data is just the first bit of the
>byte.
>
[code snipped]

>Have a look here for some others that I compared. There are comparisons for both
>32-bit and 64-bit hardware.
>
>http://chessprogramming.org/cccsearch/ccc.php?art_id=333679

This was an interesting test that you did. However, my results are a bit
different. On a P3-733 and using MSVC6, I found the "Gerd" test the fastest by a
fair bit, followed by "table16" and "eugene2", with "eugene" (the one you're
using) slower still.

I had to make some very minor changes to get it to compile under VC6, but
nothing that should make a difference. However, I did then make one further
change: I declared the scan routines as __inline and this speeded them all up
compared to letting the compiler inline what it wanted.

Roberto/.
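For readers who have not seen the snipped code, here is a rough sketch of the general idea being discussed: narrow a 64-bit bitboard down to one byte, then do a single lookup in a 256-entry "first bit of the byte" table. This is not Eugene Nalimov's routine, nor any of the routines in the linked comparison. The names first_bit_in_byte, init_first_bit_table and first_one are made up for illustration, and the narrowing steps are just one plausible way to do it. It uses C99 `static inline`; under MSVC6 the keyword would presumably be `__inline`, as in the post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table: first_bit_in_byte[b] is the index (0..7) of the
   lowest set bit of b. Entry 0 is never used for a valid lookup. */
static unsigned char first_bit_in_byte[256];

/* Fill the 256-entry table once at startup. */
static void init_first_bit_table(void)
{
    int b, i;
    first_bit_in_byte[0] = 0;
    for (b = 1; b < 256; b++) {
        for (i = 0; i < 8; i++) {
            if (b & (1 << i)) {
                first_bit_in_byte[b] = (unsigned char)i;
                break;
            }
        }
    }
}

/* Return the index (0..63) of the lowest set bit of bb.
   bb must be non-zero; the table must already be initialised. */
static inline int first_one(uint64_t bb)
{
    int shift = 0;
    uint32_t w;

    assert(bb != 0);

    /* Pick the lowest non-zero 32-bit half. */
    w = (uint32_t)bb;
    if (w == 0) {
        w = (uint32_t)(bb >> 32);
        shift = 32;
    }
    /* Narrow to the lowest non-zero 16-bit half, then byte. */
    if ((w & 0xFFFFu) == 0) {
        w >>= 16;
        shift += 16;
    }
    if ((w & 0xFFu) == 0) {
        w >>= 8;
        shift += 8;
    }
    /* One lookup in the small, cache-friendly table. */
    return shift + first_bit_in_byte[w & 0xFFu];
}
```

Whether forcing the routine inline helps will depend on the compiler and the CPU, so it is worth timing both ways, as the results above suggest.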
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.