Computer Chess Club Archives

Subject: Re: Puzzling Assembler Question

Author: Roberto Nerici

Date: 04:42:58 02/20/04


On February 19, 2004 at 19:24:10, Russell Reagan wrote:

>On February 19, 2004 at 18:16:17, Dann Corbit wrote:
>
>>You might want to examine here and below for an alternative pure C model that
>>seems to be just about the same speed:
>>http://www.talkchess.com/forums/1/message.html?349781
>
>I agree. The performance of the bsf/bsr instructions seems to depend greatly on
>what brand of CPU you are using. For instance, on a PIII, it is about ten times
>faster than it is on an Athlon. On the Athlon, the C versions are usually
>faster. I don't know about the P4.
>
>This is the one I usually use. Eugene Nalimov wrote it for the Itanium, but it
>runs about as fast or faster than most of the other routines I've tried on
>32-bit hardware. It requires one table lookup, but the table is only 256
>elements so it is cache friendly. The table data is just the first bit of the
>byte.
>

[code snipped]

>Have a look here for some others that I compared. There are comparisons for both
>32-bit and 64-bit hardware.
>
>http://chessprogramming.org/cccsearch/ccc.php?art_id=333679

That was an interesting test. However, my results are a bit different. On a
P3-733 with MSVC6, I found the "Gerd" test the fastest by a fair margin,
followed by "table16" and "eugene2", with "eugene" (the one you're using)
slower still.

I had to make some very minor changes to get it to compile under VC6, but
nothing that should make a difference. However, I did then make one further
change: I declared the scan routines as __inline, which sped them all up
compared to letting the compiler decide what to inline.
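
To show the kind of change I mean, here is a minimal sketch of a table-based
scan routine marked for inlining. It is not any of the routines from the test
("Gerd", "table16", "eugene", "eugene2"); the names first_one,
first_bit_in_byte, init_first_bit_table and SCAN_INLINE are made up for this
example, and it assumes a compiler that provides <stdint.h> (VC6 does not, so
there you would use unsigned __int64 instead).

```c
#include <stdint.h>

/* Hypothetical macro: MSVC spells the inlining hint __inline, C99 uses inline. */
#if defined(_MSC_VER)
#define SCAN_INLINE static __inline
#else
#define SCAN_INLINE static inline
#endif

/* first_bit_in_byte[b] is the index (0..7) of the lowest set bit of b.
   Entry 0 is never consulted by first_one() below. */
static unsigned char first_bit_in_byte[256];

/* Fill the 256-entry table once at program start. */
static void init_first_bit_table(void)
{
    int b, i;
    first_bit_in_byte[0] = 0;
    for (b = 1; b < 256; b++) {
        for (i = 0; i < 8; i++)
            if (b & (1 << i))
                break;
        first_bit_in_byte[b] = (unsigned char)i;
    }
}

/* Return the index (0..63) of the lowest set bit. bb must be nonzero. */
SCAN_INLINE int first_one(uint64_t bb)
{
    int shift = 0;
    /* Skip empty low bytes, then finish with a single table lookup. */
    while ((bb & 0xFF) == 0) {
        bb >>= 8;
        shift += 8;
    }
    return shift + first_bit_in_byte[bb & 0xFF];
}
```

Call init_first_bit_table() once before the first scan. Whether the inlining
hint helps will depend on the compiler and its settings, so it is worth timing
both ways.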

Roberto/.



Last modified: Thu, 15 Apr 21 08:11:13 -0700
