Computer Chess Club Archives


Subject: Re: Crafty profits little from Itanium and Opteron versus Commercials

Author: Robert Hyatt

Date: 08:17:46 08/07/03


On August 07, 2003 at 08:08:57, Sune Fischer wrote:

>On August 07, 2003 at 07:49:20, Vincent Diepeveen wrote:
>
>>>65% is disappointing for Crafty. It should get the 65% from register and cache
>>>like all other programs, plus whatever speedup the bitboards get.
>>
>>Wrong.
>>
>>Non bitboarders like diep have also a lot of potential for more complex cpu's.
>
>Of course.
>
>> a) better usage of more registers (crafty is doing things very simple see code)
>> b) better usage of PGO (profile guided optimizations)
>
>Yes I think this might actually help non-bitboarders more, I got nothing from
>PGO last I tested, I don't really have that many branches either, mostly tables.

Crafty gains about 15% from PGO on the Intel compiler.  Of course, if you
read Vincent's "profiling methodology" you might see why it fails for him.



>
>> c) better icache usage (crafty fits within icache, nothing to optimize)
>
>Are you sure about that?
>The rotated tables alone are 512 kB afaik.
>
>> d) better bpt (branch prediction table) usage. crafty fits in simply
>>    smaller bpt's.
>>
>>My guess is sjeng is so fast because of the more registers and the bigger bpt in
>>combination with faster (75%) random latency. measured with dieter's dblat
>>program. 229 ns latency at opteron with 500MB cache. versus 400 at my dual K7
>>machine. dual xeon also is giving around 400.
>>
>>Do not underestimate point d.
>>
>>Crafty is a simple program. The main profit, 33% directly, is 32==>64 bits. Then RAM
>>latency does the vast rest.
>>
>>Please compare speedup of crafty from K7 to itanium2 cpu:
>>  itanium2 madison 1.3Ghz == 907
>>  MP2400 2.0Ghz           == 1156
>>  XP2100 1.73Ghz          == 1022
>>  XP1700 1.46Ghz          == 867
>>  XP1800 1.6Ghz           == 903
>>
>>So for crafty K7 1.6Ghz == Itanium2 madison 1.3Ghz
>>
>>Now diep, a way more complex program, can profit more from a complex cpu and
>>the PGO and loops:
>>  K7 2Ghz == itanium2 1.3Ghz
>>
>>Crafty profits less from new generation cpu's than complex commercial programs
>>in short.
>>
>>Point made clear?
>
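
The clock-for-clock comparison in the list quoted above can be made explicit by dividing each figure by its clock speed. This is only a sketch: it assumes the quoted numbers are benchmark scores where higher is better, and the names `result`, `quoted_results`, `score_per_ghz` and `best_per_ghz` are made up for illustration. On those assumptions the Itanium2 comes out near 698 per GHz against roughly 564 for the XP1800, about 1.24 times as much per clock.

```c
#include <stddef.h>

/* One row of the quoted list: CPU name, clock in GHz, listed figure.
   Assumption: the figure is a score where higher means faster. */
struct result {
    const char *cpu;
    double ghz;
    double score;
};

/* The five rows exactly as quoted in the post. */
static const struct result quoted_results[] = {
    { "itanium2 madison", 1.30,  907.0 },
    { "MP2400",           2.00, 1156.0 },
    { "XP2100",           1.73, 1022.0 },
    { "XP1700",           1.46,  867.0 },
    { "XP1800",           1.60,  903.0 },
};
static const size_t quoted_count =
    sizeof quoted_results / sizeof quoted_results[0];

/* Score per GHz: a crude clock-for-clock figure. */
static double score_per_ghz(const struct result *r)
{
    return r->score / r->ghz;
}

/* Index of the row with the highest score per GHz (n must be > 0). */
static size_t best_per_ghz(const struct result *rows, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (score_per_ghz(&rows[i]) > score_per_ghz(&rows[best]))
            best = i;
    }
    return best;
}
```

Calling `best_per_ghz(quoted_results, quoted_count)` picks the Itanium2 row. The figure says nothing about why one chip does more per clock, which is what the rest of the thread argues about.
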
>You seem to have missed one crucial point.
>Crafty is a 64-bit program, which means it's slow on 32-bit hardware. Even I have found that
>doing a lookup is faster than shifting: I simply never do 1<<sq, I use a table
>for that. Little things like that are all over the program. When I remove them
>and go pure 64-bit, I do think a factor of 2 clock for clock is reachable.
>
>>See for crafty specint:
>
>After I saw they tested with 32 bit binaries, I'm not prepared to give them much
>credit.
>Frankly I want Eugene or Hyatt to produce the binary. It needs to be done right or
>you lose 30% real quick. The pure C version is a lowest common denominator
>compile, it sucks basically.
>
>I also want to see other bitboard progs, I'm not sure Crafty is representative
>for all, my program is very different, for better or worse of course.
>
>-S.
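
The "never do 1<<sq, use a table" remark quoted above can be illustrated with a minimal sketch. The names `bitboard`, `set_mask`, `init_set_mask`, `bit_by_shift` and `bit_by_table` are hypothetical and not taken from Crafty or from Sune's program. Both functions return the same 64-bit mask. On a 32-bit target a variable 64-bit shift has to be built from several 32-bit instructions, which is the usual reason given for preferring the table there; whether it actually wins depends on the compiler and on cache behaviour, so it needs measuring.

```c
#include <stdint.h>

typedef uint64_t bitboard;

/* One precomputed single-bit mask per square, 0..63. */
static bitboard set_mask[64];

/* Fill the table once at startup. */
static void init_set_mask(void)
{
    for (int sq = 0; sq < 64; sq++)
        set_mask[sq] = (bitboard)1 << sq;
}

/* Compute the mask with a 64-bit shift; sq must be in 0..63. */
static inline bitboard bit_by_shift(int sq)
{
    return (bitboard)1 << sq;
}

/* Fetch the same mask from the table; sq must be in 0..63
   and init_set_mask() must have run first. */
static inline bitboard bit_by_table(int sq)
{
    return set_mask[sq];
}
```

On a native 64-bit build the shift is a single instruction, so the table would probably stop paying for itself, which fits the point about removing such workarounds when going pure 64-bit.
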




Last modified: Thu, 15 Apr 21 08:11:13 -0700
