Computer Chess Club Archives


Subject: Re: some other points addressed

Author: Robert Hyatt

Date: 08:20:32 08/07/03


On August 07, 2003 at 08:46:43, Vincent Diepeveen wrote:

>On August 07, 2003 at 08:08:57, Sune Fischer wrote:
>
>>On August 07, 2003 at 07:49:20, Vincent Diepeveen wrote:
>>
>>>>65% is disappointing for Crafty. It should get the 65% from register and cache
>>>>like all other programs, plus whatever speedup the bitboards get.
>>>
>>>Wrong.
>>>
>>>Non-bitboarders like Diep also have a lot of potential on more complex CPUs.
>>
>>Of course.
>>
>>> a) better usage of more registers (Crafty does things very simply; see the code)
>>> b) better usage of PGO (profile-guided optimization)
>>
>>Yes, I think this might actually help non-bitboarders more. I got nothing from
>>PGO last I tested; I don't really have that many branches either, mostly tables.
>>
>>> c) better icache usage (crafty fits within icache, nothing to optimize)
>>
>>Are you sure about that?
>>The rotated tables alone are 512 kB afaik.
>
>I said icache, Sune. You are referring to dcache.
>
>>> d) better BPT (branch prediction table) usage. Crafty simply fits in
>>>    smaller BPTs.
>>>
>>>My guess is Sjeng is so fast because of the extra registers and the bigger BPT,
>>>in combination with faster (75%) random latency, measured with Dieter's dblat
>>>program: 229 ns latency on the Opteron with 500MB cache, versus 400 on my dual
>>>K7 machine. The dual Xeon also gives around 400.
>>>
>>>Do not underestimate point d.
>>>
>>>Crafty is a simple program. The main gain, 33% directly, comes from 32==>64
>>>bits. RAM latency then accounts for the vast rest.
>>>
>>>Please compare the speedup of Crafty from the K7 to the Itanium2 CPU:
>>>  Itanium2 Madison 1.3GHz == 907
>>>  MP2400 2.0GHz           == 1156
>>>  XP2100 1.73GHz          == 1022
>>>  XP1700 1.46GHz          == 867
>>>  XP1800 1.6GHz           == 903
>>>
>>>So for Crafty, K7 1.6GHz == Itanium2 Madison 1.3GHz
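
The clock-for-clock comparison in the list above can be made explicit by dividing each score by its clock speed. The following is only a sketch of that arithmetic: `struct cpu_result` and `score_per_ghz` are hypothetical names, and the scores are copied from the list as quoted (the post does not state their units).

```c
#include <assert.h>
#include <stddef.h>

/* One row of the quoted list: CPU name, clock in GHz, reported score. */
struct cpu_result {
    const char *name;
    double ghz;
    double score;
};

/* Figures as quoted in the post; nothing here is measured independently. */
static const struct cpu_result results[] = {
    { "Itanium2 Madison", 1.30,  907.0 },
    { "MP2400",           2.00, 1156.0 },
    { "XP2100",           1.73, 1022.0 },
    { "XP1700",           1.46,  867.0 },
    { "XP1800",           1.60,  903.0 },
};

/* Score normalised by clock speed (hypothetical helper). */
static double score_per_ghz(const struct cpu_result *r) {
    assert(r != NULL && r->ghz > 0.0);
    return r->score / r->ghz;
}
```

With these figures the Itanium2 comes out at roughly 698 per GHz and the XP1800 at roughly 564 per GHz, a ratio of about 1.24 clock for clock.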
>>>
>>>Now Diep, a far more complex program, can profit more from a complex CPU,
>>>PGO, and loops:
>>>  K7 2GHz == Itanium2 1.3GHz
>>>
>>>In short, Crafty profits less from new-generation CPUs than complex
>>>commercial programs do.
>>>
>>>Point made clear?
>
>>You seem to have missed one crucial point.
>>Crafty is a 64-bit program, which means it's slow on 32-bit; even I have found
>>that doing a lookup is faster than shifting. I simply never do 1<<sq; I use a table
>
>That's 33% at most. Just look at what the Alpha 21264C scores versus the
>architecturally similar K7: about a 33% difference.
>
>Itanium is a new-generation complex CPU, too complex for bitboarders, and its
>latency to main memory isn't very impressive, which is bad luck for you.
>
>what you need is fast ram.
>
>Ever measured the difference between RAM speeds for your thing Sune?
>
>You should.
>
>So measure LATENCY differences. If machine X has 220 ns latency and some
>other machine has 400 ns latency, just measure how much that speeds things up for you.
>
>
>
>>for that. Little things like that are all over the program; when I remove them
>>and go pure 64-bit, I do think a factor of 2 clock for clock is reachable.
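
The table lookup described above, used instead of computing `1<<sq`, might look roughly like this. It is a minimal sketch rather than code from any of the programs discussed; `set_mask`, `init_set_mask`, `mask_by_table`, and `mask_by_shift` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed single-bit masks, one per square (illustrative name). */
static uint64_t set_mask[64];

/* Fill the table once at startup: set_mask[sq] has only bit sq set. */
static void init_set_mask(void) {
    for (int sq = 0; sq < 64; sq++)
        set_mask[sq] = (uint64_t)1 << sq;
}

/* Table version: a single 64-bit load from a 512-byte table. */
static uint64_t mask_by_table(int sq) {
    assert(sq >= 0 && sq < 64);
    return set_mask[sq];
}

/* Shift version: on a 32-bit target, a 64-bit shift by a variable amount
   typically expands to several instructions; on a 64-bit target it is
   usually a single shift. */
static uint64_t mask_by_shift(int sq) {
    assert(sq >= 0 && sq < 64);
    return (uint64_t)1 << sq;
}
```

Both functions return the same value once `init_set_mask()` has been called; which one is faster depends on the target and the compiler, so it is worth timing both.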
>
>No. Perhaps 33% for just going from 32 to 64 bits. You're really underestimating
>how fast the overhead runs on the K7 here.
>
>In those instructions there are very few branch mispredictions, few
>register stalls, etc. It's all just a few more instructions to do 64-bit work
>on 32-bit processors.
>
>The data structure itself, however, is a slow thing when compared to non-bitboard.
>That's a different discussion, however.
>
>>>See for crafty specint:
>>
>>After I saw they tested with 32 bit binaries, I'm not prepared to give them much
>>credit.
>>Frankly I want Eugene or Hyatt to produce the binary, needs to be done right or
>>you lose 30% real quick. The pure C version is a lowest-common-denominator
>>compile; it sucks, basically.
>
>I wouldn't trust Hyatt to produce even a text file with speedup numbers, but aside
>from that, yes, Eugene probably has some cool Crafty executables.
>
>>I also want to see other bitboard progs, I'm not sure Crafty is representative
>
>Crafty is a very poor example of programming:
>  - inline assembly everywhere
>  - no nice loops; everything is written out, even for black & white
>  - every piece written out
>  - it doesn't compile very well with gcc or Visual C++ thanks to
>    all of that hacking, and Hyatt frankly doesn't care.
>

Right.  It won't compile on Linux, Windows, AIX, HP-UX, IRIX, UNICOS,
Solaris, VMS, Macintosh, Cray, Tru64.  It really only compiles on
Linux, and really only on my particular machine in my office.



>    To quote him: "My dual P4 xeon is what counts".
>
>>for all, my program is very different, for better or worse of course.
>>-S.
>
>Let's hope you don't make the same mistakes. The bad example Crafty sets is really
>that people start writing their own assembly. As if bitboards are fast anyway :)


