Author: Robert Hyatt
Date: 08:20:32 08/07/03
On August 07, 2003 at 08:46:43, Vincent Diepeveen wrote:

>On August 07, 2003 at 08:08:57, Sune Fischer wrote:
>
>>On August 07, 2003 at 07:49:20, Vincent Diepeveen wrote:
>>
>>>>65% is disappointing for Crafty. It should get the 65% from register and cache
>>>>like all other programs, plus whatever speedup the bitboards get.
>>>
>>>Wrong.
>>>
>>>Non bitboarders like diep have also a lot of potential for more complex cpu's.
>>
>>Of course.
>>
>>> a) better usage of more registers (crafty is doing things very simple see code)
>>> b) better usage of PGO (profile guided optimizations)
>>
>>Yes I think this might actually help non-bitboarders more, I got nothing from
>>PGO last I tested, I don't really have that many branches either, mostly tables.
>>
>>> c) better icache usage (crafty fits within icache, nothing to optimize)
>>
>>Are you sure about that?
>>The rotated tables alone are 512 kB afaik.
>
>i said icache Sune. You refer to dcache.
>
>>> d) better bpt (branch prediction table) usage. crafty fits in simply
>>>    smaller bpt's.
>>>
>>>My guess is sjeng is so fast because of the more registers and the bigger bpt in
>>>combination with faster (75%) random latency. measured with dieter's dblat
>>>program. 229 ns latency at opteron with 500MB cache. versus 400 at my dual K7
>>>machine. dual xeon also is giving around 400.
>>>
>>>Do not underestimate point d.
>>>
>>>Crafty is a simple program. Main profit directly 33% is 32==>64 bits. Then RAM
>>>latency does the vaste rest.
>>>
>>>Please compare speedup of crafty from K7 to itanium2 cpu:
>>>  itanium2 madison 1.3Ghz == 907
>>>  MP2400 2.0Ghz == 1156
>>>  XP2100 1.73Ghz == 1022
>>>  XP1700 1.46Ghz == 867
>>>  XP1800 1.6Ghz == 903
>>>
>>>So for crafty K7 1.6Ghz == Itanium2 madison 1.3Ghz
>>>
>>>Now diep way more complex program can profit more from complex cpu and
>>>the PGO and loops:
>>>  K7 2Ghz == itanium2 1.3Ghz
>>>
>>>Crafty profits less from new generation cpu's than complex commercial programs
>>>in short.
>>>
>>>Point made clear?
>
>>You seem to have missed one crucial point.
>>Crafty is 64 bit prog, which means it's slow on 32 bit, even I have found that
>>doing a lookup is faster than shifting, I simply never do 1<<sq, I use a table
>
>that's 33% at most. Just look to what the alpha 21264c scores versus similar
>architecture K7. 33% difference about.
>
>Itanium is a new generation complex cpu. too complex for bitboarders and it's
>latency to main memory isn't very impressive which is bad luck for you.
>
>what you need is fast ram.
>
>Ever measured the difference between RAM speeds for your thing Sune?
>
>You should.
>
>So measure LATENCY differences. If a machine X has 220 ns latency versus some
>other machine has 400 ns latency. Just measure what it speeds up for you.
>
>
>
>>for that. Little things like that are all over the program, when I remove this
>>and go pure 64 bit I do think a factor 2 clock for clock is reachable.
>
>no. perhaps 33% for just going from 32 to 64 bits. You're really underestimating
>how fast the overhead runs at the K7 here.
>
>In those instructions there is very little branch mis predictions little
>register stalls etc. It's all just a few more instructions code that 64 bits at
>32 bits processors.
>
>the datastructure itself however is a slow thing when compared to non-bitboard.
>that's however a different discussion.
>
>>>See for crafty specint:
>>
>>After I saw they tested with 32 bit binaries, I'm not prepared to give them much
>>credit.
>>Frankly I want Eugene or Hyatt to produce the binary, needs to be done right or
>>you lose 30% real quick. The pure C version is a lowest common denominator
>>compile, it sucks basicly.
>
>Hyatt i wouldn't trust producing a textfile with speedup numbers even, but aside
>from that yes Eugene probably has some cool executables from crafty.
>
>>I also want to see other bitboard progs, I'm not sure Crafty is representative
>
>crafty is very poor example of programming:
>  - inline assembly everywhere
>  - no nice loops but all written out black & white even
>  - every piece written out
>  - it doesn't compile very well with gcc or visual c++ thanks to
>    all of that hacking and hyatt doesn't care frankly.
>

Right. It won't compile on linux, windows, AIX, HP-UX, IRIX, UNICOS, Solaris,
VMS, Macintosh, Cray, True-64. It Really only compiles on linux, and really only
on my particular machine in my office.

>  To quote him: "My dual P4 xeon is what counts".
>
>>for all, my program is very different, for better or worse of course.
>>-S.
>
>Let's hope you don't have the same mistakes. the bad example of crafty is really
>that people start writing their own assembly. As if bitboards are fast anyway :)
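
For readers following the "1<<sq versus a table" point quoted above, here is a minimal sketch of the two ways to build a single-square bitboard mask. It is not taken from Crafty or from Sune's program; the names Bitboard, set_mask, init_set_mask, mask_by_shift, mask_by_table and masks_agree are all made up for illustration. On a 32-bit CPU the compiler typically has to emulate the 64-bit shift with several instructions, which is presumably why a table load can come out ahead there; whether it actually does depends on the compiler and the cache, so measure before trusting it.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;        /* hypothetical name for a 64-bit board */

/* Hypothetical lookup table: set_mask[sq] has only bit `sq` set. */
Bitboard set_mask[64];

/* Fill the table once at program startup. */
void init_set_mask(void)
{
    for (int sq = 0; sq < 64; sq++)
        set_mask[sq] = (Bitboard)1 << sq;
}

/* Variant 1: compute the mask with a 64-bit shift.
   On a 32-bit target this usually expands to several instructions. */
Bitboard mask_by_shift(int sq)
{
    return (Bitboard)1 << sq;
}

/* Variant 2: load the precomputed mask from the table.
   Costs a memory access, so it relies on the table staying in cache. */
Bitboard mask_by_table(int sq)
{
    return set_mask[sq];
}

/* Sanity check: returns 1 if both variants agree on all 64 squares.
   init_set_mask() must have been called first. */
int masks_agree(void)
{
    for (int sq = 0; sq < 64; sq++)
        if (mask_by_shift(sq) != mask_by_table(sq))
            return 0;
    return 1;
}
```

Call init_set_mask() once before any lookup. Both variants return the same value for sq in 0..63, so swapping one for the other is purely a speed question, and on a native 64-bit machine the shift is a single instruction and the table buys nothing.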