Author: Vincent Diepeveen
Date: 20:02:52 08/07/03
On August 07, 2003 at 09:20:35, Sune Fischer wrote:

>On August 07, 2003 at 08:46:43, Vincent Diepeveen wrote:
>
>>>You seem to have missed one crucial point.
>>>Crafty is a 64 bit prog, which means it's slow on 32 bit, even I have found that
>>>doing a lookup is faster than shifting, I simply never do 1<<sq, I use a table
>>
>>that's 33% at most. Just look at what the alpha 21264c scores versus similar
>>architecture K7. 33% difference about.
>
>Actually it's more like 300%, but since it's only a fraction of the program
>overall it could be 33%. That would be just a guess though.
>
>>Itanium is a new generation complex cpu. too complex for bitboarders and its
>>latency to main memory isn't very impressive which is bad luck for you.
>
>Not really, I could never afford an Itanium so it's completely irrelevant for me.
>
>>what you need is fast ram.
>>
>>Ever measured the difference between RAM speeds for your thing Sune?
>>
>>You should.
>
>No I don't care about these artificial tests, you need to do realistic
>measurements, not run things in small tight loops, it's flawed.
>
>I remember you once suggested to run a small cache efficient pawn table, I tried
>that - it ran slower!!
>
>Who cares if some theoretical argument or some artificial test shows it's
>faster, when reality shows it's just slower.
>
>>So measure LATENCY differences. If a machine X has 220 ns latency versus some
>>other machine has 400 ns latency. Just measure what it speeds up for you.
>
>Don't think too much, just do the real tests. You can't figure out the result
>anyway, too many factors make the equation too complex.
>
>>>for that. Little things like that are all over the program, when I remove this
>>>and go pure 64 bit I do think a factor 2 clock for clock is reachable.
>>
>>no. perhaps 33% for just going from 32 to 64 bits. You're really underestimating
>>how fast the overhead runs at the K7 here.
>
>Pointless to argue this as we're just guessing both of us, but we'll see
>eventually.
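
For readers following the lookup-versus-shift point quoted above, here is a minimal C sketch of the two ways to get the single-bit mask for a square. The names `set_mask`, `init_set_mask`, `mask_by_shift` and `mask_by_lookup` are hypothetical and not taken from Crafty or any engine in this thread. Which variant is faster depends on the CPU and compiler; the sketch only shows that they produce the same value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table of single-bit masks, one per square (0..63). */
static uint64_t set_mask[64];

/* Fill the table once at startup. */
static void init_set_mask(void) {
    for (int sq = 0; sq < 64; sq++)
        set_mask[sq] = (uint64_t)1 << sq;
}

/* Variant A: build the mask with a 64-bit shift on every call.
   On a 32-bit target the compiler must emulate this shift with
   several 32-bit instructions. */
static uint64_t mask_by_shift(int sq) {
    assert(sq >= 0 && sq < 64);
    return (uint64_t)1 << sq;
}

/* Variant B: fetch the precomputed mask from memory instead. */
static uint64_t mask_by_lookup(int sq) {
    assert(sq >= 0 && sq < 64);
    return set_mask[sq];
}
```

Call `init_set_mask()` once before using `mask_by_lookup`. Timing the two variants inside a real search, rather than in a tight loop, is the kind of measurement the quoted exchange argues about.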
>
>>In those instructions there is very little branch mispredictions, little
>>register stalls etc. It's all just a few more instructions of code for 64 bits
>>on 32 bit processors.
>
>Right, I can hardly see an advantage for the Opteron in running 64 bit code,
>LOL.
>
>>the datastructure itself however is a slow thing when compared to non-bitboard.
>>that's however a different discussion.
>
>How is it slow?
>What kind of nps do you get relative to Crafty?
>You have the slowest of them all AFAIK, so I don't know why you keep mouthing
>off like this.

i generate moves 2.2 times faster than crafty at 32 bits processors to just name
one. At that itanium2 you care shit about, i'm still generating moves faster
than crafty.

note that itanium as far as i'm told has no direct BSF/BSR instruction, like a
real 64 bits cpu has no need for anyway. that opteron has one is just a service
to hyatt which they should not make. Bob hasn't bought one yet AFAIK.

didn't read his last postings here. no time for that.

oh about that beloved itanium2 you dislike.

here is diep running a short 16 cpu batch at it:

vdiep    18394 99.6 1472.8 3979280 3961408 ?     R    00:54  22:26
vdiep    18395 99.3 1467.8 3962752 3948048 ?     R    00:54  22:20
vdiep    18396 99.9 1467.8 3962688 3948000 ?     R    00:54  22:29
vdiep    18397 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18398 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18399 99.9 1467.8 3962688 3948000 ?     R    00:54  22:29
vdiep    18400 99.9 1467.8 3962688 3948000 ?     R    00:54  22:29
vdiep    18401 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18402 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18403 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18404 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18405 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18406 99.8 1467.8 3962688 3948000 ?     R    00:54  22:28
vdiep    18407 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18408 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18409 99.9 1467.8 3962752 3948048 ?     R    00:54  22:29
vdiep    18412  0.0 1472.8 3979280 3961408 ?     S    00:54   0:00
vdiep    18413  0.0 1472.8 3979280 3961408 ?     S    00:54   0:00

>>>>See for crafty specint:
>>>
>>>After I saw they tested with 32 bit binaries, I'm not prepared to give them much
>>>credit.
>>>Frankly I want Eugene or Hyatt to produce the binary, needs to be done right or
>>>you lose 30% real quick. The pure C version is a lowest common denominator
>>>compile, it sucks basically.
>>
>>Hyatt i wouldn't trust producing a textfile with speedup numbers even, but aside
>>from that yes Eugene probably has some cool executables from crafty.
>>
>>>I also want to see other bitboard progs, I'm not sure Crafty is representative
>>
>>crafty is a very poor example of programming:
>>  - inline assembly everywhere
>
>For speed no doubt.
>
>>  - no nice loops but all written out black & white even
>
>For speed no doubt.
>
>>  - every piece written out
>
>For speed no doubt.
>
>>  - it doesn't compile very well with gcc or visual c++ thanks to
>>    all of that hacking and hyatt doesn't care frankly.
>
>It's a minor problem with the egtb.cpp file as far as I know, just a small fix
>to the makefile.
>
>>    To quote him: "My dual P4 xeon is what counts".
>
>Because that is what he develops on.
>I don't care about P4 xeons because I don't have access to one and because they
>are unaffordable for me.
>Is that a mistake?
>
>>>for all, my program is very different, for better or worse of course.
>>>-S.
>>
>>Let's hope you don't have the same mistakes. the bad example of crafty is really
>>that people start writing their own assembly. As if bitboards are fast anyway :)
>
>I can ask 64 questions in parallel in one AND, how many clocks does that take
>you? :)
>
>-S.
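
To make the closing "64 questions in one AND" remark and the BSF/BSR point concrete, here is a minimal C sketch under stated assumptions. Squares are numbered 0..63, and the function names (`attacked_enemies`, `lowest_square`, `extract_squares`) are hypothetical, not from Crafty or Diep. The bit scan is a plain portable loop standing in for a hardware instruction such as BSF; it makes no claim about what any particular CPU provides.

```c
#include <assert.h>
#include <stdint.h>

/* One AND answers, for all 64 squares at once, the question
   "is this square attacked AND occupied by an enemy piece?". */
static uint64_t attacked_enemies(uint64_t attacks, uint64_t enemy_pieces) {
    return attacks & enemy_pieces;
}

/* Portable stand-in for a hardware bit scan: index of the lowest
   set bit. The bitboard must be nonzero. */
static int lowest_square(uint64_t bb) {
    assert(bb != 0);
    int sq = 0;
    while ((bb & 1) == 0) {
        bb >>= 1;
        sq++;
    }
    return sq;
}

/* Turn the parallel answer back into individual squares.
   Writes square indices into out[] and returns how many were found. */
static int extract_squares(uint64_t bb, int out[64]) {
    int n = 0;
    while (bb != 0) {
        out[n++] = lowest_square(bb);
        bb &= bb - 1;  /* clear the lowest set bit */
    }
    return n;
}
```

The AND itself is a single operation on a 64-bit machine, but reading the answers out one square at a time still costs a scan per set bit, which is where the speed of the bit scan matters.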
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.