Author: Robert Hyatt
Date: 20:37:18 08/08/03
On August 07, 2003 at 23:02:52, Vincent Diepeveen wrote:

>On August 07, 2003 at 09:20:35, Sune Fischer wrote:
>
>>On August 07, 2003 at 08:46:43, Vincent Diepeveen wrote:
>>
>>>>You seem to have missed one crucial point.
>>>>Crafty is 64 bit prog, which means it's slow on 32 bit, even I have found that
>>>>doing a lookup is faster than shifting, I simply never do 1<<sq, I use a table
>>>
>>>that's 33% at most. Just look to what the alpha 21264c scores versus similar
>>>architecture K7. 33% difference about.
>>
>>Actually it's more like 300%, but since it's only a fraction of the program
>>overall it could be 33%. That would be just a guess though.
>>
>>>Itanium is a new generation complex cpu. too complex for bitboarders and it's
>>>latency to main memory isn't very impressive which is bad luck for you.
>>
>>Not really, I could never afford a Itanium so it's completely irrelevant for me.
>>
>>>what you need is fast ram.
>>>
>>>Ever measured the difference between RAM speeds for your thing Sune?
>>>
>>>You should.
>>
>>No I don't care about these artificial tests, you need to do realitic
>>measurements, not run things in small tight loops, it's flawed.
>>
>>I remember you once suggested to run a small cache efficent pawn table, I tried
>>that - it ran slower!!
>>
>>Who cares if some theoretical argument or some artificial test shows it's
>>faster, when reality shows it's just slower.
>>
>>>So measure LATENCY differences. If a machine X has 220 ns latency versus some
>>>other machine has 400 ns latency. Just measure what it speeds up for you.
>>
>>Don't think too much, just do the real tests. You can't figure out the result
>>anyway, too many factors makes the equation is too complex.
>>
>>>>for that. Little things like that are all over the program, when I remove this
>>>>and go pure 64 bit I do think a factor 2 clock for clock is reachable.
>>>
>>>no. perhaps 33% for just going from 32 to 64 bits.
>>>You're really underestimating
>>>how fast the overhead runs at the K7 here.
>>
>>Pointless to argue this as we're just guessing both of us, but we'll see
>>eventually.
>>
>>>In those instructions there is very little branch mis predictions little
>>>register stalls etc. It's all just a few more instructions code that 64 bits at
>>>32 bits processors.
>>
>>Right, I can hardly see an advantage for the Opteron in running 64 bit code,
>>LOL.
>>
>>>the datastructure itself however is a slow thing when compared to non-bitboard.
>>>that's however a different discussion.
>>
>>How is it slow?
>>What kind of nps do you get relative to Crafty?
>>You have the slowest of them all AFAIK, so I don't know why you keep mouthing
>>off like this.
>
>i generate moves 2.2 times faster than crafty at 32 bits processors to just name
>one.

OK, the next time we enter a move-generation contest, you will win. However,
chess is more than just generating moves... Move generation is way less than
10% of my total execution time. So long as that is true, you can whip me at
move generation all you want, but when the day is done, and the _game_ is over,
the result won't be determined by how fast you can generate moves. There's more
to the game than that.

>
>At that itanium2 you care shit about, i'm still generating moves faster
>than crafty.
>
>note that itanium as far as i'm told has no direct BSF/BSR instruction, like a
>real 64 bits cpu has no need for anyway. that opteron has one is just a service
>to hyatt which they should not make. Bob hasn't bought one yet AFAIK.

You are an absolute idiot. The Cray has a similar set of instructions, _not_
because of "bob" but because of the crypto guys that demanded it. Get real.
Get a clue.

>
>didn't read his last postings here. no time for that.

Of course not, anything that contradicts your mad ramblings is "junk". From Me.
From Nalimov. Etc.

>
>oh about that beloved itanium2 you dislike.
>here is diep running a short 16 cpu
>batch at it:
>
>vdiep 18394 99.6 1472.8 3979280 3961408 ? R 00:54 22:26
>vdiep 18395 99.3 1467.8 3962752 3948048 ? R 00:54 22:20
>vdiep 18396 99.9 1467.8 3962688 3948000 ? R 00:54 22:29
>vdiep 18397 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18398 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18399 99.9 1467.8 3962688 3948000 ? R 00:54 22:29
>vdiep 18400 99.9 1467.8 3962688 3948000 ? R 00:54 22:29
>vdiep 18401 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18402 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18403 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18404 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18405 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18406 99.8 1467.8 3962688 3948000 ? R 00:54 22:28
>vdiep 18407 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18408 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18409 99.9 1467.8 3962752 3948048 ? R 00:54 22:29
>vdiep 18412 0.0 1472.8 3979280 3961408 ? S 00:54 0:00
>vdiep 18413 0.0 1472.8 3979280 3961408 ? S 00:54 0:00
>
>
>
>>>>>See for crafty specint:
>>>>
>>>>After I saw they tested with 32 bit binaries, I'm not prepared to give them much
>>>>credit.
>>>>Frankly I want Eugene or Hyatt to produce the binary, needs to be done right or
>>>>you lose 30% real quick. The pure C version is a lowest common denominator
>>>>compile, it sucks basicly.
>>>
>>>Hyatt i wouldn't trust producing a textfile with speedup numbers even, but aside
>>>from that yes Eugene probably has some cool executables from crafty.
>>>
>>>>I also want to see other bitboard progs, I'm not sure Crafty is representative
>>>
>>>crafty is very poor example of programming:
>>> - inline assembly everywhere
>>
>>For speed no doubt.
>>
>>> - no nice loops but all written out black & white even
>>
>>For speed no doubt.
>>
>>> - every piece written out
>>
>>For speed no doubt.
>>
>>> - it doesn't compile very well with gcc or visual c++ thanks to
>>>   all of that hacking and hyatt doesn't care frankly.
>>
>>It's a minor problem with the egtb.cpp file as far as I know, just a small fix
>>to the makefile.
>>
>>> To quote him: "My dual P4 xeon is what counts".
>>
>>Because that is what he develops on.
>>I don't care about P4 xeons because I don't have access to one and because they
>>are unaffordable for me.
>>Is that a mistake?
>>
>>>>for all, my program is very different, for better or worse of course.
>>>>-S.
>>>
>>>Let's hope you don't have the same mistakes. the bad example of crafty is really
>>>that people start writing their own assembly. As if bitboards are fast anyway :)
>>
>>I can ask 64 question in parallel in one AND, how many clocks does that take
>>you? :)
>>
>>-S.
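
For readers following the bitboard side of the argument, the three ideas quoted above (a table lookup instead of 1<<sq, "64 questions in one AND", and a BSF-style bit scan) can be sketched roughly as below. This is only an illustration, not code from Crafty or any other program in the thread: the names Bitboard, set_mask, init_set_mask, square_mask, any_common_square and first_one are made up for the example, the bit 0 = a1 square mapping is an assumption, and the bit scan is a plain portable loop standing in for a hardware instruction.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per square. Bit 0 = a1 is an assumed mapping for this sketch. */
typedef uint64_t Bitboard;

/* Hypothetical lookup table: set_mask[sq] holds the value of 1 << sq. */
static Bitboard set_mask[64];

/* Fill the table once at startup. */
static void init_set_mask(void) {
    for (int sq = 0; sq < 64; sq++)
        set_mask[sq] = (Bitboard)1 << sq;
}

/* Single-square mask via the table: a memory load instead of a 64-bit
   shift, which a 32-bit CPU has to build from several instructions. */
static Bitboard square_mask(int sq) {
    assert(sq >= 0 && sq < 64);
    return set_mask[sq];
}

/* One AND compares all 64 squares at once. A nonzero result means at
   least one square belongs to both sets (say, attacks and targets). */
static int any_common_square(Bitboard a, Bitboard b) {
    return (a & b) != 0;
}

/* Portable stand-in for a BSF-style instruction: index of the lowest
   set bit. The argument must be nonzero. */
static int first_one(Bitboard bb) {
    assert(bb != 0);
    int sq = 0;
    while ((bb & 1) == 0) {
        bb >>= 1;
        sq++;
    }
    return sq;
}
```

After calling init_set_mask(), something like any_common_square(attacks, targets) answers "is any target attacked?" in one operation, and first_one(attacks & targets) picks out one such square. Whether the table really beats the shift on a given 32-bit machine is exactly what the thread is arguing about, so it has to be measured in the real program.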
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.