Author: Robert Hyatt
Date: 17:34:36 09/15/01
Go up one level in this thread
On September 15, 2001 at 14:30:40, Vincent Diepeveen wrote: >On September 15, 2001 at 11:14:36, Sune Fischer wrote: > >>Hi >> >>When I discovered that Crafty had some neat assembler rutines, I decided to test >>if they where faster than my humble rutines. >>I have up until now used a "raytracing" or boardtracing rutine to find the legal >>moves for the sliding pieces. And for the knights, pawns and king I simply check >>if the board has an allied piece on that square. >>It's a simple way of generating the legal moves. >>I still generate the attack bitboards to get the legal king moves though. >> >>I then tried to find all the moves directly from the bitboards by using one of >>Crafty's assember functions. >>This part of my program is only a minor part, but none the less it runs a full 8 >>percent slower than my previous raytracing algorithm!! >> >>This is what I have now: >> >>void FindMovesQueen(..loads of pointers...) >>{ >> int to_square; >> BITBOARD bb=(~allied.occupied) & allied.attack[board.id[from_square]]; >> >> while (bb) >> { >> to_square=63-FirstOne(bb); // get a bit > >the slow part is of course the 'firstone' function, though >it appears there is a trick on the K7 to speed it up quite a >bit in assembly. Nevertheless all the 64 bits overhead in total >slow down the thing. FirstOne() isn't particularly slow, using the bit-scan instructions that are very fast on PII and beyond. It certainly isn't much of a cost in the profile runs I do. > >> bb ^= mask[to_square]; // remove the bit >> >> movelist[++counter].from=from_square; >> movelist[counter].to=to_square; >> movelist[counter].piece=QUEEN; >> movelist[counter].capturedpiece=enemy.piece[board.id[to_square]]; > >this looks ugly. Using a pointer here would be way faster. > >> } >> return; >>} >> >>(movelist, counter, from_square etc. are passed in the argument) >> >>I know why it is slow too. >>First I have to form the bb, that's 2 bitboard operations. >>Then I need to run an algorithm, FirstOne from crafty, to find the first bit. >>Then I need to mask out that found bit. >>Both of these run several time pending on the while loop. >>Next is 4 lines I always have, no matter how I do it, so we can safely ignore >>those. >>The while loop is a conditional much like the if's I use when raytracing, so >>probably the if's and while almost cancel out. >> >>All in all I have added a lot of operations to my program, I am not surprized it >>is a lot slower. Given that the entire program suffer a slowdown by about 8.1 >>percent, I estimate that this technique is more than twice as slow as the >>raytracing. >>The upside of things is that I reduce my code by about 1000 lines or so, but is >>this worth 8 percent in speed? > >Why not speed it up a factor of 2 if not more (depending upon what >processor we talk about) and just get rid of all code and rewrite it >till you have about a line or 50 in total, including EP and castling? > >that's always faster than 1000 lines of C++ code. > >The problem of too big code is that when testing it a few million time >a second it might seem fast, but things change completely when they are >integral part of the program. After some years of programming your program >won't fit inside L1 cache anymore and then the suffering starts! > >Note that the small trace cache size of the P4 gives thoughts on thinking >too here. With DIEP where the ENGINE part takes many hundreds of kilobytes >of codesize from the L1+L2 caches, i definitely always go for the smallest >code. Where smallest code is defined as: > min( (whitecode + blackcode) , (general code) ) > >Sometimes writing things general is taking up more code than tot total >of writing it for black & white out (certain eval patterns for example). > >In case of move generation that isn't the case in DIEP, but the bitboard >code for example already has been written down out for every piece seperate. > >Bitboards are a big disaster here IMHO. > >>I know the use of 64 bit processors will give a nice boost to this method and I >>probably can't expect anything like that for my tracing function, but frankly >>I'm not sure it would be enough to catch up. >> >>Am I mistaken, are the big boys doing something else? >>Is there an even faster way? >>I need to know :D :D >> >>Thanks, >>Sune
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.