Author: Anthony Cozzie
Date: 04:56:32 06/24/04
On June 23, 2004 at 18:30:57, Russell Reagan wrote:

>I am curious about a few design choices in Crafty. Some of the questions are
>general questions that anyone who is knowledgable could answer, so I ask this
>publicly instead of via email to Dr. Hyatt.
>
>First issue. Crafty is written with a lot of color-specific code, like:
>
>if (wtm) {
>  // ...
>}
>else {
>  // ...
>}
>
>This is what I think. The downside to this approach is that you double the size
>of the code, which is not cache friendly. This should be a bigger problem in a
>rotated bitboard engine where there are already cache issues.

Bob's code looks like this in assembly:

if(wtm) {
  xor eax+0x334, ebx     //take memory at eax+0x334 and xor it with ebx
  ...

Your code would look like

  xor eax + 8*ecx + 0x334, ebx
              ^--- Index register

Register pressure is one of the biggest problems with bitboards on x86. Just
having 1 bitboard in registers requires almost 1/3 of the registers. This is one
of the reasons opteron kicks so much ass.

>Another potential downside is that you introduce an extra branch. However, I
>think that this branch would be easily predictable, since the side to move will
>alternate back and forth, and so the branch decision will alternate.

Actually it won't, but I'll let you think about that :)

>So it seems that as long as you don't overuse this method and ruin the cache,
>this is a good method as far as speed is concerned. However, I think it is nicer
>to have a single function that works for both colors. It is less to maintain,
>and less error prone.
>
>Second issue. Crafty uses a lot of switch statements, using special code for
>each case, increasing the code size. Same issues. Could be bad on cache, harder
>to maintain. Crafty uses a lot of switch statements to determine the type of a
>piece and update the appropriate bitboard. What about having an array of
>bitboards, indexed by the type of the piece? There are no branches involved, and
>the code is much smaller.

Switch is _fast_.
Again, in assembly:

  and eax, 0xF
  jmp table[eax]

Bitboard engines are not as cache limited as you think. Zappa fits completely in
L2 on the opteron (code + tables).

Modern processors are so complex that it is very difficult to know what will
make your code faster. All you can do is try it and run it, and use a profiler
to guide you to the slower places.

>I'd like to know what people think about these design choices. Obviously they
>work well for Dr. Hyatt, but I wonder if alternatives would be better choices
>that would be faster or less error prone.
>
>Thanks,
>Russell
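For anyone who wants to see the two styles from this exchange side by side, here is a rough C sketch. Everything in it is made up for illustration: BranchyBoard, IndexedBoard, toggle_branchy and toggle_indexed are hypothetical names, not code from Crafty or Zappa, and the board layouts are assumptions. The first function uses a color branch plus a switch on the piece type, so every store goes to a fixed offset. The second uses an array indexed by color and piece, so every store needs an index register.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

enum { BLACK = 0, WHITE = 1 };
enum { PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING, PIECE_TYPES };

/* Hypothetical layout 1: one named bitboard per color and piece type. */
typedef struct {
    Bitboard w_pawn, w_knight, w_bishop, w_rook, w_queen, w_king;
    Bitboard b_pawn, b_knight, b_bishop, b_rook, b_queen, b_king;
} BranchyBoard;

/* Color branch plus switch: more code, but each xor hits a fixed offset. */
static void toggle_branchy(BranchyBoard *b, int wtm, int piece, int sq)
{
    assert(sq >= 0 && sq < 64);
    Bitboard mask = (Bitboard)1 << sq;
    if (wtm) {
        switch (piece) {
        case PAWN:   b->w_pawn   ^= mask; break;
        case KNIGHT: b->w_knight ^= mask; break;
        case BISHOP: b->w_bishop ^= mask; break;
        case ROOK:   b->w_rook   ^= mask; break;
        case QUEEN:  b->w_queen  ^= mask; break;
        case KING:   b->w_king   ^= mask; break;
        }
    } else {
        switch (piece) {
        case PAWN:   b->b_pawn   ^= mask; break;
        case KNIGHT: b->b_knight ^= mask; break;
        case BISHOP: b->b_bishop ^= mask; break;
        case ROOK:   b->b_rook   ^= mask; break;
        case QUEEN:  b->b_queen  ^= mask; break;
        case KING:   b->b_king   ^= mask; break;
        }
    }
}

/* Hypothetical layout 2: a single array indexed by color and piece type. */
typedef struct {
    Bitboard pieces[2][PIECE_TYPES];
} IndexedBoard;

/* No branch and less code, but the address is computed from two indices. */
static void toggle_indexed(IndexedBoard *b, int wtm, int piece, int sq)
{
    assert(sq >= 0 && sq < 64);
    assert(piece >= 0 && piece < PIECE_TYPES);
    b->pieces[wtm != 0][piece] ^= (Bitboard)1 << sq;
}
```

Both functions do the same logical update, so they are easy to swap and time against each other. Which one wins depends on the compiler and the CPU, which is the point made above: try it, run it, and profile.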
Last modified: Thu, 15 Apr 21 08:11:13 -0700