Author: Anthony Cozzie
Date: 04:56:32 06/24/04
On June 23, 2004 at 18:30:57, Russell Reagan wrote:
>I am curious about a few design choices in Crafty. Some of the questions are
>general questions that anyone who is knowledgeable could answer, so I ask this
>publicly instead of via email to Dr. Hyatt.
>
>First issue. Crafty is written with a lot of color-specific code, like:
>
>if (wtm) {
> // ...
>}
>else {
> // ...
>}
>
>This is what I think. The downside to this approach is that you double the size
>of the code, which is not cache friendly. This should be a bigger problem in a
>rotated bitboard engine where there are already cache issues.
Bob's code looks like this in assembly:
if(wtm)
{
  xor [eax+0x334], ebx   //take memory at eax+0x334 and xor it with ebx
  ...
Your code would look like
  xor [eax + 8*ecx + 0x334], ebx
               ^--- index register
Register pressure is one of the biggest problems with bitboards on 32-bit x86.
Holding just one 64-bit bitboard in registers takes two of them, almost 1/3 of
the usable registers. This is one of the reasons the Opteron, with 64-bit
registers and twice as many of them, kicks so much ass.
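For readers who prefer C to assembly, here is a rough sketch of the two styles being compared. The struct layouts and function names are made up for illustration and are not taken from Crafty. The point is only that the color-specific version addresses a field at a fixed offset, while the color-generic version has to keep the color in a register to compute the address.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

enum { WHITE = 0, BLACK = 1 };

/* Hypothetical layouts, not Crafty's actual structures. */
typedef struct {
    Bitboard white_pawns;   /* named fields live at constant offsets */
    Bitboard black_pawns;
} PositionNamed;

typedef struct {
    Bitboard pawns[2];      /* same data, indexed by color */
} PositionIndexed;

/* Color-specific style: each arm uses a constant offset from p,
   so the color never has to occupy an index register. */
static void toggle_pawn_branchy(PositionNamed *p, int wtm, Bitboard mask)
{
    if (wtm) {
        p->white_pawns ^= mask;
    } else {
        p->black_pawns ^= mask;
    }
}

/* Color-generic style: one code path, but the address is
   base + 8*color + offset, which needs the color in a register. */
static void toggle_pawn_indexed(PositionIndexed *p, int color, Bitboard mask)
{
    p->pawns[color] ^= mask;
}

/* Usage: both styles must leave the boards in the same state. */
static void check_variants_agree(void)
{
    PositionNamed a = { 0, 0 };
    PositionIndexed b = { { 0, 0 } };

    toggle_pawn_branchy(&a, 1, 0x100ULL);      /* white to move */
    toggle_pawn_indexed(&b, WHITE, 0x100ULL);
    toggle_pawn_branchy(&a, 0, 0x2ULL);        /* black to move */
    toggle_pawn_indexed(&b, BLACK, 0x2ULL);

    assert(a.white_pawns == b.pawns[WHITE]);
    assert(a.black_pawns == b.pawns[BLACK]);
}
```

What the compiler actually emits for either version depends on the compiler, flags, and target, so treat this as a picture of the trade-off rather than a prediction.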
>Another potential downside is that you introduce an extra branch. However, I
>think that this branch would be easily predictable, since the side to move will
>alternate back and forth, and so the branch decision will alternate.
Actually it won't, but I'll let you think about that :)
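One way to explore the question is with a toy experiment. The sketch below is an assumption about what the hint points at, not something spelled out in the post: it walks a fixed-width tree the way a recursive search would, treats every node visit as one execution of a hypothetical `if (wtm)` branch, and counts how often the side to move matches the previous visit instead of flipping. All names here are invented for the example.

```c
#include <assert.h>

/* Counters for a single hypothetical `if (wtm)` branch site. */
typedef struct {
    int  last_wtm;   /* side to move at the previous visit, -1 = none yet */
    long same;       /* visits where wtm matched the previous visit */
    long flipped;    /* visits where wtm differed from the previous visit */
} BranchStats;

/* Record one execution of the branch. */
static void visit(BranchStats *s, int wtm)
{
    if (s->last_wtm >= 0) {
        if (wtm == s->last_wtm)
            s->same++;
        else
            s->flipped++;
    }
    s->last_wtm = wtm;
}

/* Toy search: visit this node, then recurse into `width` children
   with the other side to move. No pruning, no real moves. */
static void search(BranchStats *s, int wtm, int depth, int width)
{
    visit(s, wtm);
    if (depth == 0)
        return;
    for (int i = 0; i < width; i++)
        search(s, !wtm, depth - 1, width);
}

/* Usage: run the toy search from a white-to-move root. */
static BranchStats run_toy_search(int depth, int width)
{
    BranchStats s = { -1, 0, 0 };
    search(&s, 1, depth, width);
    assert(s.same >= 0 && s.flipped >= 0);
    return s;
}
```

Calling `run_toy_search` with a few depths and widths and comparing `same` against `flipped` shows whether the branch really sees a strict white, black, white, black sequence. Sibling nodes share the same side to move, which is worth keeping in mind when reading the counts.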
>So it seems that as long as you don't overuse this method and ruin the cache,
>this is a good method as far as speed is concerned. However, I think it is nicer
>to have a single function that works for both colors. It is less to maintain,
>and less error prone.
>
>Second issue. Crafty uses a lot of switch statements, using special code for
>each case, increasing the code size. Same issues. Could be bad on cache, harder
>to maintain. Crafty uses a lot of switch statements to determine the type of a
>piece and update the appropriate bitboard. What about having an array of
>bitboards, indexed by the type of the piece? There are no branches involved, and
>the code is much smaller.
Switch is _fast_. Again, in assembly:
  and eax, 0xF
  jmp table[4*eax]   //jump through a table of code addresses
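In C terms, the two alternatives from the question look roughly like the sketch below. The piece codes, struct, and function names are hypothetical and not Crafty's own. A dense `switch` like this is commonly compiled to a jump table of the kind shown above, though that is up to the compiler.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

/* Hypothetical piece codes; Crafty's own encoding may differ. */
enum { PAWN = 1, KNIGHT, BISHOP, ROOK, QUEEN, KING, PIECE_TYPES };

typedef struct {
    /* Named bitboards, updated through the switch version. */
    Bitboard pawns, knights, bishops, rooks, queens, kings;
    /* The same information as an array indexed by piece type
       (slot 0 is unused). */
    Bitboard by_type[PIECE_TYPES];
} Board;

/* Switch style: one case per piece type, each with its own code. */
static void toggle_switch(Board *b, int piece, Bitboard mask)
{
    switch (piece) {
    case PAWN:   b->pawns   ^= mask; break;
    case KNIGHT: b->knights ^= mask; break;
    case BISHOP: b->bishops ^= mask; break;
    case ROOK:   b->rooks   ^= mask; break;
    case QUEEN:  b->queens  ^= mask; break;
    case KING:   b->kings   ^= mask; break;
    default:     break;     /* ignore unknown codes */
    }
}

/* Array style: no branch, one indexed memory update.
   The caller must pass a valid piece code. */
static void toggle_indexed(Board *b, int piece, Bitboard mask)
{
    b->by_type[piece] ^= mask;
}

/* Usage: both styles should track the same squares. */
static void check_styles_agree(void)
{
    Board b = { 0 };
    toggle_switch(&b, ROOK, 0x81ULL);
    toggle_indexed(&b, ROOK, 0x81ULL);
    assert(b.rooks == b.by_type[ROOK]);
}
```

Which one wins on speed is exactly the kind of thing that has to be measured rather than guessed.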
Bitboard engines are not as cache limited as you think. Zappa fits completely
in L2 on the Opteron (code + tables).
Modern processors are so complex that it is very difficult to know what will
make your code faster. All you can do is try it and run it, and use a profiler
to guide you to the slower places.
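A minimal sketch of the "try it and run it" approach, using only the standard C `clock()` timer. Everything here is made up for the example, and a loop this small is not representative of a real engine: the compiler may optimize the two variants into nearly the same code, so a profiler run on the actual program is still the better guide.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t Bitboard;

/* Index 0 = white, index 1 = black in both arrays. */
static Bitboard boards_branchy[2];
static Bitboard boards_indexed[2];

/* Variant 1: pick the board with an if/else on the side to move. */
static void run_branchy(long iterations)
{
    for (long i = 0; i < iterations; i++) {
        int wtm = (int)(i & 1);
        Bitboard mask = 1ULL << (i & 63);
        if (wtm)
            boards_branchy[0] ^= mask;
        else
            boards_branchy[1] ^= mask;
    }
}

/* Variant 2: pick the board by computing an array index. */
static void run_indexed(long iterations)
{
    for (long i = 0; i < iterations; i++) {
        int wtm = (int)(i & 1);
        Bitboard mask = 1ULL << (i & 63);
        boards_indexed[wtm ^ 1] ^= mask;
    }
}

/* Time one variant; returns elapsed CPU seconds as reported by clock(). */
static double time_variant(void (*fn)(long), long iterations)
{
    clock_t start = clock();
    fn(iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Usage: time both variants and confirm they did the same work. */
static void compare_variants(long iterations, double *t_branchy, double *t_indexed)
{
    *t_branchy = time_variant(run_branchy, iterations);
    *t_indexed = time_variant(run_indexed, iterations);
    assert(boards_branchy[0] == boards_indexed[0]);
    assert(boards_branchy[1] == boards_indexed[1]);
}
```

Checking that both variants end in the same state guards against timing code that is fast only because it is wrong.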
>I'd like to know what people think about these design choices. Obviously they
>work well for Dr. Hyatt, but I wonder if alternatives would be better choices
>that would be faster or less error prone.
>
>Thanks,
>Russell