Computer Chess Club Archives


Subject: Re: Design choices in Crafty

Author: Anthony Cozzie

Date: 04:56:32 06/24/04


On June 23, 2004 at 18:30:57, Russell Reagan wrote:

>I am curious about a few design choices in Crafty. Some of the questions are
>general questions that anyone who is knowledgeable could answer, so I ask this
>publicly instead of via email to Dr. Hyatt.
>
>First issue. Crafty is written with a lot of color-specific code, like:
>
>if (wtm) {
>    // ...
>}
>else {
>    // ...
>}
>
>This is what I think. The downside to this approach is that you double the size
>of the code, which is not cache friendly. This should be a bigger problem in a
>rotated bitboard engine where there are already cache issues.

Bob's code looks like this in assembly:

if(wtm)
{
    xor [eax+0x334], ebx  //take memory at eax+0x334 and xor it with ebx
    ...
}

Your code would look like

    xor [eax + 8*ecx + 0x334], ebx
                 ^---  index register (holds the side to move)

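That index register is not free.  Register pressure is one of the biggest
problems with bitboards on 32-bit x86.  A 64-bit bitboard needs two 32-bit
registers, so just having 1 bitboard in registers requires almost 1/3 of the
usable registers.  This is one of the reasons the Opteron, with sixteen 64-bit
registers, kicks so much ass.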
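
The two addressing forms above correspond roughly to the following C. This is a minimal sketch, not Crafty's actual code: the Position struct, the function names and the 0 = black / 1 = white convention are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

/* Hypothetical layout: one pawn bitboard per side, 0 = black, 1 = white. */
typedef struct {
    Bitboard pawns[2];
} Position;

/* Color-specific style: each arm uses a constant index, so the compiler
   can address the bitboard at a fixed offset from the struct pointer. */
void toggle_pawn_branchy(Position *p, int wtm, int square)
{
    assert(square >= 0 && square < 64);
    Bitboard mask = (Bitboard)1 << square;
    if (wtm) {
        p->pawns[1] ^= mask;
    } else {
        p->pawns[0] ^= mask;
    }
}

/* Color-indexed style: a single code path, but the side is a run-time
   index that has to be available when the address is formed. */
void toggle_pawn_indexed(Position *p, int side, int square)
{
    assert(side == 0 || side == 1);
    assert(square >= 0 && square < 64);
    p->pawns[side] ^= (Bitboard)1 << square;
}
```

Both functions produce the same result; what differs is the code the compiler is likely to emit, and an optimizing compiler is free to turn either form into the other.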

>Another potential downside is that you introduce an extra branch. However, I
>think that this branch would be easily predictable, since the side to move will
>alternate back and forth, and so the branch decision will alternate.

Actually it won't, but I'll let you think about that :)
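
One possible reading of this hint: in a depth-first tree search, a given branch is reached at many sibling nodes in a row, all with the same side to move, so its outcomes do not strictly alternate. The toy search below counts this. It is only a sketch under simplifying assumptions (fixed branching factor, no pruning, one hypothetical `if (wtm)` executed on entry to every node); BranchStats, visit and search are made-up names.

```c
#include <assert.h>

/* Hypothetical counters for one `if (wtm)` branch executed once per node. */
typedef struct {
    int  last;     /* previous outcome, -1 before the first visit */
    long flips;    /* visits whose outcome differed from the previous visit */
    long repeats;  /* visits whose outcome matched the previous visit */
} BranchStats;

/* Record one execution of the branch with the given side to move. */
void visit(BranchStats *s, int wtm)
{
    if (s->last >= 0) {
        if (wtm != s->last)
            s->flips++;
        else
            s->repeats++;
    }
    s->last = wtm;
}

/* Toy depth-first search: every node has `width` children, no pruning. */
void search(BranchStats *s, int wtm, int depth, int width)
{
    assert(depth >= 0 && width > 0);
    visit(s, wtm);
    if (depth == 0)
        return;
    for (int i = 0; i < width; i++)
        search(s, !wtm, depth - 1, width);
}
```

With depth 2 and width 3 the branch sees the sequence 1,0,1,1,1,0,1,1,1,0,1,1,1, which is 6 flips and 6 repeats rather than a clean alternation. Wider trees have more sibling leaves per parent, so the share of repeats grows.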

>So it seems that as long as you don't overuse this method and ruin the cache,
>this is a good method as far as speed is concerned. However, I think it is nicer
>to have a single function that works for both colors. It is less to maintain,
>and less error prone.
>
>Second issue. Crafty uses a lot of switch statements, using special code for
>each case, increasing the code size. Same issues. Could be bad on cache, harder
>to maintain. Crafty uses a lot of switch statements to determine the type of a
>piece and update the appropriate bitboard. What about having an array of
>bitboards, indexed by the type of the piece? There are no branches involved, and
>the code is much smaller.

Switch is _fast_.  Again, in assembly:

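  and eax, 0xF          //mask the piece code down to a table index
  jmp [table + 4*eax]   //indirect jump through a table of 4-byte addresses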
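
For comparison, here are the two styles under discussion side by side in C. This is a sketch only: the piece codes, the Board struct and the function names are hypothetical and are not taken from Crafty.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

/* Hypothetical piece codes; Crafty's actual encoding may differ. */
enum { PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING, PIECE_TYPES };

typedef struct {
    Bitboard pawns, knights, bishops, rooks, queens, kings; /* switch style */
    Bitboard by_type[PIECE_TYPES];                          /* array style  */
} Board;

/* Switch style: dense case labels typically compile to a jump table. */
void toggle_switch(Board *b, int piece, int square)
{
    assert(square >= 0 && square < 64);
    Bitboard mask = (Bitboard)1 << square;
    switch (piece) {
    case PAWN:   b->pawns   ^= mask; break;
    case KNIGHT: b->knights ^= mask; break;
    case BISHOP: b->bishops ^= mask; break;
    case ROOK:   b->rooks   ^= mask; break;
    case QUEEN:  b->queens  ^= mask; break;
    case KING:   b->kings   ^= mask; break;
    default:     assert(!"unknown piece type");
    }
}

/* Array style: no branch, the piece type is used directly as an index. */
void toggle_array(Board *b, int piece, int square)
{
    assert(piece >= 0 && piece < PIECE_TYPES);
    assert(square >= 0 && square < 64);
    b->by_type[piece] ^= (Bitboard)1 << square;
}
```

The array form is shorter when every case does the same thing. The switch form lets each case carry piece-specific work beyond the single xor.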

Bitboard engines are not as cache-limited as you think.  Zappa fits completely
in L2 on the Opteron (code + tables).

Modern processors are so complex that it is very difficult to know what will
make your code faster.  All you can do is try it and run it, and use a profiler
to guide you to the slower places.
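
As a starting point for "try it and run it", a timing harness can be as small as the sketch below. Everything in it is an assumption made for illustration: run_branchy and run_indexed are made-up stand-ins for two variants of the same update, and clock() from the C standard library measures processor time only coarsely.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t Bitboard;

/* Variant A (hypothetical): pick the side with an explicit branch. */
Bitboard run_branchy(long iterations)
{
    Bitboard side[2] = {0, 0};
    for (long i = 0; i < iterations; i++) {
        Bitboard mask = (Bitboard)1 << (i & 63);
        if (i & 1)
            side[1] ^= mask;
        else
            side[0] ^= mask;
    }
    return side[0] + 3 * side[1];   /* checksum so the work is not discarded */
}

/* Variant B (hypothetical): pick the side by indexing. */
Bitboard run_indexed(long iterations)
{
    Bitboard side[2] = {0, 0};
    for (long i = 0; i < iterations; i++)
        side[i & 1] ^= (Bitboard)1 << (i & 63);
    return side[0] + 3 * side[1];
}

/* Run one variant, store its checksum, and return the CPU seconds used. */
double seconds_for(Bitboard (*fn)(long), long iterations, Bitboard *result)
{
    assert(fn != 0 && result != 0);
    clock_t start = clock();
    *result = fn(iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A synthetic loop like this has its own branch pattern and cache footprint, and the compiler may well compile both variants to the same code, so numbers from it say little about a real engine. Timing the engine itself on fixed test positions, or running it under a profiler, is more trustworthy.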

>I'd like to know what people think about these design choices. Obviously they
>work well for Dr. Hyatt, but I wonder if alternatives would be better choices
>that would be faster or less error prone.
>
>Thanks,
>Russell



Last modified: Thu, 15 Apr 21 08:11:13 -0700
