Computer Chess Club Archives


Subject: Re: Bitboard Serialization

Author: Anthony Cozzie

Date: 15:07:10 12/06/02


>>What I am going to try next (after I finish my 5 projects due by Thursday) is
>>doing two of these move generations in parallel, that is, doing the
>>bitboard->movelist for two knights in parallel, etc.
>>
>>anthony
>
>Hi Anthony,
>
>That's interesting, Walter's routine is really great.

I just realized my post made it sound like Walter's routine was at fault.  This
is just the benchmark of the loop-unrolled version vs. the non-unrolled
version, which just looks like:

while(bitboard != 0)
     add_to_buffer(lsb_walter(bitboard));
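
For anyone without Walter's routine at hand, here is a minimal sketch of what
that loop has to do. It assumes lsb_walter both returns the index of the lowest
set bit and clears that bit (otherwise the loop above would never end).
lsb_index and serialize_bitboard are hypothetical stand-ins written for this
illustration, not Walter's code, and the bit scan is a slow portable
placeholder:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for lsb_walter: index (0..63) of the lowest set bit.
   Slow portable placeholder; a real engine would use a much faster bit scan. */
static int lsb_index(uint64_t bb)
{
    int idx = 0;
    assert(bb != 0);               /* undefined for an empty bitboard */
    while ((bb & 1) == 0) {
        bb >>= 1;
        idx++;
    }
    return idx;
}

/* Writes the square index of every set bit into buffer, lowest first.
   Returns the number of squares written (buffer needs room for 64). */
static int serialize_bitboard(uint64_t bb, int *buffer)
{
    int n = 0;
    while (bb != 0) {
        buffer[n++] = lsb_index(bb);
        bb &= bb - 1;              /* clear the bit just reported */
    }
    return n;
}
```

The loop runs once per set bit, so on a sparsely populated bitboard it exits
after one or two iterations and any unrolling has little to work with.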


>
>Yes, the problems are the sparsely populated bitboards.

This is why I'm suggesting we look for parallelism in the move generator in a
different area, overlapping two of the "bitboard->movelist" functions.
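
A rough sketch of what I mean by overlapping, assuming two knights and a plain
from/to move list. move_t, lsb_index and serialize_two are names made up for
this example, and the bit scan is a slow portable placeholder. The point is
only that the two extract-and-clear chains do not depend on each other, so the
CPU is free to run them side by side:

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t from, to; } move_t;   /* hypothetical move layout */

/* Placeholder bit scan: index of the lowest set bit, bb must be nonzero. */
static int lsb_index(uint64_t bb)
{
    int idx = 0;
    assert(bb != 0);
    while ((bb & 1) == 0) { bb >>= 1; idx++; }
    return idx;
}

/* Serializes the target sets of two pieces in one interleaved loop.
   Returns the number of moves written to out. */
static int serialize_two(uint64_t ta, int from_a,
                         uint64_t tb, int from_b, move_t *out)
{
    int n = 0;

    /* While both sets have bits left, handle one bit from each per pass.
       The work on ta and the work on tb are independent of each other. */
    while (ta != 0 && tb != 0) {
        out[n].from     = (uint8_t)from_a;
        out[n].to       = (uint8_t)lsb_index(ta);
        out[n + 1].from = (uint8_t)from_b;
        out[n + 1].to   = (uint8_t)lsb_index(tb);
        n += 2;
        ta &= ta - 1;
        tb &= tb - 1;
    }

    /* At most one set is still non-empty; drain it the ordinary way. */
    {
        int from = (ta != 0) ? from_a : from_b;
        uint64_t rest = ta | tb;
        while (rest != 0) {
            out[n].from = (uint8_t)from;
            out[n].to   = (uint8_t)lsb_index(rest);
            n++;
            rest &= rest - 1;
        }
    }
    return n;
}
```

Whether this actually beats two separate loops depends on the bit scan and the
processor, so it would need to be benchmarked.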

>
>The MMX code handles the high and low halves of the board in parallel thanks to
>SIMD instructions. The Athlon (XP?) executes up to four register-independent
>MMX instructions simultaneously. Therefore step 4.) is really required.
>
>Another idea is to do some BxQ and BxR in parallel without any conditional
>jumps, except an initial "if ( bb != 0 )", or maybe not even that.
>
>Similar SSE2-instructions are also available for P4 and Hammer of course.
>But Athlon is about twice as fast as P4 with this independent mmx-stuff.
>
>btw. maximum iteration count on a half-board per piece:
>Piece   Total  (Captures)
>Rook     10    (4)
>Bishop    7    (4)
>Queen    17    (8)
>Knight    6    (6)
>King      8    (8!)
>
>Regards,
>Gerd
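
One possible reading of the "BxQ and BxR without any conditional jumps" idea,
as a plain C sketch rather than MMX. This is an illustration only, not Gerd's
code. All names here (move_t, lsb_or_64, append_one, bishop_captures) are
hypothetical, and it assumes at most one attacked victim per piece type. The
move is always stored, and the count only advances when a victim exists:

```c
#include <stdint.h>

typedef struct { uint8_t from, to; } move_t;   /* hypothetical move layout */

/* Standard SWAR population count (number of set bits). */
static int popcount64(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    return (int)((x * 0x0101010101010101ULL) >> 56);
}

/* Lowest set bit index without a loop: 0..63, or 64 for an empty set. */
static int lsb_or_64(uint64_t bb)
{
    return popcount64((bb & (0 - bb)) - 1);
}

/* Stores a move unconditionally; the slot is only kept if hits is non-empty.
   out must have one spare slot beyond the moves actually generated. */
static int append_one(move_t *out, int n, int from, uint64_t hits)
{
    out[n].from = (uint8_t)from;
    out[n].to   = (uint8_t)lsb_or_64(hits);
    return n + (hits != 0);       /* comparison yields 0 or 1 */
}

/* Bishop takes queen, then bishop takes rook; no loop, no if statement. */
static int bishop_captures(uint64_t attacks, uint64_t queens, uint64_t rooks,
                           int from, move_t *out)
{
    int n = 0;
    n = append_one(out, n, from, attacks & queens);
    n = append_one(out, n, from, attacks & rooks);
    return n;
}
```

Whether the compiler really emits jump-free code for the (hits != 0) comparison
has to be checked in the generated assembly.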





