Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: kogge-stone, rotated bitboards, "raytracing" - on 32 or 64-bit platforms

Author: Russell Reagan

Date: 16:32:27 09/25/04

Go up one level in this thread


On September 25, 2004 at 06:21:13, martin fierz wrote:

>1) does anybody here have an idea how these 3 ways of generating attacks for
>sliders compare speedwise for both 32-bit and 64-bit processors? (reason: if
>kogge-stone will be comparable to rotated bitboards, i'd rather not bother to
>implement rotated bitboards, and if both are not much better than my current
>code, i'd also rather not bother with a rewrite).


I'm going to ramble here, so just ask for clarification when I quit making sense
:-)

I am not sure there is one clear answer. I think which one works best for you
will depend how you use each of them. For instance, assume you implement rotated
bitboard attack generation functions such as these:

typedef unsigned long long BitBoard;

BitBoard RookAttacks (int square);
Bitboard BishopAttacks (int square);
// etc.

Then you try to replace those functions with Kogge-Stone versions. The
Kogge-Stone version will run quite a bit slower if you just "drop in" the
Kogge-Stone attack generation to replace rotated bitboard attack generation.
Just as "thinking bitboards" will require you to handle things differently than
an 0x88 board representation, there are some differences in the way you "think
rotated bitboards" and "think Kogge-Stone bitboards".

I tried to use Kogge-Stone attack generation in a similar way to rotated
bitboards, and the Kogge-Stone version was about half the speed of Crafty at
calculating perft on a 32-bit machine (after much help from Dann Corbit).
However, I think if I started over from scratch using the Kogge-Stone stuff, I
could produce something better, because I would be able to "think Kogge-Stone"
instead of "thinking rotated bitboards".

Kogge-Stone is, IMO, a further extreme on the bitboard spectrum. For instance,
even the fastest rotated bitboard implementations are not faster than the
fastest 0x88-style implementations for raw movegen/make/undo, but bitboards
allow you to express more complex evaluation ideas more efficiently than an
0x88-style implementation (however, it is debatable whether such complex ideas
are necessary). So overall rotated bitboards tend to be about even with
0x88-style approaches on 32-bit hardware. I think Kogge-Stone may be an
amplification of that same concept. Even slower at movegen/make/undo, but
offering more expressive ability in evaluation.

Gerd has shown that if you are willing to dive into writing your own MMX
assembly (making use of the 64-bit MMX registers on 32-bit machines), that
Kogge-Stone can run at the same speed as rotated bitboard implementations. I
think that if you used Kogge-Stone correctly, you could get pretty close to
rotated bitboard speed without using assembly. Jumping to 64-bit hardware may be
even more beneficial to Kogge-Stone than rotated bitboards.

I think the main problem that I (and Dann) had with getting the Kogge-Stone
stuff to go faster was that the attack generation routines were destructive to
the registers (another reason why KoggeStone may benefit from 64-bit hardware
more than rotated bitboards: more registers, more data per register). So when
you are doing some task (say, generating moves) and then you generate the
attacks using Kogge-Stone, you destroy whatever was in the registers. Rotated
bitboard attack generation isn't quite so destructive. Therefore, one thing you
might try is to do the KoggeStone stuff all at once when you can.

For example, some pseudocode:

1. Move generation for rooks and queens using KoggeStone

get bitboard of rooks and queens
  generate upward attacks for rooks and queens (destroys register contents)
  generate moves from upward attacks
  generate downward attacks for rooks and queens (destroys register contents)
  generate moves from downward attacks
  // handle the other directions

2. Move generation for rooks and queens using rotated bitboards

get bitboard of rooks and queens
for each occupied square
  generate attacks from that square
  generate moves from attacks

The KoggeStone version is like driving down the road and you have to stop at a
stop sign at every intersection. Go, stop, go, stop... slow. It would be better
if you could do something like this.

generate upward attacks (destroys register contents)
generate downward attacks (destroys register contents)
generate left attacks (destroys register contents)
generate right attacks (destroys register contents)
generate moves from upward attacks
generate moves from downward attacks
generate moves from left attacks
generate moves from right attacks

That way you're still trashing the registers, but most of the time there was
trash there to begin with so you don't care.

Anyway, that is one example of how you have to change your thinking. You can't
just replace rotated bitboard attack generation with KoggeStone and expect it to
be the same speed.

If move generation is too slow using KoggeStone, you could always use a mailbox
approach.

As far as not wanting to implement rotated bitboards, I don't think it is quite
as scary as you think. Just think of the rotated bitboards as a way to get 8-bit
(or less) rank, file, and diagonal states which you can use as indexes into
lookup tables. Whatever you want to store in the lookup tables is up to you
(attacks is the most common example). You can actually make your lookup tables
for slider attacks quite small if you want. Maybe 5-6KB or less.

I can think of a way to use the reversed bitboard trick (x^x-2) to use about 4KB
of lookup tables to generate slider attacks.

I think my understanding of how to use bitboards has benefited greatly from
implementing various bitboard approaches. I would highly recommend it.


>3) the ray-tracing method is probably slower, but one of the benefits is that
>you can continue ray-tracing once you hit an occupied square, so you can get
>pins or x-ray-attacks. i don't see how to get such attacks easily in rotated
>bitboard or kogge-stone formalism. am i missing something?


With KoggeStone it's easy. Something like this, depending upon how you have
written your routines.

attacks = GenerateUpwardAttacks(rooks | queens, empty_squares);
xray_attacks = GenerateUpwardAttacks(attacks, empty_squares) & ~attacks;



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.