Author: Russell Reagan
Date: 16:32:27 09/25/04
Go up one level in this thread
On September 25, 2004 at 06:21:13, martin fierz wrote: >1) does anybody here have an idea how these 3 ways of generating attacks for >sliders compare speedwise for both 32-bit and 64-bit processors? (reason: if >kogge-stone will be comparable to rotated bitboards, i'd rather not bother to >implement rotated bitboards, and if both are not much better than my current >code, i'd also rather not bother with a rewrite). I'm going to ramble here, so just ask for clarification when I quit making sense :-) I am not sure there is one clear answer. I think which one works best for you will depend how you use each of them. For instance, assume you implement rotated bitboard attack generation functions such as these: typedef unsigned long long BitBoard; BitBoard RookAttacks (int square); Bitboard BishopAttacks (int square); // etc. Then you try to replace those functions with Kogge-Stone versions. The Kogge-Stone version will run quite a bit slower if you just "drop in" the Kogge-Stone attack generation to replace rotated bitboard attack generation. Just as "thinking bitboards" will require you to handle things differently than an 0x88 board representation, there are some differences in the way you "think rotated bitboards" and "think Kogge-Stone bitboards". I tried to use Kogge-Stone attack generation in a similar way to rotated bitboards, and the Kogge-Stone version was about half the speed of Crafty at calculating perft on a 32-bit machine (after much help from Dann Corbit). However, I think if I started over from scratch using the Kogge-Stone stuff, I could produce something better, because I would be able to "think Kogge-Stone" instead of "thinking rotated bitboards". Kogge-Stone is, IMO, a further extreme on the bitboard spectrum. For instance, even the fastest rotated bitboard implementations are not faster than the fastest 0x88-style implementations for raw movegen/make/undo, but bitboards allow you to express more complex evaluation ideas more efficiently than an 0x88-style implementation (however, it is debatable whether such complex ideas are necessary). So overall rotated bitboards tend to be about even with 0x88-style approaches on 32-bit hardware. I think Kogge-Stone may be an amplification of that same concept. Even slower at movegen/make/undo, but offering more expressive ability in evaluation. Gerd has shown that if you are willing to dive into writing your own MMX assembly (making use of the 64-bit MMX registers on 32-bit machines), that Kogge-Stone can run at the same speed as rotated bitboard implementations. I think that if you used Kogge-Stone correctly, you could get pretty close to rotated bitboard speed without using assembly. Jumping to 64-bit hardware may be even more beneficial to Kogge-Stone than rotated bitboards. I think the main problem that I (and Dann) had with getting the Kogge-Stone stuff to go faster was that the attack generation routines were destructive to the registers (another reason why KoggeStone may benefit from 64-bit hardware more than rotated bitboards: more registers, more data per register). So when you are doing some task (say, generating moves) and then you generate the attacks using Kogge-Stone, you destroy whatever was in the registers. Rotated bitboard attack generation isn't quite so destructive. Therefore, one thing you might try is to do the KoggeStone stuff all at once when you can. For example, some pseudocode: 1. Move generation for rooks and queens using KoggeStone get bitboard of rooks and queens generate upward attacks for rooks and queens (destroys register contents) generate moves from upward attacks generate downward attacks for rooks and queens (destroys register contents) generate moves from downward attacks // handle the other directions 2. Move generation for rooks and queens using rotated bitboards get bitboard of rooks and queens for each occupied square generate attacks from that square generate moves from attacks The KoggeStone version is like driving down the road and you have to stop at a stop sign at every intersection. Go, stop, go, stop... slow. It would be better if you could do something like this. generate upward attacks (destroys register contents) generate downward attacks (destroys register contents) generate left attacks (destroys register contents) generate right attacks (destroys register contents) generate moves from upward attacks generate moves from downward attacks generate moves from left attacks generate moves from right attacks That way you're still trashing the registers, but most of the time there was trash there to begin with so you don't care. Anyway, that is one example of how you have to change your thinking. You can't just replace rotated bitboard attack generation with KoggeStone and expect it to be the same speed. If move generation is too slow using KoggeStone, you could always use a mailbox approach. As far as not wanting to implement rotated bitboards, I don't think it is quite as scary as you think. Just think of the rotated bitboards as a way to get 8-bit (or less) rank, file, and diagonal states which you can use as indexes into lookup tables. Whatever you want to store in the lookup tables is up to you (attacks is the most common example). You can actually make your lookup tables for slider attacks quite small if you want. Maybe 5-6KB or less. I can think of a way to use the reversed bitboard trick (x^x-2) to use about 4KB of lookup tables to generate slider attacks. I think my understanding of how to use bitboards has benefited greatly from implementing various bitboard approaches. I would highly recommend it. >3) the ray-tracing method is probably slower, but one of the benefits is that >you can continue ray-tracing once you hit an occupied square, so you can get >pins or x-ray-attacks. i don't see how to get such attacks easily in rotated >bitboard or kogge-stone formalism. am i missing something? With KoggeStone it's easy. Something like this, depending upon how you have written your routines. attacks = GenerateUpwardAttacks(rooks | queens, empty_squares); xray_attacks = GenerateUpwardAttacks(attacks, empty_squares) & ~attacks;
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.