Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: Differences between 0x88 ,10x12 and Bitboards!?

Author: Robert Hyatt
Date: 13:18:02 11/19/02
On November 19, 2002 at 14:56:11, Vincent Diepeveen wrote:

>On November 19, 2002 at 14:25:18, Gerd Isenberg wrote:
>
>>On November 19, 2002 at 13:57:02, Vincent Diepeveen wrote:
>>
>>>On November 19, 2002 at 13:30:57, Christophe Theron wrote:
>>>
>>>>On November 19, 2002 at 13:15:09, Gerd Isenberg wrote:
>>>>
>>>>>On November 19, 2002 at 12:25:11, Gian-Carlo Pascutto wrote:
>>>>>
>>>>>>On November 19, 2002 at 11:35:24, Robert Hyatt wrote:
>>>>>>
>>>>>>>Bitboards have a bit of a performance advantage on 64 bit processors,
>>>>>>
>>>>>>Proof?
>>>>>>
>>>>>>--
>>>>>>GCP
>>>>>
>>>>>Hi Gian-Carlo,
>>>>>
>>>>>I think that's evident. If the none bitboarders have to use implicite native
>>>>>data-width of 64 bit integers, they have to transfer 32 additional zero bits
>>>>>without any additional information for each integer access. Of course you will
>>>>>pack some data, but all the local ints...
>>>>>
>>>>>So the information density for bitboarders grows with 64bit-architectures
>>>>>relative to none bitboarders. That also effects register usage, and that's IMHO
>>>>>more important. On x86-32bit architectures you can only hold three bitboards in
>>>>>registers, and thats even most a hard task. Actually, if you have a local
>>>>>routine with three bitboards and a few ints on the stack, there are a lot
>>>>>register/memory moves. Simply the data-width doubles the number of bitboard
>>>>>registers, not considered the increase in general purpose registers, or with
>>>>>hammer the number of mmx- and 128-bit xmm registers.
>>>>>
>>>>>Whether a bitboard based program is stronger than a none bitboard program
>>>>>depends obviuosly also on other things, but in principle :)
>>>>>
>>>>>Cheers,
>>>>>Gerd
>>>>
>>>>
>>>>
>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.
>>>>
>>>>You have not explained why they are supposed to have "a bit of performance
>>>>advantage on 64 bits processors".
>>>>
>>>>
>>>>
>>>>    Christophe
>>>
>>>Exactly,
>>>
>>>But more important is that they are not in the same league at 32 bits
>>>processors with knowledge. As soon as they need more knowledge they
>>>run into problems. My move generation in itself eats 0.6% of the system
>>>time. My evaluation nearly all of it.
>>>
>>>Yet a crucial aspect of evaluating is scanning and that's something
>>>where bitboards are handicapped with 1 or 2 exceptions (you can
>>>quickly scan a rank or file for presence of a piece X). So at 64
>>>bits CPUs a hybrid model is something which is possible but the
>>>majority of the cases just having 1 bit available with bitboards
>>>and needing an extra
>>>array lookup to index all the different arrays for having the bit
>>>set true, that's pretty slow.
>>>
>>>Yet if you keep your program dumb and fast, you don't suffer from
>>>that lack of course and will never be able to realize it.
>>>
>>>In which case there is no advantage of course, because crafty is
>>>at a McKinley like 1.5MLN nodes a second, whereas a program which
>>>is equally weak tactical last so many plies and in qsearch (saving
>>>out loads of system time which most stronger programs prefer to
>>>spent there), to make it in non-bitboards, is easily going to
>>>get the same number of nps with those blazing fast 6 instruction
>>>bundles a clock.
>>>
>>>In a simple experiment of mine where i just did recaptures based
>>>upon a single simplistic attacktable in qsearch (only generated
>>>that attacktable at root in qsearch not the innernodes in qsearch),
>>>i got at a P5-100 already blazingly fast. 200k nps initially. When
>>>i then tried to search deeper it got down to 100-150k nps, depending
>>>whether i tried to do less in qsearch or more.
>>>
>>>In all these cases bitboards is a major handicap at 32 bits processors
>>>and at 64 bits processors the % you lose to bitboards versus non bitboards
>>>is depending upon non-bitboards code in the search simply.
>>>
>>>Best regards,
>>>Vincent
>>
>>Hi Vincent,
>>
>>May be there are different paradigms and/or thinking schemes. For me specially
>>in eval, which takes about 70% of IsiChess including "mate at the glance",
>>bitboards are great to define tactical and positional patterns. I fear if i
>>switch to another approach, i'll have some difficulties, to define these
>>patterns with other structures.
>>
>>There are algorithms that gain superproportional from 64-bit architectures, like
>>flood-fill and Kogge-Stone. I can't imagine that you are not fascinated by the
>>possibilities of these algos.
>>
>>Regards,
>>Gerd
>
>If you agree with me that the focus in the future is on evaluation,
>it will not come as a surprise that more complex patterns get more
>crucial.
>
>Let me present you a simple example. suppose you are scanning the board
>for example for your king safety, mobility or whatever, for complex
>patterns to evaluate:
>
>We are now at square sq and want to know whether a complex pattern is
>happening. For example:
>
>if( sq >= 8 && board[sq-8] == bishop && board[sq-7] == rook && ...
>
>What's your bitboard equivalent here?

This is trivial and it shows you don't understand bitmaps:

if (WhiteBishops&SetMask(sq-8) && WhiteRooks&SetMask(sq-7) &&

Which is trivial.  The SetMasks are all going to fit in L1.  64 time 8 bytes.
The 12
piece location bitmaps also fit in L1 trivially (12 time 8 bytes).  So I suppose
I fail to
see your point entirely.

>
>snelbord is in the L1 cache of course. Let's not worry about L1 or L2
>caches here. Also let's not worry about that both in bitboards *and*
>non bitboards you can get rid of at least 1 compare here.
>
>AFAIK you need an array for each value sq-8 and sq-7 to lookup the
>mask needed for the bitboard to AND with.

As far as you know doesn't cut it.  Put a 1 in a register.  Shift it left the
right number
of bits to get it to the square you want to test.  No memory references
whatsoever if you
want to do it that way.  For example, some 64 bit processors have an instruction
that does
this in one cycle.  Or you can ask for a mask of either the rightmost N bits or
the leftmost N
bits, again in one cycle...



>
>lemme give a shot out of the hand, writing compatible code
>(so not writing out black & white :
>
>So here the bitboard equivalent without caring that in both
>examples we can save out a CMP:
>
>if( sq >= 8 && (Bishops[side]&Mask[sq-8]) && (Rooks[side]&Mask[sq-7]) && ...

Not bad.  And just as efficient as what you are doing as on a 64 bit machine,
those
values are all one native "word" in length...  which means no penalty.



>
>Best regards,
>Vincent
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.