Computer Chess Club Archives

Subject: Re: Bit board representation

Author: Vincent Diepeveen

Date: 06:02:22 05/30/01

On May 29, 2001 at 20:59:22, Robert Hyatt wrote:

>On May 29, 2001 at 20:17:24, Pham Minh Tri wrote:
>
>>On May 29, 2001 at 10:02:10, Robert Hyatt wrote:
>>
>>>On May 29, 2001 at 03:40:23, Pham Minh Tri wrote:
>>>
>>>>On May 28, 2001 at 01:40:52, Cheok Yan Cheng wrote:
>>>>
>>>>>For a game that used 8x8 board like chess, we can use the bit board
>>>>>representation by using a 64 bit sized variable. However, for the game that used
>>>>>board sized 9x10 like chinese chess, is it possible to use bit board
>>>>>representation that need 90 bit sized variable (where can i get a variable sized
>>>>>90 bits ??)
>>>>>
>>>>>thanks
>>>>
>>>>Chinese chess has not got the "gold number" of 64 like chess, so it is hard to
>>>>use bitboard or some other representations than array. If you insist on using
>>>>bitboard, I suggest that you could use a "mix" representation (like some chess
>>>>programmers did): use bitboard for some pieces (king, pawn, adviser, elephant -
>>>>they are suitable for 64 bit bitboard) and array for the others. However, I do
>>>>not believe bitboard will bring to Chinese chess programmers any benefit or fun.
>>>>
>>>>Pham
>>>
>>>
>>>Remember my earlier comment.  Chess 4.x ran on a 60 bit machine, which means
>>>every bitboard operation had to be spread across two words of memory.  On the
>>>32-bit PC, the bitboard operations are one way to take advantage of the two
>>>integer pipes on the PC.  Other machines have more than two integer pipes that
>>>are very hard to keep busy.  Doing 3 XOR operations is not so bad on those
>>>machines...
>>
>>I do not agree totally with you on the reasons of Slate/Atkin victories (earlier
>>post). IMHO, they might win by other factors, not only bitboard. The factors
>>were probably revolutionary ideas accompanying with bitboard. I know that many
>>ideas and their implementations could occur and become clear in bitboard.  So
>>the first inventors of bitboard would have advantage of two sources of ideas:
>>one from traditional representations, one from their new structure.
>>
>>I think, if two chess programmers are similar (same knowledge, ideas, technique,
>>etc.), the one programming bitboard has not any advantage compare with the other
>>on computers not 64 bit, even though he has to work harder.
>>
>>Pham
>
>
>I don't want to start a holy-war about bitboards vs mailbox programs.  But one
>simple idea.  If you take a program that needs 64 bit words, and run it on a
>64 bit architecture, the 'data density' inside the cpu is really useful since
>it is gating around 64 bit values, and you need 64 bit values, so the bandwidth
>inside the CPU is utilized.  if you take a 32 bit program, and run it on a 64
>bit architecture, 1/2 of every register, 1/2 of every internal bus, 1/2 of
>every internal clock cycle, is wasted.  Because you are pumping 1/2 the data
>internally that the cpu is capable of.
>
>I (or someone working with me) ran a test a couple of years ago... taking a
>normal gnuchess and a normal crafty, and running both on a PC to get a benchmark
>number for each.  Then the same two programs were compiled and run on an older
>alpha, and not a very fast one...  gnu ran right at 2x faster, Crafty was almost
>3.5X faster.

Ugh, you're comparing good ideas that are horribly implemented
in gnuchess with the assembly-optimized code of Crafty!

Not very fair.

Also, gnuchess treats pieces generically, while in Crafty you have
written out code for *every* piece.

So the comparison is impossible. You can design much faster loops and
arrays than gnuchess has, and if you do it in assembly it's even
faster.

>That is pretty significant.  Now whether someone can beat my move generator

That's comparing a bicycle, built to show the advantage of the
wheel, with a propeller airplane.

I would rather compare the propeller airplane with a jet.

>with an 0x88 (or something better) is a question.  Whether they can beat my
>evaluator using 0x88 is another question.  But it is pretty obvious that on a
>PC they had better beat me by a factor of two, or I will catch up on the

2.2 to be exact.

>64 bit machines.  And then some of the bitboard wizardry comes in handy.  such

Yes i'll beat you on a SUN too, no problem.

But let me ask how you plan to do mobility on an Alpha,
instead of the crude summation you're doing now!

And how you plan to use attack-table information everywhere in the
evaluation with the slow attack function currently in use
in Crafty!

Whereas in my attack tables i can directly see how many attackers are
on a square with one AND of an array entry in the L1 data cache
and a constant value:
  (MyAtt[sq]&0x0ff)

I use that loads of times in my evaluation. The same goes for whether
a square is attacked at all by my opponent:

  OpAtt[sq]

That would be pretty interesting to get fast in bitboards too, but
i can already tell you, IT'S IMPOSSIBLE to do it quickly!
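
The attack-table lookups above can be sketched roughly as follows. This is
only an illustration under stated assumptions: the names MyAtt and OpAtt come
from the post, but the 32-bit entry width, the helper functions and the idea
that the upper bits are free for other flags are hypothetical and not taken
from DIEP.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (not DIEP's real one): the low 8 bits of each entry
   hold the number of attackers on that square; the upper bits are
   left free for other information. */
enum { NUM_SQUARES = 64, COUNT_MASK = 0x0ff };

static uint32_t MyAtt[NUM_SQUARES];  /* attacks by my own pieces   */
static uint32_t OpAtt[NUM_SQUARES];  /* attacks by the opponent    */

/* Hypothetical helper: record one more attacker on sq.
   Assumes the count never exceeds 255. */
static void add_attacker(uint32_t *table, int sq)
{
    assert(sq >= 0 && sq < NUM_SQUARES);
    table[sq] += 1;
}

/* Number of attackers: one array load and one AND with a constant. */
static unsigned attacker_count(const uint32_t *table, int sq)
{
    assert(sq >= 0 && sq < NUM_SQUARES);
    return table[sq] & COUNT_MASK;
}

/* Is the square attacked at all? Any nonzero entry means yes. */
static int is_attacked(const uint32_t *table, int sq)
{
    assert(sq >= 0 && sq < NUM_SQUARES);
    return table[sq] != 0;
}
```

With 64 entries of 4 bytes each, such a table is 256 bytes, so it fits
easily in an L1 data cache.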

What bitboards are good at are a few things which hardly get used,
like some complex pawn structures:
 if(  (MyPawn&0x0a00000010001000) == 0x0a00000010001000 )

So you can test for several pawns of one side at the same time,
which is otherwise slower:
 if(  quickboard[sq_a2] == whitepawn
  && quickboard[sq_b2] == whitepawn
  && quickboard[sq_c2] == whitepawn )
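
To make the two tests directly comparable, here is a small sketch that checks
the same three squares (a2, b2, c2) both ways. The square numbering (a1 = bit
0), the piece codes and the helper put_white_pawn are assumptions made for the
example; the mask in the bitboard line above is a different, arbitrary pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed square numbering: a1 = 0, b1 = 1, ..., h8 = 63. */
enum { SQ_A2 = 8, SQ_B2 = 9, SQ_C2 = 10 };
enum { EMPTY = 0, WHITEPAWN = 1 };   /* hypothetical piece codes */

#define BIT(sq) (UINT64_C(1) << (sq))

/* Hypothetical helper: place a white pawn in both representations. */
static void put_white_pawn(uint8_t board[64], uint64_t *white_pawns, int sq)
{
    assert(sq >= 0 && sq < 64);
    board[sq] = WHITEPAWN;
    *white_pawns |= BIT(sq);
}

/* Bitboard version: all three pawns with one AND and one compare. */
static int pawns_a2_b2_c2_bitboard(uint64_t white_pawns)
{
    const uint64_t pattern = BIT(SQ_A2) | BIT(SQ_B2) | BIT(SQ_C2);
    return (white_pawns & pattern) == pattern;
}

/* Array version: one load and one compare per square. */
static int pawns_a2_b2_c2_array(const uint8_t board[64])
{
    return board[SQ_A2] == WHITEPAWN
        && board[SQ_B2] == WHITEPAWN
        && board[SQ_C2] == WHITEPAWN;
}
```

On a 32-bit CPU the 64-bit AND and compare each take two machine operations,
which is part of what the argument in this thread is about.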

However, how many programs besides mine use loads
of complex pawn structure in their evaluation?

Now you'll say that you can do other things quickly too, like detecting
whether a file is empty and such things. However, there are good
32-bit alternatives that can be seen as a bitboard too, which
i happen to use in DIEP.
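
The post does not say what those 32-bit alternatives look like. One possible
shape, purely as an assumption, is a small per-side summary with one bit per
file; the PawnFiles struct and its functions below are hypothetical and only
cover "file empty of pawns".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit-friendly summary, kept incrementally:
   bit f of 'files' is set when the side has a pawn on file f
   (a = 0 ... h = 7). */
typedef struct {
    uint8_t  count[8];  /* number of pawns on each file */
    uint32_t files;     /* bit f set iff count[f] > 0   */
} PawnFiles;

static void add_pawn(PawnFiles *p, int file)
{
    assert(file >= 0 && file < 8);
    p->count[file]++;
    p->files |= 1u << file;
}

static void remove_pawn(PawnFiles *p, int file)
{
    assert(file >= 0 && file < 8 && p->count[file] > 0);
    if (--p->count[file] == 0)
        p->files &= ~(1u << file);
}

/* A file is open when neither side has a pawn on it:
   one OR, one AND and one compare, all in 32 bits. */
static int file_is_open(const PawnFiles *white, const PawnFiles *black,
                        int file)
{
    assert(file >= 0 && file < 8);
    return ((white->files | black->files) & (1u << file)) == 0;
}
```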

So when in 2010 everyone can buy his own 64-bit machine,
what i might do is get one myself and, within a week,
add two 64-bit bitboards for pawns to DIEP!

So i'll do the conversion at the time i have such a machine!

Being a factor of 3 slower now on 99% of all computers in the world
(32-bit CPUs) is not my favourite thing to do!

Right now i manage with some 32-bit equivalents...

The only 64-bit values i now have in DIEP are used for
node counts... :)

>as very simple tests to answer questions like "here is a bitmap of my passed
>pawns, do I have an outside passer on one side, or do I have an outside passer
>on both sides of the board?"  Ditto for "here is a bitmap of my candidate
>passers, ..."  Right now, on the PC, those operations are pretty expensive

As i mentioned, this can be done very fast with good 32-bit
alternatives, which are in fact FASTER than the 64-bit alternatives.

That is, on 32-bit machines!
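
As a rough illustration of how the "passer on one wing or both" question can
be answered in 32 bits, here is a sketch that works on an 8-bit mask of files
containing passed pawns. The wing definitions (files a-c and f-h) and the
function name are assumptions; a real outside-passer test would also look at
where the remaining pawns stand, which this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed file numbering: bit 0 = a-file ... bit 7 = h-file. */
enum {
    QUEENSIDE_FILES = 0x07,  /* files a, b, c */
    KINGSIDE_FILES  = 0xE0   /* files f, g, h */
};

/* Returns how many wings hold at least one passed pawn:
   0 = none, 1 = one wing, 2 = both wings.
   passed_files is a hypothetical 8-bit mask of files with a passer. */
static int wings_with_passers(uint32_t passed_files)
{
    assert(passed_files <= 0xFF);
    return ((passed_files & QUEENSIDE_FILES) != 0)
         + ((passed_files & KINGSIDE_FILES) != 0);
}
```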

>and dig into the advantage of bitmap evaluations.  But on the 64 bit machines,
>those operations lose _all_ of their penalties, and begin to look pretty good.

DIEP suffered big time when i converted all my 8-bit stuff to 32 bits,
because it all occupied more space:
  - more code size
  - more data size

DIEP became about 5% slower, and i should mention that i didn't profit
optimally from the 8-bit data structures. The real penalty on P3
processors would be around 10%. On the K7, about 50%.

So going from 32 to 64 bits, *everything* that becomes 64 bits and used to
be 32 bits takes up more space. So i lose something like 10% to *start* with...
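
The size argument is simple arithmetic, sketched here for a per-square table.
The function names are made up for the example, and the 8 KB figure is the L1
size this post quotes for the 21164.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_SQUARES = 64 };

/* Bytes taken by one table with an entry of the given size per square. */
static size_t table_bytes(size_t entry_bytes)
{
    assert(entry_bytes > 0);
    return NUM_SQUARES * entry_bytes;
}

/* How many such tables fit in a cache of cache_bytes (ignoring
   everything else that competes for the cache). */
static size_t tables_in_cache(size_t cache_bytes, size_t entry_bytes)
{
    return cache_bytes / table_bytes(entry_bytes);
}
```

With 8-bit entries a table is 64 bytes, with 32-bit entries 256 bytes, and
with 64-bit entries 512 bytes, so an 8 KB cache holds 128, 32 or 16 of them.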

Unless there are very GOOD reasons to go to 64 bits in the future, i'll stick
to 32 bits for quite a long time for at least 99% of the code!

As 99% of all applications will be 32-bit, processors will sure as hell be
made very fast at 32-bit code, so no problems there either!

There are quite a few problems converting Windows applications to 64 bits,
because 'int' is generally assumed to be a 32-bit type!
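
One common way around assumptions about the width of 'int' is to spell the
width out with the C99 fixed-width types from <stdint.h>. The sketch below
does that for a node counter, the one 64-bit value mentioned earlier in this
post; NodeCount, BITS_OF and add_nodes are names invented for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical node counter that is 64 bits wide on every platform,
   whatever the width of 'int' or 'long' happens to be. */
typedef uint64_t NodeCount;

/* Width of a type in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Add the nodes of one search to the running total. */
static NodeCount add_nodes(NodeCount total, uint32_t searched)
{
    assert(total <= UINT64_MAX - searched);  /* no wraparound */
    return total + searched;
}
```

A 32-bit counter wraps after about 4.29 billion nodes; the 64-bit total above
simply keeps counting.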

>I wouldn't suggest that anyone rewrite their working code.  But I made the
>decision to do this several years ago.  I haven't found it a disadvantage on
>32 bit machines, and I really doubt it will be a disadvantage on 64 bit
>machines.

A factor of 2.5 on 32-bit machines, for sure.

>you have to think "data density".  Which is what made our move generator and
>some evaluations on the Cray so very fast.  But not until you start "thinking
>outside the box" and asking "how can this architecture help me do things more
>efficiently?" rather than "How can I make my program run on this architecture?"

>Those are two vastly different questions.  With vastly different answers.

Here we all agree, but for me putting things in a bitboard generally
means i have only 1 bit of information about something. That's too little
information for me!

So if i ever start using 64 bits, it will be for those parts of my program
where i can profit.

What i still do not understand is why Crafty is so slow on a 64-bit Sun
machine. Everything is 64 bits there. For me a Sun processor performs
the same as a PII at the same clock speed.

How does Crafty perform on it?

IMHO the only reason why Crafty isn't so slow on Intel processors is
that you have way fewer branches than i have in DIEP.

I am using complex patterns in DIEP's evaluation function, so there are
branches everywhere. In Crafty the branches that are there are easier
to predict.

So on 64-bit processors where the branch misprediction penalty isn't
big, Crafty will perform very badly compared to DIEP.

On an Alpha the branch misprediction penalty is HUGE.

So on a 21164, which had only 8 KB of L1 cache, DIEP did very badly.
On a 4-processor 21264 machine DIEP would probably do worse than on
a dual K7 Palomino at 1.5 GHz, which hopefully gets released next month.

Now the branch misprediction penalty of the K7 isn't very good, but
on the 21264 it's way worse. What the 21264 has to reduce mispredictions
a bit is a clever system of 2 "killer tables" that reference each other.

However the size of those tables is so small compared to the number of
branches in DIEP, that it won't speed me up a single %!

For DIEP, a 633 MHz 21164 performed like a 380 MHz PII.

Now that was some time ago of course. Nowadays DIEP would do worse on
that 21164 when compared to the PII, because code+data size has become
bigger.

Something similar is what i expect for the 21264. Although on paper
it can do very well because it issues 4 instructions per clock, it's very
likely that a K7 at the same MHz completely outguns it for me.

For Crafty the 21264 is not as bad as it is for me, because you suffer
relatively less from the HUGE branch misprediction penalty than i do
with DIEP!

It's not that the 64-bit data structure is so good on 64-bit hardware; the
problem is that the 21264 design has a serious problem with branches!

Best regards,
Vincent

Re: Bit board representation (more info) Robert Hyatt 07:49:22 05/30/01
- Re: Bit board representation (more info) Vincent Diepeveen 03:06:41 05/31/01
  - Re: Bit board representation (more info) Robert Hyatt 07:57:20 05/31/01
Re: Bit board representation Robert Hyatt 07:40:57 05/30/01
- Re: Bit board CHALLENGE - other things answerred and a challenge Vincent Diepeveen 04:34:00 05/31/01
  - Re: Bit board CHALLENGE - other things answerred and a challenge Robert Hyatt 08:06:12 05/31/01
  - Re: Bit board CHALLENGE - other things answerred and a challenge Pham Minh Tri 06:34:17 05/31/01
  - Re: Bit board CHALLENGE - other things answerred and a challenge Brian Richardson 06:17:12 05/31/01
Re: Bit board representation Robert Hyatt 07:39:13 05/30/01
- Re: Bit board representation Bas Hamstra 07:47:27 05/31/01
  - Re: Bit board representation Robert Hyatt 14:01:44 05/31/01
    - Re: Bit board representation Bas Hamstra 15:49:16 05/31/01
      - Re: Bit board representation Carlos del Cacho 10:45:01 06/01/01
      - Re: Bit board representation Robert Hyatt 18:25:51 05/31/01

Last modified: Thu, 15 Apr 21 08:11:13 -0700
