Computer Chess Club Archives



Subject: Re: Bit board representation

Author: Robert Hyatt

Date: 07:39:13 05/30/01



On May 30, 2001 at 09:02:22, Vincent Diepeveen wrote:

>On May 29, 2001 at 20:59:22, Robert Hyatt wrote:
>
>>On May 29, 2001 at 20:17:24, Pham Minh Tri wrote:
>>
>>>On May 29, 2001 at 10:02:10, Robert Hyatt wrote:
>>>
>>>>On May 29, 2001 at 03:40:23, Pham Minh Tri wrote:
>>>>
>>>>>On May 28, 2001 at 01:40:52, Cheok Yan Cheng wrote:
>>>>>
>>>>>>For a game that uses an 8x8 board, like chess, we can use a bitboard
>>>>>>representation with a 64-bit variable. However, for a game with a 9x10
>>>>>>board, like Chinese chess, is it possible to use a bitboard
>>>>>>representation, which would need a 90-bit variable (where can I get a
>>>>>>90-bit variable??)
>>>>>>
>>>>>>thanks
>>>>>
>>>>>Chinese chess does not have the "golden number" of 64 squares like chess,
>>>>>so it is hard to use bitboards or representations other than arrays. If you
>>>>>insist on using bitboards, I suggest a "mixed" representation (as some chess
>>>>>programmers did): use bitboards for some pieces (king, pawn, adviser,
>>>>>elephant - they fit a 64-bit bitboard) and arrays for the others. However,
>>>>>I do not believe bitboards will bring Chinese chess programmers any benefit or fun.
>>>>>
>>>>>Pham
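As an aside on the original 90-bit question above: a 9x10 board can be faked with two 64-bit words. A minimal sketch in C (the type and function names are my own, purely illustrative):

```c
#include <stdint.h>

/* Hypothetical sketch: a 9x10 Chinese-chess board squeezed into two
   64-bit words (squares 0..63 in lo, squares 64..89 in hi).  All names
   are illustrative, not from any real program. */
typedef struct { uint64_t lo, hi; } Board90;

static void set_sq(Board90 *b, int sq) {
    if (sq < 64) b->lo |= (uint64_t)1 << sq;
    else         b->hi |= (uint64_t)1 << (sq - 64);
}

static int test_sq(const Board90 *b, int sq) {
    if (sq < 64) return (int)((b->lo >> sq) & 1);
    return (int)((b->hi >> (sq - 64)) & 1);
}
```

Most single-square operations then cost one extra branch or mask compared to a true 64-bit bitboard, which is part of why the mixed representation above is attractive.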
>>>>
>>>>
>>>>Remember my earlier comment.  Chess 4.x ran on a 60 bit machine, which means
>>>>every bitboard operation had to be spread across two words of memory.  On the
>>>>32-bit PC, the bitboard operations are one way to take advantage of the two
>>>>integer pipes on the PC.  Other machines have more than two integer pipes that
>>>>are very hard to keep busy.  Doing 3 XOR operations is not so bad on those
>>>>machines...
>>>
>>>I do not totally agree with you on the reasons for the Slate/Atkin
>>>victories (earlier post). IMHO, they might have won by other factors, not
>>>only bitboards. The factors were probably the revolutionary ideas that
>>>accompanied bitboards. I know that many ideas and their implementations
>>>could occur and become clear in bitboards.  So the first inventors of
>>>bitboards had the advantage of two sources of ideas: one from traditional
>>>representations, one from their new structure.
>>>
>>>I think that if two chess programmers are equal (same knowledge, ideas,
>>>technique, etc.), the one programming bitboards has no advantage compared
>>>with the other on computers that are not 64-bit, even though he has to
>>>work harder.
>>>
>>>Pham
>>
>>
>>I don't want to start a holy-war about bitboards vs mailbox programs.  But one
>>simple idea.  If you take a program that needs 64 bit words, and run it on a
>>64 bit architecture, the 'data density' inside the cpu is really useful since
>>it is gating around 64 bit values, and you need 64 bit values, so the bandwidth
>>inside the CPU is utilized.  If you take a 32 bit program, and run it on a 64
>>bit architecture, 1/2 of every register, 1/2 of every internal bus, 1/2 of
>>every internal clock cycle, is wasted.  Because you are pumping 1/2 the data
>>internally that the cpu is capable of.
>>
>>I (or someone working with me) ran a test a couple of years ago... taking a
>>normal gnuchess and a normal crafty, and running both on a PC to get a benchmark
>>number for each.  Then the same two programs were compiled and run on an older
>>alpha, and not a very fast one...  gnu ran right at 2x faster, Crafty was almost
>>3.5X faster.
>
>Ugch ugch, you're comparing good ideas that are horribly implemented
>in gnuchess with the assembly-optimized code of crafty!
>
>Not very fair.

What on earth are you talking about?  When I ran crafty and gnu on the PC,
I _did_ use the best versions available, including my .asm replacement
code.  When I ran them both on the alpha, I had to give up _all_ of my
assembly and use pure C.  I still got the speedup I indicated, without _any_
assembly on the alpha...  Which didn't favor crafty, it actually favored GNU.

Had I compared non-asm crafty on both machines, the difference would have been
even more pronounced.


>
>Also gnuchess is considering pieces to be general, in crafty you have
>written out code for *every* piece.


What does this have to do with anything?  I simply wanted to know "How much
more does crafty get from a 64 bit machine than a 32-bit program?"  This
experiment answered that perfectly for the case of GNU vs Crafty.  I doubt
your speed differential will be any different on the two machines than that
of GNU.



>
>So the comparison is impossible. You can design much faster loops and
>arrays than gnuchess has, and if one does it in assembly it's even
>faster.


There was no assembly in the alpha version of crafty, so your comment makes
utterly no sense.  Whether GNU is a "good" 32 bit program or not is also
completely irrelevant.  The question was how much faster would it run on the
alpha?  The answer was given above.  Ditto for crafty, which _is_ designed
around a 64 bit word.  I'll be happy to compile Diep and compare it to crafty
on my alpha if you want.

The numbers won't be any different.




>
>>That is pretty significant.  Now whether someone can beat my move generator
>
>That's comparing a bicycle which is made to show the advantage of the
>wheel with a propeller airplane.
>
>I would rather compare the propeller airplane with a jet engine.
>
>>with an 0x88 (or something better) is a question.  Whether they can beat my
>>evaluator using 0x88 is another question.  But it is pretty obvious that on a
>>PC they had better beat me by a factor of two, or I will catch up on the
>
>2.2 to be exact.
>
>>64 bit machines.  And then some of the bitboard wizardry comes in handy.  such
>
>Yes i'll beat you on a SUN too, no problem.


Anytime you want me to run the test, feel free to send me your code.  We have
64 bit sparcs here, I have an alpha.  I can probably run the comparison on a
64 bit MIPS box too.  I already know the answer, however.


A 64 bit program simply runs faster when ported to a 64 bit machine, than a
32 bit program does when ported to the same machine.  _every_ time.
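The point is visible in what a 32-bit compiler has to emit for one 64-bit operation: a single bitboard AND on the alpha becomes two instructions on a 32-bit target. A rough illustration (the struct and function are hypothetical, standing in for what compiler-generated code does):

```c
#include <stdint.h>

/* Illustration of the data-density argument: one bitboard AND is a
   single instruction on a 64-bit machine, but a 32-bit compiler must
   split it into two 32-bit ANDs, roughly as below. */
typedef struct { uint32_t lo, hi; } u64_split;

static u64_split and64_on_32bit(u64_split a, u64_split b) {
    u64_split r;
    r.lo = a.lo & b.lo;   /* low half  */
    r.hi = a.hi & b.hi;   /* high half */
    return r;             /* twice the work of one 64-bit AND */
}
```

The same doubling applies to XOR, OR, and shifts (shifts are worse, since bits cross the word boundary), which is where the 64-bit port recovers its speed.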



>
>But let's ask you how you plan to do mobility on an alpha,
>instead of the crude summation you're doing now!



_SAME_ code as right now.


>
>And how you plan to use everywhere in the evaluation attacktable
>information using the slow attack function currently in use
>for crafty!


First, my attack tables are not slow.  They are simple direct memory lookups.
I use exactly the same code on any architecture.
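To make "direct memory lookup" concrete: sliding attacks along a rank can be answered by indexing a precomputed table with the square and the 8-bit rank occupancy. This is a generic sketch of that style of table, not Crafty's actual source:

```c
#include <stdint.h>

/* Sketch of direct-lookup sliding attacks on one rank: for each file
   and each 8-bit occupancy pattern, precompute which files are attacked.
   The table name and layout are illustrative only. */
static uint8_t rank_attacks[8][256];   /* [file][occupancy] -> attacked files */

static void init_rank_attacks(void) {
    for (int f = 0; f < 8; f++)
        for (int occ = 0; occ < 256; occ++) {
            uint8_t att = 0;
            for (int t = f + 1; t < 8; t++) {   /* slide toward h-file */
                att |= (uint8_t)(1 << t);
                if (occ & (1 << t)) break;      /* blocked by first piece */
            }
            for (int t = f - 1; t >= 0; t--) {  /* slide toward a-file */
                att |= (uint8_t)(1 << t);
                if (occ & (1 << t)) break;
            }
            rank_attacks[f][occ] = att;
        }
}
```

At search time the attack set is then one table fetch, `rank_attacks[file][occ]`, with no per-square loop.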

>
>Whereas in my attack tables I can directly see how many attackers are
>at a square with one AND instruction on an array within the L1 data cache
>and a constant value:
>  (MyAtt[sq]&0x0ff)
>
>I'm using that loads of times in my evaluation. Also whether some
>square is attacked anyway by my opponent:
>
>  OpAtt[sq]
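The quoted scheme amounts to incrementally maintained per-square counters, where the low byte of each entry holds the attacker count. A minimal sketch (the update functions are my assumption, not DIEP's code):

```c
#include <stdint.h>

/* Sketch of the quoted attack-table idea.  The array name follows the
   quote; the update logic is an assumption for illustration.  The low
   byte of each entry counts attackers, so (MyAtt[sq] & 0x0ff) yields
   the attacker count in one AND. */
static uint32_t MyAtt[90];

static void add_attacks(const int *squares, int n) {
    for (int i = 0; i < n; i++)
        MyAtt[squares[i]] += 1;     /* bump the count in the low byte */
}

static void remove_attacks(const int *squares, int n) {
    for (int i = 0; i < n; i++)
        MyAtt[squares[i]] -= 1;     /* undo on unmake */
}

static int attackers(int sq) { return (int)(MyAtt[sq] & 0x0ff); }
```

The cost is moved from the evaluation (one AND per query) to make/unmake, which must touch every square the moving piece attacks.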


Vincent, I know you have a short attention span.  But please stick to the
topic.  The issue here is "how does a 32 bit program perform on a 32 bit
processor, and then when moved to a 64 bit processor, compared to how does a 64
bit program perform on a 32 bit processor and then on a 64 bit processor?"

You want to change the question.  You can't.  I will happily run my program and
your program on both a 32 bit machine and on a 64 bit machine and compare the
speedups.  Mine will be larger.  Every time.




>
>That would be pretty interesting to get fast in bitboards too, but
>i can already tell you, IT'S IMPOSSIBLE to do it quickly!


This from the same person who said "using message passing is impossible to
get a speedup."  Or "It is impossible to win at chess with a quiescence search
that only does limited captures."  Or any of several other "impossible" things.
Later, you learn how to do them, of course.





>
>Where bitboards are good is in a few things which hardly get used,
>like some complex pawn structures:
> if(  (MyPawn&0x0a00000010001000) == 0x0a00000010001000 )
>
>So you can detect several pawns for a certain side at the same time,
>which otherwise is slower:
> if(  quickboard[sq_a2] == whitepawn
>  && quickboard[sq_b2] == whitepawn
>  && quickboard[sq_c2] == whitepawn )
>
>However, how many programs are there except mine that use loads
>of complex pawn structure in evaluation?

Mine, for example.  I even compute all the squares all pawns can move to,
and use that in the evaluation.
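The quoted pawn test is the standard bitboard trick: one mask comparison replaces several array probes. A small sketch (square numbering and mask construction are my assumptions, not from either program):

```c
#include <stdint.h>

/* Sketch of the quoted pawn-pattern test: one mask comparison replaces
   three array probes.  Square numbering (a1 = bit 0, one rank = 8 bits)
   is an assumption for illustration. */
#define SQ(f, r) ((uint64_t)1 << ((r) * 8 + (f)))

static int has_pawn_trio(uint64_t my_pawns) {
    const uint64_t mask = SQ(0, 1) | SQ(1, 1) | SQ(2, 1);  /* a2, b2, c2 */
    return (my_pawns & mask) == mask;   /* all three present at once */
}
```

Any multi-square pawn pattern reduces to one precomputed mask this way, which is the case for bitboards that both sides of this thread accept.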




>
>Now you'll say that you can do other things quick too, like detecting
>whether a file is empty and such things. However there are good
>alternatives in 32 bits that can be seen as a bitboard too, which
>i happen to use in DIEP.



There are other "good" alternatives.  But you are _still_ using 1/2 of an
alpha.  _That_ is the point.



>
>So when in 2010 everyone can buy his own 64 bits machine,
>then what i might do is get a 64 bits machine and
>add within 1 week 2 bitboards of 64 bits for pawns to DIEP!
>
>So i'll do the conversion at the time i have such a machine!
>
>To be a factor of 3 slower now on 99% of all computers in the world
>(32 bits cpu's), that's not my favourite thing to do!


I'm not a factor of 3 slower.  That is your imagination working.  Some things
are faster in bitmaps, some slower.  You want to harp on the move generation.
That is less than 10% of my execution time, so it doesn't count.  Evaluation
is very good in bitmaps.




>
>Right now i do manage with some 32 bits equivalents...
>
>The only 64 bits values i now have in diep are getting used for
>node counts... :)
>
>>as very simple tests to answer questions like "here is a bitmap of my passed
>>pawns, do I have an outside passer on one side, or do I have an outside passer
>>on both sides of the board?"  Ditto for "here is a bitmap of my candidate
>>passers, ..."  Right now, on the PC, those operations are pretty expensive
>
>As i mentioned, this can be done very fast with good 32-bit alternatives
>which are in fact FASTER than the 64-bit alternatives.
>
>That is, on 32 bits machines!
>
>>and dig into the advantage of bitmap evaluations.  But on the 64 bit machines,
>>those operations lose _all_ of their penalties, and begin to look pretty good.
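Those quoted passed-pawn tests reduce to a couple of ANDs against wing masks once the passers are in one bitmap. A hedged sketch (the file layout, a1 = bit 0 with files a..h as bits 0..7 of each rank, is an assumption):

```c
#include <stdint.h>

/* Sketch of the quoted outside-passer tests: given a bitmap of passed
   pawns, masking the wing files answers "passer on one side / on both
   sides?" in two ANDs.  File layout is an assumption for illustration. */
#define FILE_MASK(f) (0x0101010101010101ULL << (f))

static const uint64_t QUEENSIDE = FILE_MASK(0) | FILE_MASK(1) | FILE_MASK(2);
static const uint64_t KINGSIDE  = FILE_MASK(5) | FILE_MASK(6) | FILE_MASK(7);

static int passers_both_wings(uint64_t passed) {
    /* nonzero intersection with each wing means a passer on each side */
    return (passed & QUEENSIDE) != 0 && (passed & KINGSIDE) != 0;
}
```

On a 64-bit machine each of those ANDs is one instruction, which is exactly the "penalties disappear" point made above.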
>
>Diep suffered bigtime when i converted all my 8 bits stuff to 32 bits,
>because it all occupied more space:
>  - more code size
>  - more data size
>
>Diep became about 5% slower, and i need to mention that i didn't optimally
>profit from 8 bits datastructures. The real penalty on P3 processors would
>be around 10%. At K7 about 50%.
>
>So going from 32 bits to 64 bits, *everything* that becomes 64 bits but
>first was 32 bits occupies more size. So i start losing like 10% to *start* with...
>
>Unless there are very GOOD reasons to get to 64 bits in the future i'll stick
>to 32 bits for quite a long time with at least 99% of the code!
>
>As 99% of all applications will be 32 bits, processors will sure as hell
>take care they are very fast with 32 bits code, so no problems there either!
>
>There are quite some problems converting windows applications to 64 bits
>because 'int' in general is seen as a 32 bits thing!
>
>>I wouldn't suggest that anyone rewrite their working code.  But I made the
>>decision to do this several years ago.  I haven't found it a disadvantage on
>>32 bit machines, and I really doubt it will be a disadvantage on 64 bit
>>machines.
>
>factor 2.5 at 32 bits machines for sure.
>
>>you have to think "data density".  Which is what made our move generator and
>>some evaluations on the Cray so very fast.  But not until you start "thinking
>>outside the box" and asking "how can this architecture help me do things more
>>efficiently?" rather than "How can I make my program run on this architecture?"
>
>>Those are two vastly different questions.  With vastly different answers.
>
>Here we all agree, but for me putting things in general in a bitboard
>means i have 1 bit worth of information about something. That's too little
>information for me!
>
>So if i ever go use 64 bits then it's for those parts of my program where
>i can profit.
>
>What i still do not understand is why crafty is so slow on a 64 bits sun
>machine. Everything is 64 bits there. For me a sun processor performs
>the same as a PII processor at the same speed.
>
>How is crafty performing at it?
>
>IMHO the only reason why crafty isn't so slow on intel processors is
>because you have way less branches as i have in DIEP.
>
>I am using complex patterns in DIEP's evaluation function, so everywhere
>there are branches. In crafty the branches that are there are easier
>to predict.
>
>So at 64 bits processors where the branch misprediction penalty isn't
>big, crafty will perform very badly compared to DIEP.
>
>At an alpha the branch misprediction penalty is HUGE.
>
>So at a 21164 processor, which had only 8kb L1 cache, i did very badly with
>DIEP. On a 4 processor 21264 machine DIEP probably does worse than on
>a dual K7 1.5Ghz palomino, which hopefully gets released next month,
>operating at 1.5Ghz.
>
>Now the branch misprediction penalty of the K7 isn't very good. But
>on the 21264 it's way worse. What the 21264 has to prevent mispredictions
>a bit is a clever system of 2 killer tables that reference each other.
>
>However the size of those tables is so small compared to the number of
>branches in DIEP, that it won't speed me up a single %!
>
>A 633Mhz 21164 processor performed for DIEP as a 380Mhz PII.
>
>Now that was some time ago of course. Nowadays DIEP would do worse on
>that 21164 when compared to the PII, because code+data size has become
>bigger.
>
>A similar thing is what i have in mind for the 21264. Where on paper
>it can do very well because it does 4 instructions a clock, it's very
>likely that a K7 of the same Mhz completely outguns it for me.
>
>For crafty the 21264 is not so bad as it is for me,
>because you relatively suffer less
>from the HUGE branch misprediction
>penalty as i do with DIEP!
>
>It's not that the 64 bits datastructure is so good at 64 bits, the
>problem is that the 21264 design has a serious problem with branches!
>
>Best regards,
>Vincent
>





Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.