Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: BitBoard challenge to Ernst A. Heinz and bitboard fans - no moderation

Author: Robert Hyatt
Date: 18:54:06 01/17/99
On January 17, 1999 at 18:35:15, Vincent Diepeveen wrote:

>On January 17, 1999 at 00:22:38, Robert Hyatt wrote:
>
>>On January 16, 1999 at 14:37:49, Vincent Diepeveen wrote:
>>
>>>On January 15, 1999 at 22:09:04, Robert Hyatt wrote:
>>>
>>>>On January 15, 1999 at 21:26:17, Vincent Diepeveen wrote:
>>>>
>>>>>On January 13, 1999 at 07:19:54, Ernst A. Heinz wrote:
>>>>>
>>>>>>On January 13, 1999 at 06:13:30, Vincent Diepeveen wrote:
>>>>>>>
>>>>>>> [...]
>>>>>>
>>>>>>Vincent,
>>>>>>
>>>>>>As usual you spread your own fallible impressions and interpretations
>>>>>>in a style that suggests they are proven facts. But indeed they are *not*.
>>>>>
>>>>>I have not seen a single function using bitboards making attacktables
>>>>>for every square indeed.
>>>>>In crafty there is only this: a function that gives
>>>>>soemthing for a single square.
>>>>>
>>>>>I thought a long time about it.
>>>>
>>>>Vincent this is simply wrong, and your reasoning is badly flawed.  In early
>>>>versions of crafty I did _full_ attack bitmaps, so that at any point I had
>>>>the bitmaps for the squares attacked from a specific square, and bitmaps for
>>>>every square showing which squares were attacking _it_.  I kept these
>>>>incrementally updated.
>>>
>>>I talk in numbers Bob, not in "i have done something in the past".
>>>You generate a bunch of different bitboards. Cool with me.
>>>
>>>A) You miss number of attackers
>>>B) you miss where the attacker comes from
>>
>>Questions:  (1) have you read chess skill in man and machine? (2) Have you
>>read the comments in main.c (crafty)?  If the answer to either question is
>>yes, then your question makes no sense.  In 1976 chess 4.0 had both of the
>>things you ask for.  They kept an array of bitmaps called attacks_from[sq] that
>>had 1 bits set for each square the piece on sq attacks (this is like what I do
>>except I compute it on the fly faster than they incrementally updated this).
>
>We are not talking about doing things incrementally here or on the fly.
>
>Just bare speed of it.
>

Then as I said, _they_ (chess 4.x, Slate, Atkin) did it in the 1970's on a 14
MIP computer.  I did it in crafty.  So it is obviously doable.  I don't do it
now because I don't use that kind of data very often.

>>They also kept an array of bitmaps called attacks_to[sq] that had a one bit
>>set for each square that attacks sq.  If you PopCnt(attacks_to[sq]) you get
>>the number of attackers for that square.  If you want black or white attackers
>>you first AND to exclude opposing pieces.  So bitmaps give you A and B if you
>>want them.  Chess 4.x had them.  Early crafty versions had them.  So I fail to
>>see why you keep saying this can't be done?  IE just because I don't do it
>>doesn't mean I can't do it.  It means I don't _want_ to do it at present.
>
>You can do it but only very very very slow. I'm sure you would do it if
>it was fast in your datastructure. Gives you a big bunch of powerful info.

then you assume wrong.  Taking that out sped me up by 10%.  Putting it back
in would cost the same thing again.  But at present I don't do it because I
don't _need_ it.  I only do what I need.  And when I need it I do it.


>
>So saying now "I don't wanna do it and therefore shut up" has nothing
>to do with this.

By the same token, your saying "I need it so why don't you do it?  Because it
is too slow for you to do" is pure baloney...

>
>It's like we want to fly therefore i design a new wing, as
>a new engines got developed you say: "i'm not interested in that
>engine, my wing is perfect".

Maybe you have this _turned around_?  IE I am quite happy with _my_ wing
at present.  I am not telling _you_ that you need a new wing.  So please stop
telling me that _I_ need one.  I should be the judge of that?


>
>Where i adapt and say: "i want to use that new engine for my airplane
>so i design a new wing that works"


That's not quite how I'd put it.  More like "I'm using this brick for a wing
and it is the best way to do it because that is the way I am doing it."  Do
what you want.  Let others do the same.  In the end, the best approach 'wins'.



>
>>
>>
>>>
>>>Main thing is that i miss are numbers. Things can get
>>>measured you know.
>>
>>
>>both chess 4.x and crafty (early versions) had this.  I would think it could
>>be measured since we both did it.
>>
>>
>>
>>
>>>
>>>>I later decided that the 'latter' bitboard wasn't being used much and I got
>>>>rid of it as it did have a cost.  But that cost was some 10% overall speed,
>>>>which I could instantly get back if I wanted to use such bitmaps.
>>>
>>>>your bad logic seems to be "if it isn't in Crafty now, it isn't practical
>>>>or it can't be done."  On the contrary, if it isn't done in crafty now, it
>>>>is because I don't need (or don't want) the information at present.  Nothing
>>>>says others don't / can't do it...
>>>
>>>This has to do with how your tables look like. You need to
>>>add all your bitmaps before you can do something useful with it.
>>>That is a mainpoint. Please don't get personal, i am not insulting you
>>>personal. I'm just talking about
>>
>>
>>So, why can't I add to my bitmaps if I need to?  I don't see why that is
>>a problem.  I assume you have added to your data structures over the past
>>couple of years?  I can modify mine just as easily...
>
>It shows that bitboards are not useful in many aspects if you plan
>to expand and improve evaluation (which i consider weakest chain of
>programs) and it was already clear, although you are very good
>in writing info around functions, that it is not an easy way of working
>with datastructure.
>
>This 'way of working is tough with bitboards' is of course an opinion,
>which we probably disagree in, although it took years for you and ernst
>just to be able to work with it as you wrote in previous msg which shows my
>point, where the simple way of representing things everyone can work at once
>with.

So what?  Parallel programming is orders of magnitude more difficult than
learning bitboards.  I'm doing just fine with it.  How is yours coming?  I'd
say I have had _no_ problems adapting to new programming issues and ideas.


>
>Now apart from this opinion we have bare speed facts.
>
>Suppose everyone would have a computer C where a datastructure X is way
>way faster for your program than a datastructure Y.
>
>Now without further info i would simply use data structure X, no questions about
>it.
>


yes, but we don't have that situation.


>
>
>>>a) how difficult bitboards are to work with
>>>b) that you need 64 bits machines for it which are very expensive
>>>   unless you want to buy an outdated 21164 533Mhz chip, with say a 1 gig
>>>   harddisk in it (this advertisement here i see in this magazine offers
>>>   it for the prize of 6000 dutch guilder (3500 dollar) a 1 GIG harddisk in
>>>   it!! and 64 MB RAM).
>>>
>>>c) that after the big effort of a and b it is not possible to do everything
>>>   quickly with it and that it is definitely not faster.
>>
>>Maybe it isn't faster for _you_ to do things faster.  My speed isn't bad.  When
>
>It isn't bad, that's sure.
>
>>I run on a 64 bit machine my program goes twice as fast.  What does _yours_ get
>>when you do so?  If you want to write for 32 bit machines (past and present)
>>that's fine.  I choose to write for the future.  And the future _is_ 64 bits.
>>Just wait 5 years and see if you can find a 32 bit machine left.
>
>Wait 5 years and you need a better evaluation.
>Try evaluating with bitboards then. Bitboards is then past.

I'll make a wager with you.  That I am still using bitboards in 5 years, and
that I'll still win more games than I lose when we play.  I see _no_ sort of
asymptotic limit.  Quite the contrary I see lots of speed coming with 64 bit
machines.


>
>If i would use 64 bits my program goes way faster than it is now,
>assuming that within 5 years there are some very good 64 bits compilers,
>which is not a bad assumption. There is still no good PII/PRO
>compiler right now.
>
>And about the speed difference, our speed difference is currently more
>than a factor 2 at a 32 bits machine when you would do the same in evaluation.
>
>It makes no sense slowing down a program more than 2 times in order to
>get it 2 times faster at a 64 bits machine, that's how i see it.
>
>>>>>>The "problems" (rumors?) you allude to in the post only show that you have
>>>>>>*not* understood what bitboards are capable of and how to handle them
>>>>>>effciently. You are of course free to criticize Bob's coding style and his
>>>>>>design decisions in "Crafty". However, only inexperienced and narrow-minded
>>>>>>people draw general conclusions from just a single example (i.e. the source
>>>>>
>>>>>Now comon comon, who is narrowminded here.
>>>>>I've investigated bitboards just as close as you did.
>>>>>
>>>>>Note that i don't have a 64 bits processor, and even if i did, then still
>>>>>i never might sell a 64 bits program, where never is defined as next
>>>>>few years.
>>>>>
>>>>
>>>>I doubt you have 'investigated bitboards just as close as you did' (you talking
>>>>to Ernst here).  It took me a _long_ time to become competent with them.  It
>>>
>>>We don't doubt your insight in bitboards Bob. Please don't doubt my insight
>>>either. That's a bad way of writing emails.
>>
>>I know my programming skills.  And I know it took me several months to become
>>'bitmap-friendly'.  I doubt you are better than I am.  No reason to doubt you
>>are not as good, of course, but better?  I doubt it.  Which means you haven't
>>'studied' them until you have _done_ them.  And I'm not talking about a simple
>>pawn structure bitmap.  We had a single bitmap in cray blitz to detect in check
>>stuff.  That is _not_ the same thing.
>
>The time you needed to become 'bitmap' friendly says enough about
>friendlyness of developping a program using bitmaps only.

So?  The time needed to become a "parallel searcher" is even longer...  But
it is still worthwhile.  Difficult or not.  I don't worry about 'easy'.  I
worry about 'viable'
.



>
>My bitmap is different from crafty bitmap. dunno how you have implemented
>it cray blitz. Don't have the source of it. Would love to have that source
>code: diep@xs4all.nl. you told you spreaded it at the cray computer mailing
>list. I'm not at that list, please email it to: diep@xs4all.nl
>
>>>Don't forget i use a single bitboard for speeding up my pawn structure
>>>evaluation. I take the best of all worlds. That's why
>>>in the rest or the program i don't use bitboards.
>
>
>>Remember, the man who is blind can't appreciate art.  Maybe you just don't
>
>Don't compare datastructure with art too much.


I wasn't.  You haven't tried bitmaps to the point where you feel comfortable
with them. So of course you are going to see problems where I only see
solutions.



>
>If you say: my datastructure is
>a very original one, i invented it myselve, and i don't care whether it's
>faster or slower, then i respect you completely.

I didn't invent them, they are 30 years old.


>
>If you say: "my datastructure is a very good one to represent a chessboard,
>just wait till within 5 years you also have a 64 bits machine then it rocks,"
>then i completely disagree with you!
>
>I also wonder: when a 64 bits machine is really that fast to you, why
>buy a quad Xeon for a big bunch of money instead of a quad 21164 533Mhz alpha,
>which is only 300 dollar a processor x 4 = 1200 dollar for processors
>where quad Xeon is a big multiple of it.

Do the math.  There are no quad alphas available, except thru Digital, and
I can't buy them for $10,000.00...  not even close.  I have _run_ on multiple
alphas.  With any luck, you'll be on the wrong end of one in Paderborn.



>
>>understand them well enough yet.  They are an alternative data structure,
>>nothing more or less.  Anything you can do I can do.  Anything I can do, you
>>can do.  No more, no less.
>
>Not at the same speed, and also you need way more time and need
>way more time writing code, when developing things using bitboards!
>


absolutely not.  The learning curve is the only difference.  Once used to
them, coding is _not_ hard... in fact it is often easier, because pattern
matching is easier/faster.



>>>>did _not" come by just taking a casual look.  I had to start writing code and
>>>>spend a year figuring out how to do everything I wanted to do efficiently.  An
>>>>example:  I teach a programming language course and cover some esoteric things
>>>>like APL, prolog, etc.  And when I assign a program in (say) APL to students,
>>>>I get "C" programs back.  Syntactically APL.  But algorithmically C.  That is
>>>>what happens with bitboards for a good while, _until_ the writer becomes used to
>>>>'thinking bitboards'. (or thinking parallel search or whatever).  These topics
>>>>are _not_ something you spend a day or two thinking about and then say "been
>>>>there, done that, it's no good..."
>>>
>>>I'm not busy one day with my program either. We don't doubt your experience,
>>>that still however doesn't give the objective reader clear numbers about
>>>WHETHER you can do it.
>>>
>>>It is very simple. Just dig up your old code and test it at your Xeon.
>>>matter of a few minutes to run the old code imho and with your deep insight
>>>easy to initialize the demanded arrays, then please count speed of it,
>>
>>
>>what 'old code'?  incremental versions of crafty?  why?
>
>>>leave the code in crafty as a 'testroutine' and we know.
>>>
>>>>
>>>>>>code of "Crafty"). There are many ways to do things differently with bitboards
>>>>>>and rotated bitboards than in "Crafty". Overall, your post mainly elaborates
>>>>>>on some of Bob's design decisions which you deem non-optimal.
>>>>>>
>>>>>>The only real disadvantage of bitboards at the moment is the obvious penalty
>>>>>>of 64-bit integers on 32-bit CPUs which comes hand in hand with unfortunately
>>>>>>limited compilers and programming environments.
>>>>>
>>>>>I'm waiting for your implementation of attacktables using bitboards.
>>>>>
>>>>>I'll even be friendly to you allowing it to do at a 64 bits machine,
>>>>>which next 5 years is probably out of wallet of 99% of the people,
>>>>>but then please don't use macros, but write it all out. Macros confuse people,
>>>>>giving them the idea that it is simple using bitboards, where actually
>>>>>every macro is a kind of hacked thing in bitboards imho. Please post again
>>>>>how you count the number of bits in a bitboard?
>>>>>
>>>>
>>>>
>>>>well known:
>>>>
>>>> int c;
>>>> while (bitboard) {
>>>>   c++;
>>>>   bitboard=bitboard&(bitboard-1);
>>>> }
>>>> return(c);
>>>
>>>>is best for small numbers of bits.
>>>
>>>>Or on intel machines, even better is this piece of code:
>>>
>>>>PopCnt:
>>>>        movl    4(%esp), %ecx
>>>>        xorl    %eax, %eax
>>>>        testl   %ecx, %ecx
>>>>        jz      l1
>>>>l0:
>>>>        leal    -1(%ecx), %edx
>>>>        incl    %eax
>>>>        andl    %edx, %ecx
>>>>        jnz     l0
>>>>l1:
>>>>        movl    8(%esp), %ecx
>>>>        testl   %ecx, %ecx
>>>>        jz      l3
>>>>l2:
>>>>        leal    -1(%ecx), %edx
>>>>        incl    %eax
>>>>        andl    %edx, %ecx
>>>>        jnz     l2
>>>>l3:
>>>>        ret
>>>>
>>>>that is _very_ fast...
>>>
>>>
>>>>>At a PII-450 diep generates 250k attacktables a second in the position
>>>>>after openingsposition+1.e4,e5 2.d4,d5
>>>>>
>>>>>I'm only using C code. Not a single assembler instruction is in my code,
>>>>>as far as i know your bitboards are written in C? In that case
>>>>>there is no unfair comparision.
>>>>>
>>>>>An attacktable is a table in my program of
>>>>>int attacktable [2][64];
>>>>>First i clean the attacktable, then i store for every piece all attacks.
>>>>>
>>>>>that includes
>>>>>  a) piece that attacks      (what square it is from)
>>>>>  b) number of attackers     (clear i guess)
>>>>>  c) kind of piece           (pawn,knight,bishop ...)
>>>>>
>>>>>I need this to evaluate in my program, as it evaluates also 'relationships'
>>>>>between pieces and mobility is also depending upon it.
>>>>>
>>>>>My mobility goes like this: i loop over all squares of a piece, including
>>>>>xray squares and then find out how important control over such a square is
>>>>>depending on all kind of chessknowlege where info from attacktable is
>>>>>used using lookup of attacktable[c][sq] where sq are all squares that
>>>>>need to be considered. And as i use this 'attacktable' info a lot,
>>>>>i therefore store it in this a simple way in [2][64] array.
>>>>>Every time i use something out of my attacktable i definitely not
>>>>>am gonna add all values of the different bitboards *just* to get the
>>>>>number of attackers. That would slow down my program *considerable*.
>>>>>
>>>>>Now a problem using attacktables with mobility is:
>>>>>if you do all squares at once (otherwise bitboards not rewarding),
>>>>>that is kind of tough, considering that values are different
>>>>>for every square. So where i simply count a+b+c, where probably
>>>>>a != b != c then you run into problems, as for a range of -50 to +50
>>>>>you need to do 101 bitboard counts.
>>>>>
>>>>
>>>>
>>>>I've already pointed out that this is false.  Here's how to do this _fast_ on
>>>>the cray:
>>>
>>>>first compute the bitmap for the squares a piece on "sq" attacks.  This produces
>>>>a set of zeros and ones.  Then take a static const array sqval[64] where you
>>>>have filled in the value of attacking each square.  IE values for e4 d5 and
>>>>so forth, and _every_ value can be different.
>>>
>>>>ON the cray, we use the vector load instruction, but from this sqval[] array
>>>>we only load the values with 1's set in the according bitmap.  We sum these
>>>>and we are done, and it takes roughly 20 clock cycles to get the first value
>>>
>>>The problem is that according bitmaps. If value can range from -50 to 50
>>>then you need to do this 101 times, because you need a bitcount for
>>>every value.
>>
>>NO NO NO.  on the cray, I load the attack bitmap.  One instruction.  I then
>>do a vector load to grab the values for mobility (different value for each
>>square if you want.)  I then sum those values.  I do the *whole* thing in
>>under 50 clock cycles.  *whole thing*.  No loops of any kind either...  And I
>>do this while I am doing other things as well... so I don't even notice the
>>'expense'...
>
>Please don't confuse the cray-many-million-dollar-computer with nowadays
>hardware.



_You_ are the one that keeps saying "this can not be done with bitmaps."  I
simply showed that not only 'can' it be done with bitmaps, but on hardware that
_really_ supports bitmaps it can be done so much faster than what you are doing
that it isn't even funny to think about.


>
>
>>>>from the sq array, and assuming you have 20 squares you attack, the entire
>>>>thing will complete in about 50 clock ticks _total_.
>>>>Not very inefficient.  So things you think are hard are actually trivial on the
>>>>right hardware...
>>>
>>>A rude way of mobility can be done quickly. NO doubt. Please don't talk
>>>about the cray. We're talking general audience here. General audience
>>>doesn't have a cray nor alpha 533 and also will not buy it.
>>>
>>>In that respect crafty does a good job as it can be compiled at all
>>>platforms.
>>>
>>>>>I have given you exact data here.
>>>>>2 challenges are posted, from which one is very clear.
>>>>>The second one, mobility is simply impossible, unless you use the same
>>>>>code i use, so you get every square out of the bitboard and then run
>>>>>through the evaluation code, which means that the loop used over the squares
>>>>>is different, but not the evaluation code
>>>>
>>>>better rethink that 'impossible' as I just did it and I did it many times
>>>>faster than you can.  On the cray of course.  Or on the intel i860/960...
>>>>which are commodity micros.
>>>
>>>I fear you didn't get the point, or simply are hiding it.
>>>
>>>I have a big range of values here and a big bunch of chessknowledge
>>>which decides which square gets what value, and that chessknowledge
>>>is based on piece relationships and attackers and a lot of other things.
>>>
>>>Please don't compare that with a rude "how many moves can i make" count.
>
>>read the above.  Generate the array of scores for each square.  I can
>>extract and add them faster than you can.
>
>At the cray computer perhaps dear sir,
>we were talking rotated bitboards as you use at the PII.
>
>Please reanswer it for PII.

we weren't talking about the PII.  We were talking about bitmaps period.  I
believe you will find that 64 bit machines 5 years from now have the same
sort of vector capabilities as crays do now.  There are already machines that
do.




>
>>>I don't doubt we all generate that quickly. Actually in DIEP i generate
>>>15 million move generations a second after 1.e4,e5 2.d4,d5 at a PII-450
>
>
>>>>>Assuming in all cases lossless code,
>>>
>>>This was my point. Your code is lossy to mine.
>>
>>
>>
>>what is 'lossy' code?  concept makes no sense to me in this context.
>
>he he he he he,
>I thought you gave lessons as a professor in things like that...
>


I do.  I teach about "lossless and lossy joins in database theory."  But
that implies 'losing something'.  I lose _nothing_.




>>
>>>
>>>>>Greetings,
>>>>>Vincent
>>>>>
>>>>>>=Ernst=
>>>>>
>>>>>Please guys don't moderate Ernst too tough, if you do, then i dunno
>>>>>what Heinz thinks of himselve.
Re: BitBoard challenge to Ernst A. Heinz and bitboard fans - no moderation Vincent Diepeveen 08:40:27 01/18/99
- Re: BitBoard challenge to Ernst A. Heinz and bitboard fans - no moderation Robert Hyatt 12:59:05 01/18/99
- Re: BitBoard challenge to Ernst A. Heinz and bitboard fans - no moderation Larry Griffiths 09:37:07 01/18/99
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.