Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: BitBoard challenge to Ernst A. Heinz and bitboard fans - no moderation

Author: Vincent Diepeveen

Date: 15:35:15 01/17/99

Go up one level in this thread


On January 17, 1999 at 00:22:38, Robert Hyatt wrote:

>On January 16, 1999 at 14:37:49, Vincent Diepeveen wrote:
>
>>On January 15, 1999 at 22:09:04, Robert Hyatt wrote:
>>
>>>On January 15, 1999 at 21:26:17, Vincent Diepeveen wrote:
>>>
>>>>On January 13, 1999 at 07:19:54, Ernst A. Heinz wrote:
>>>>
>>>>>On January 13, 1999 at 06:13:30, Vincent Diepeveen wrote:
>>>>>>
>>>>>> [...]
>>>>>
>>>>>Vincent,
>>>>>
>>>>>As usual you spread your own fallible impressions and interpretations
>>>>>in a style that suggests they are proven facts. But indeed they are *not*.
>>>>
>>>>I have not seen a single function using bitboards making attacktables
>>>>for every square indeed.
>>>>In crafty there is only this: a function that gives
>>>>soemthing for a single square.
>>>>
>>>>I thought a long time about it.
>>>
>>>Vincent this is simply wrong, and your reasoning is badly flawed.  In early
>>>versions of crafty I did _full_ attack bitmaps, so that at any point I had
>>>the bitmaps for the squares attacked from a specific square, and bitmaps for
>>>every square showing which squares were attacking _it_.  I kept these
>>>incrementally updated.
>>
>>I talk in numbers Bob, not in "i have done something in the past".
>>You generate a bunch of different bitboards. Cool with me.
>>
>>A) You miss number of attackers
>>B) you miss where the attacker comes from
>
>Questions:  (1) have you read chess skill in man and machine? (2) Have you
>read the comments in main.c (crafty)?  If the answer to either question is
>yes, then your question makes no sense.  In 1976 chess 4.0 had both of the
>things you ask for.  They kept an array of bitmaps called attacks_from[sq] that
>had 1 bits set for each square the piece on sq attacks (this is like what I do
>except I compute it on the fly faster than they incrementally updated this).

We are not talking about doing things incrementally here or on the fly.

Just bare speed of it.

>They also kept an array of bitmaps called attacks_to[sq] that had a one bit
>set for each square that attacks sq.  If you PopCnt(attacks_to[sq]) you get
>the number of attackers for that square.  If you want black or white attackers
>you first AND to exclude opposing pieces.  So bitmaps give you A and B if you
>want them.  Chess 4.x had them.  Early crafty versions had them.  So I fail to
>see why you keep saying this can't be done?  IE just because I don't do it
>doesn't mean I can't do it.  It means I don't _want_ to do it at present.

You can do it but only very very very slow. I'm sure you would do it if
it was fast in your datastructure. Gives you a big bunch of powerful info.

So saying now "I don't wanna do it and therefore shut up" has nothing
to do with this.

It's like we want to fly therefore i design a new wing, as
a new engines got developed you say: "i'm not interested in that
engine, my wing is perfect".

Where i adapt and say: "i want to use that new engine for my airplane
so i design a new wing that works"

>
>
>>
>>Main thing is that i miss are numbers. Things can get
>>measured you know.
>
>
>both chess 4.x and crafty (early versions) had this.  I would think it could
>be measured since we both did it.
>
>
>
>
>>
>>>I later decided that the 'latter' bitboard wasn't being used much and I got
>>>rid of it as it did have a cost.  But that cost was some 10% overall speed,
>>>which I could instantly get back if I wanted to use such bitmaps.
>>
>>>your bad logic seems to be "if it isn't in Crafty now, it isn't practical
>>>or it can't be done."  On the contrary, if it isn't done in crafty now, it
>>>is because I don't need (or don't want) the information at present.  Nothing
>>>says others don't / can't do it...
>>
>>This has to do with how your tables look like. You need to
>>add all your bitmaps before you can do something useful with it.
>>That is a mainpoint. Please don't get personal, i am not insulting you
>>personal. I'm just talking about
>
>
>So, why can't I add to my bitmaps if I need to?  I don't see why that is
>a problem.  I assume you have added to your data structures over the past
>couple of years?  I can modify mine just as easily...

It shows that bitboards are not useful in many aspects if you plan
to expand and improve evaluation (which i consider weakest chain of
programs) and it was already clear, although you are very good
in writing info around functions, that it is not an easy way of working
with datastructure.

This 'way of working is tough with bitboards' is of course an opinion,
which we probably disagree in, although it took years for you and ernst
just to be able to work with it as you wrote in previous msg which shows my
point, where the simple way of representing things everyone can work at once
with.

Now apart from this opinion we have bare speed facts.

Suppose everyone would have a computer C where a datastructure X is way
way faster for your program than a datastructure Y.

Now without further info i would simply use data structure X, no questions about
it.



>>a) how difficult bitboards are to work with
>>b) that you need 64 bits machines for it which are very expensive
>>   unless you want to buy an outdated 21164 533Mhz chip, with say a 1 gig
>>   harddisk in it (this advertisement here i see in this magazine offers
>>   it for the prize of 6000 dutch guilder (3500 dollar) a 1 GIG harddisk in
>>   it!! and 64 MB RAM).
>>
>>c) that after the big effort of a and b it is not possible to do everything
>>   quickly with it and that it is definitely not faster.
>
>Maybe it isn't faster for _you_ to do things faster.  My speed isn't bad.  When

It isn't bad, that's sure.

>I run on a 64 bit machine my program goes twice as fast.  What does _yours_ get
>when you do so?  If you want to write for 32 bit machines (past and present)
>that's fine.  I choose to write for the future.  And the future _is_ 64 bits.
>Just wait 5 years and see if you can find a 32 bit machine left.

Wait 5 years and you need a better evaluation.
Try evaluating with bitboards then. Bitboards is then past.

If i would use 64 bits my program goes way faster than it is now,
assuming that within 5 years there are some very good 64 bits compilers,
which is not a bad assumption. There is still no good PII/PRO
compiler right now.

And about the speed difference, our speed difference is currently more
than a factor 2 at a 32 bits machine when you would do the same in evaluation.

It makes no sense slowing down a program more than 2 times in order to
get it 2 times faster at a 64 bits machine, that's how i see it.

>>>>>The "problems" (rumors?) you allude to in the post only show that you have
>>>>>*not* understood what bitboards are capable of and how to handle them
>>>>>effciently. You are of course free to criticize Bob's coding style and his
>>>>>design decisions in "Crafty". However, only inexperienced and narrow-minded
>>>>>people draw general conclusions from just a single example (i.e. the source
>>>>
>>>>Now comon comon, who is narrowminded here.
>>>>I've investigated bitboards just as close as you did.
>>>>
>>>>Note that i don't have a 64 bits processor, and even if i did, then still
>>>>i never might sell a 64 bits program, where never is defined as next
>>>>few years.
>>>>
>>>
>>>I doubt you have 'investigated bitboards just as close as you did' (you talking
>>>to Ernst here).  It took me a _long_ time to become competent with them.  It
>>
>>We don't doubt your insight in bitboards Bob. Please don't doubt my insight
>>either. That's a bad way of writing emails.
>
>I know my programming skills.  And I know it took me several months to become
>'bitmap-friendly'.  I doubt you are better than I am.  No reason to doubt you
>are not as good, of course, but better?  I doubt it.  Which means you haven't
>'studied' them until you have _done_ them.  And I'm not talking about a simple
>pawn structure bitmap.  We had a single bitmap in cray blitz to detect in check
>stuff.  That is _not_ the same thing.

The time you needed to become 'bitmap' friendly says enough about
friendlyness of developping a program using bitmaps only.

My bitmap is different from crafty bitmap. dunno how you have implemented
it cray blitz. Don't have the source of it. Would love to have that source
code: diep@xs4all.nl. you told you spreaded it at the cray computer mailing
list. I'm not at that list, please email it to: diep@xs4all.nl

>>Don't forget i use a single bitboard for speeding up my pawn structure
>>evaluation. I take the best of all worlds. That's why
>>in the rest or the program i don't use bitboards.


>Remember, the man who is blind can't appreciate art.  Maybe you just don't

Don't compare datastructure with art too much.

If you say: my datastructure is
a very original one, i invented it myselve, and i don't care whether it's
faster or slower, then i respect you completely.

If you say: "my datastructure is a very good one to represent a chessboard,
just wait till within 5 years you also have a 64 bits machine then it rocks,"
then i completely disagree with you!

I also wonder: when a 64 bits machine is really that fast to you, why
buy a quad Xeon for a big bunch of money instead of a quad 21164 533Mhz alpha,
which is only 300 dollar a processor x 4 = 1200 dollar for processors
where quad Xeon is a big multiple of it.

>understand them well enough yet.  They are an alternative data structure,
>nothing more or less.  Anything you can do I can do.  Anything I can do, you
>can do.  No more, no less.

Not at the same speed, and also you need way more time and need
way more time writing code, when developing things using bitboards!

>>>did _not" come by just taking a casual look.  I had to start writing code and
>>>spend a year figuring out how to do everything I wanted to do efficiently.  An
>>>example:  I teach a programming language course and cover some esoteric things
>>>like APL, prolog, etc.  And when I assign a program in (say) APL to students,
>>>I get "C" programs back.  Syntactically APL.  But algorithmically C.  That is
>>>what happens with bitboards for a good while, _until_ the writer becomes used to
>>>'thinking bitboards'. (or thinking parallel search or whatever).  These topics
>>>are _not_ something you spend a day or two thinking about and then say "been
>>>there, done that, it's no good..."
>>
>>I'm not busy one day with my program either. We don't doubt your experience,
>>that still however doesn't give the objective reader clear numbers about
>>WHETHER you can do it.
>>
>>It is very simple. Just dig up your old code and test it at your Xeon.
>>matter of a few minutes to run the old code imho and with your deep insight
>>easy to initialize the demanded arrays, then please count speed of it,
>
>
>what 'old code'?  incremental versions of crafty?  why?

>>leave the code in crafty as a 'testroutine' and we know.
>>
>>>
>>>>>code of "Crafty"). There are many ways to do things differently with bitboards
>>>>>and rotated bitboards than in "Crafty". Overall, your post mainly elaborates
>>>>>on some of Bob's design decisions which you deem non-optimal.
>>>>>
>>>>>The only real disadvantage of bitboards at the moment is the obvious penalty
>>>>>of 64-bit integers on 32-bit CPUs which comes hand in hand with unfortunately
>>>>>limited compilers and programming environments.
>>>>
>>>>I'm waiting for your implementation of attacktables using bitboards.
>>>>
>>>>I'll even be friendly to you allowing it to do at a 64 bits machine,
>>>>which next 5 years is probably out of wallet of 99% of the people,
>>>>but then please don't use macros, but write it all out. Macros confuse people,
>>>>giving them the idea that it is simple using bitboards, where actually
>>>>every macro is a kind of hacked thing in bitboards imho. Please post again
>>>>how you count the number of bits in a bitboard?
>>>>
>>>
>>>
>>>well known:
>>>
>>> int c;
>>> while (bitboard) {
>>>   c++;
>>>   bitboard=bitboard&(bitboard-1);
>>> }
>>> return(c);
>>
>>>is best for small numbers of bits.
>>
>>>Or on intel machines, even better is this piece of code:
>>
>>>PopCnt:
>>>        movl    4(%esp), %ecx
>>>        xorl    %eax, %eax
>>>        testl   %ecx, %ecx
>>>        jz      l1
>>>l0:
>>>        leal    -1(%ecx), %edx
>>>        incl    %eax
>>>        andl    %edx, %ecx
>>>        jnz     l0
>>>l1:
>>>        movl    8(%esp), %ecx
>>>        testl   %ecx, %ecx
>>>        jz      l3
>>>l2:
>>>        leal    -1(%ecx), %edx
>>>        incl    %eax
>>>        andl    %edx, %ecx
>>>        jnz     l2
>>>l3:
>>>        ret
>>>
>>>that is _very_ fast...
>>
>>
>>>>At a PII-450 diep generates 250k attacktables a second in the position
>>>>after openingsposition+1.e4,e5 2.d4,d5
>>>>
>>>>I'm only using C code. Not a single assembler instruction is in my code,
>>>>as far as i know your bitboards are written in C? In that case
>>>>there is no unfair comparision.
>>>>
>>>>An attacktable is a table in my program of
>>>>int attacktable [2][64];
>>>>First i clean the attacktable, then i store for every piece all attacks.
>>>>
>>>>that includes
>>>>  a) piece that attacks      (what square it is from)
>>>>  b) number of attackers     (clear i guess)
>>>>  c) kind of piece           (pawn,knight,bishop ...)
>>>>
>>>>I need this to evaluate in my program, as it evaluates also 'relationships'
>>>>between pieces and mobility is also depending upon it.
>>>>
>>>>My mobility goes like this: i loop over all squares of a piece, including
>>>>xray squares and then find out how important control over such a square is
>>>>depending on all kind of chessknowlege where info from attacktable is
>>>>used using lookup of attacktable[c][sq] where sq are all squares that
>>>>need to be considered. And as i use this 'attacktable' info a lot,
>>>>i therefore store it in this a simple way in [2][64] array.
>>>>Every time i use something out of my attacktable i definitely not
>>>>am gonna add all values of the different bitboards *just* to get the
>>>>number of attackers. That would slow down my program *considerable*.
>>>>
>>>>Now a problem using attacktables with mobility is:
>>>>if you do all squares at once (otherwise bitboards not rewarding),
>>>>that is kind of tough, considering that values are different
>>>>for every square. So where i simply count a+b+c, where probably
>>>>a != b != c then you run into problems, as for a range of -50 to +50
>>>>you need to do 101 bitboard counts.
>>>>
>>>
>>>
>>>I've already pointed out that this is false.  Here's how to do this _fast_ on
>>>the cray:
>>
>>>first compute the bitmap for the squares a piece on "sq" attacks.  This produces
>>>a set of zeros and ones.  Then take a static const array sqval[64] where you
>>>have filled in the value of attacking each square.  IE values for e4 d5 and
>>>so forth, and _every_ value can be different.
>>
>>>ON the cray, we use the vector load instruction, but from this sqval[] array
>>>we only load the values with 1's set in the according bitmap.  We sum these
>>>and we are done, and it takes roughly 20 clock cycles to get the first value
>>
>>The problem is that according bitmaps. If value can range from -50 to 50
>>then you need to do this 101 times, because you need a bitcount for
>>every value.
>
>NO NO NO.  on the cray, I load the attack bitmap.  One instruction.  I then
>do a vector load to grab the values for mobility (different value for each
>square if you want.)  I then sum those values.  I do the *whole* thing in
>under 50 clock cycles.  *whole thing*.  No loops of any kind either...  And I
>do this while I am doing other things as well... so I don't even notice the
>'expense'...

Please don't confuse the cray-many-million-dollar-computer with nowadays
hardware.


>>>from the sq array, and assuming you have 20 squares you attack, the entire
>>>thing will complete in about 50 clock ticks _total_.
>>>Not very inefficient.  So things you think are hard are actually trivial on the
>>>right hardware...
>>
>>A rude way of mobility can be done quickly. NO doubt. Please don't talk
>>about the cray. We're talking general audience here. General audience
>>doesn't have a cray nor alpha 533 and also will not buy it.
>>
>>In that respect crafty does a good job as it can be compiled at all
>>platforms.
>>
>>>>I have given you exact data here.
>>>>2 challenges are posted, from which one is very clear.
>>>>The second one, mobility is simply impossible, unless you use the same
>>>>code i use, so you get every square out of the bitboard and then run
>>>>through the evaluation code, which means that the loop used over the squares
>>>>is different, but not the evaluation code
>>>
>>>better rethink that 'impossible' as I just did it and I did it many times
>>>faster than you can.  On the cray of course.  Or on the intel i860/960...
>>>which are commodity micros.
>>
>>I fear you didn't get the point, or simply are hiding it.
>>
>>I have a big range of values here and a big bunch of chessknowledge
>>which decides which square gets what value, and that chessknowledge
>>is based on piece relationships and attackers and a lot of other things.
>>
>>Please don't compare that with a rude "how many moves can i make" count.

>read the above.  Generate the array of scores for each square.  I can
>extract and add them faster than you can.

At the cray computer perhaps dear sir,
we were talking rotated bitboards as you use at the PII.

Please reanswer it for PII.

>>I don't doubt we all generate that quickly. Actually in DIEP i generate
>>15 million move generations a second after 1.e4,e5 2.d4,d5 at a PII-450


>>>>Assuming in all cases lossless code,
>>
>>This was my point. Your code is lossy to mine.
>
>
>
>what is 'lossy' code?  concept makes no sense to me in this context.

he he he he he,
I thought you gave lessons as a professor in things like that...

>
>>
>>>>Greetings,
>>>>Vincent
>>>>
>>>>>=Ernst=
>>>>
>>>>Please guys don't moderate Ernst too tough, if you do, then i dunno
>>>>what Heinz thinks of himselve.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.