Computer Chess Club Archives


Subject: Re: Bitboards and Evaluation

Author: Robert Hyatt

Date: 13:49:35 05/31/01

On May 31, 2001 at 12:17:50, Vincent Diepeveen wrote:

>On May 31, 2001 at 10:37:25, Robert Hyatt wrote:
>
>>On May 31, 2001 at 08:09:52, Vincent Diepeveen wrote:
>>
>>>Hello, I had in mind comparing the assembly output of two things,
>>>but now that I look in Crafty 18.9 I see the pattern is not even
>>>in Crafty 18.9, though it's a basic pattern, so it's hard
>>>to compare things.
>>>
>>>But I'm pretty amazed that everything is
>>>getting referenced as
>>>   tree->pos.board[sq]
>>>
>>>If I were Roman Catholic I would now make a cross and say
>>>"lucky I'm multiprocessor",
>>>
>>>because what I would be using there is
>>>   board[sq]
>>>
>>>And I'm using that everywhere in my evaluation. Bob,
>>>however, smartly got rid of the thing by using a define,
>>>which nonetheless translates to it; PcOnSq() it's called.
>>>
>>>But in the assembly code you still see it!
>>>
>>>Also what I see is the general problem of bitboards:
>>>  if( (something[indexed]&bitmask) == pattern )
>>>
>>>Where I can do
>>>  if( something[indexed] == pattern )
>>>
>>>So I save an AND there.
>>
>>Come on.  Show me your one "if" to see if a pawn is isolated.  Or if it is
>>passed.  Or if it is passed and can't be caught by the opposing king.  Or
>>if your two rooks or rook/queen are connected on the 7th rank...  or if you
>>have the "absolute 7th"...
>
>I'm doing that faster too. Yes, with something that you would call
>a bitboard.
>
>But I do it in 32 bits.
>
>Nope, I'm not a GNU program, so I don't post source here.
>
>>you are comparing apples to oranges...
>
>>>
>>>Also I'm pretty amazed by 'signed char bval_w[64]'.
>>>
>>>First of all, in DIEP I am happy to announce that I threw out all
>>>8-bit arrays.
>>>
>>>I didn't know Crafty is still using 8-bit arrays!
>>>I thought it was a mixture of 32 bits with 64 bits!
>>
>>
>>Vincent, I suspect there is a _lot_ you don't know.  :)  I use them because
>>they are faster on the Intel hardware.  The test to prove it is quite simple.
>
>Yes, 8 bits is faster; I think I mentioned several times that
>I slowed down my program by removing all 8-bit arrays.
>
>For the same reason 32 bits is faster than 64 bits!
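
Since the isolated-pawn question came up above: in bitboard form that test
really is a single AND and a compare against zero.  Here is a minimal sketch;
the names (white_pawns, adjacent_file_mask, pawn_is_isolated) are hypothetical
illustrations, not the actual identifiers in either program.

  typedef unsigned long long bitboard;

  /* Hypothetical data: one bit per white pawn, plus a mask of the two
     neighbouring files for each file 0..7.                             */
  static bitboard white_pawns;
  static bitboard adjacent_file_mask[8];

  /* Isolated-pawn test: no friendly pawn on either adjacent file.      */
  static int pawn_is_isolated(int file) {
    return (white_pawns & adjacent_file_mask[file]) == 0;
  }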



Nope, here your reasoning is flawed and _dead wrong_.

I use 8 bit values because I only need 8 bit values.  The Intel CPU
internally passes nothing but 32 and 64 bit values around.  My 8 bit
values are cheaper, _only_ because I am being more efficient in cache.

In the 64 bit stuff I do, I _need_ 64 bits.  And the Alpha/IA64 is moving
64 bits around at a time.  A perfect match.  If I need 8 bit values, it might
be a bit faster to use only 8 bit values on the alpha, as it would be more
cache-friendly.  But that is the only gain.  There is definitely a "loss"
by gating around 64 bit values internally when they only have 8 bits of useful
information in them.

That is a loss in performance, period.

As I have said many times, you _must_ look at the architecture you are working
on, and try to figure out how to use it to help you run faster.  Not just look
at it to figure out how to run your program on it.  If you take that approach
you would _never_ get any performance out of a Cray.
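
To put numbers on the cache argument above, here is a minimal, compilable
sketch.  The table names and contents are placeholders, not Crafty's real
tables; the point is only that a signed char piece/square table spans 64
bytes while the same table as int spans 256 bytes, so the narrow one touches
far fewer cache lines even though the CPU widens each element when loading it.

  #include <stdio.h>

  /* Hypothetical bishop piece/square tables, indexed by square 0..63.
     The contents are placeholders; only the sizes matter here.        */
  static const signed char bval_w_char[64] = {0};   /*  64 bytes total */
  static const int         bval_w_int[64]  = {0};   /* 256 bytes total */

  int main(void) {
    /* With the 32-byte cache lines of that era, 64 bytes is 2 lines
       and 256 bytes is 8 lines; that footprint is the entire gain.    */
    printf("char table: %u bytes, int table: %u bytes\n",
           (unsigned) sizeof bval_w_char, (unsigned) sizeof bval_w_int);
    return 0;
  }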


>
>>64 bytes is 2 cache lines.  64 words is 8 cache lines.  It's just that simple.
>
>>>The second thing I wonder about is why this is getting done *every*
>>>evaluation. bval_w gives a piece/square table value which is constant
>>>for bishops.
>>>
>>>You can do that incrementally in makemove/unmakemove!!
>>
>>
>>Have you read my reasons for doing this in the past?  Apparently not.  So
>>one more time:
>>
>>"I do everything in evaluate.c to make it easy to change the code.  If I do
>>things incrementally, then I have to modify _two_ pieces of code when I change
>>it.  Modifying one thing is easier.  I'm worried less about speed than I am
>>about quality.  I don't know, for example, that the bishop piece/square table
>>will always exist.  In fact, I am pretty sure it won't.
>
>Why use so much inline assembly then, if quality is what you care for,
>and/or mix 8 bits with 32 bits and 64 bits, if quality is what you care
>for?

There is not "so much inline assembly".  All that has been done is about 500
lines of asm to do the 64 bit functions like FirstOne() and LastOne(), because
the compiler doesn't know how to use the hardware instructions for whatever
reason.  I don't call that "so much".  And as I said before, it doesn't make
a _huge_ speed difference anyway.
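
For reference, here is a portable sketch of what a FirstOne()-style routine
computes.  The loop, the name first_one, and the convention of returning the
index of the lowest set bit (with 64 for an empty board) are illustrative
assumptions only; the real routines are hand-written asm built around the
hardware instructions that the compiler will not emit on its own.

  /* Illustrative only: returns the index of the lowest set bit of b,
     or 64 if b is zero.  The bit-numbering convention is an assumption,
     and the real routines use a hardware bit scan instead of a loop.   */
  static int first_one(unsigned long long b) {
    int i;
    if (b == 0)
      return 64;
    for (i = 0; !(b & 1ULL); i++)
      b >>= 1;
    return i;
  }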




>
>>>
>>>This is a pure waste of system time!
>>>
>>>Note in DIEP I would do that in my makemove as:
>>>  int *p;
>>>  global int tablescore;
>>>  p = psq[piece];
>>>
>>>  tablescore += p[to_square]-p[from_square];
>>>
>>>Crafty does it every evaluation!!!
>>>
>>>Bob, something to improve in your evaluation!
>>
>>Nope.  One day as pieces of the evaluation become "static" and don't change
>>anymore, some of it _might_ get done incrementally.  But in endgames, your
>>"idea" is not so good.  Suppose you move your bishop twice in the search path?
>>You evaluate that piece/square entry _twice_.  I evaluate it _once_.
>
>This is not true.
>
>Suppose we have
>  Bd2 a4 Bc1 a3 Kg2
>
>Now search lines are like:
>  Bd2 a4 Bc1 a3 Kg2
>  Bd2 a4 Bc1 a3 Kg1
>  Bd2 a4 Bc1 a3 Kg3
>  Bd2 a4 Bc1 a3 Kh3
>  Bd2 a4 Bc1 a3 Kh2
>  Bd2 a4 Bc1 a3 Kh1
>  Bd2 a4 Bc1 a3 Kf1
>  Bd2 a4 Bc1 a3 Kf2
>
>So if you can do something as easy as PSQ incrementally,
>that's a hell of a lot faster.
>



Just do the math.  I gave the position for Fine #70.  Anybody can search
to a depth of 20 plies almost instantly.  That is 20 king moves/evals by
you, along _any_ path.  I will do 2 king evals at the tips of any path.

That was my point.  It is not _nearly_ so bad as you think.  Certainly not
even 20% faster in the best case, as that won't eliminate 40% of my
evaluation code; piece/square tables are not even 1% of what I do there.
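
For the record, the two approaches under discussion look roughly like this.
This is a compilable sketch with hypothetical names (psq, tablescore,
piece_on); it is not DIEP's or Crafty's actual code, only the shape of the
trade-off: patch a running score once per move, or re-sum the table once per
evaluation.

  /* Illustrative piece/square machinery; names and layout are assumptions. */
  static int psq[16][64];     /* one table per piece type                   */
  static int tablescore;      /* running total for the incremental style    */

  /* Incremental style: the score is patched as each move is made/unmade.   */
  static void make_move_psq(int piece, int from_square, int to_square) {
    tablescore += psq[piece][to_square] - psq[piece][from_square];
  }

  /* Per-node style: the table contribution is re-summed inside evaluate().  */
  static int evaluate_psq(const int piece_on[64]) {
    int sq, score = 0;
    for (sq = 0; sq < 64; sq++)
      if (piece_on[sq])
        score += psq[piece_on[sq]][sq];
    return score;
  }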



>I agree with you that when it gets hard to keep an overview
>of things, it gets tougher.
>
>But the problem here is that you also SUFFER in evaluation, unnecessarily,
>2 branch misprediction penalties of around 20 to 25 clocks each in this case
>(worst case, because the branch is a tough branch, so not the minimum of 10
>clocks each). So that's a waste of 50 clocks in all the above search
>lines, ONLY for bishops.

I'm not sure what you mean about branch mispredictions when talking about the
piece/square stuff.  There are no branches associated with that.
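
To spell that out, a piece/square term is nothing more than an indexed load
and an add.  The sketch below uses hypothetical names rather than the exact
statement in my code, but there is no conditional in it for a branch
predictor to miss.

  /* Illustrative: adding a piece/square term is branch-free.          */
  static int add_psq_term(int score, const signed char bval_w[64], int square) {
    return score + bval_w[square];
  }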



>
>Not to mention the clocks it takes to count them!
>
>>The deeper the depth, the less effective that incremental
>>stuff is.  Just try
>>it on Fine #70 for example.  I'll toast you good there...
>
>No, that's not true at all. If you search that deep then the only
>thing which matters is good evaluation. I do loads more than you do :)


As I said, don't confuse quantity with quality.  :)


>
>>>
>>>Overall I'm amazed Crafty plays that strong with so little evaluation!
>
>>>Probably the tuning of it has been done at a very professional level!
>
>>>Best Regards,
>>>Vincent
>>
>>
>>Don't confuse "quantity" with "quality".
>
>In chess both count. I have both.

If you think I don't, there is a simple way to evaluate this.  A long series
of games on equal hardware on ICC.  I don't think what I am doing is "bad" by
any stretch.


