Computer Chess Club Archives


Subject: Re: Rybka's current exe size: 4 628 480 !

Author: Uri Blass

Date: 09:27:26 01/29/06


On January 29, 2006 at 12:07:52, Albert Silver wrote:

>On January 29, 2006 at 11:55:59, Uri Blass wrote:
>
>>On January 29, 2006 at 10:03:02, Albert Silver wrote:
>>
>>>On January 29, 2006 at 07:12:15, enrico carrisco wrote:
>>>
>>>>Reminds me of Deep Thought -- using the hardware for the last N plies.  This
>>>>type of tactical search works real efficiently to see danger from your opponent
>>>>but less efficient in finding chances for itself (ex: Genius.)  Tactically it
>>>>makes it very strong but not so efficient in king attacks compared to Fritz or
>>>>Hiarcs.  Hence, on test positions it does slightly worse (just like Fruit.)
>>>
>>>Would that really be the reason? As you probably know, one can significantly
>>>improve its ability with test suites, by simply increasing the 'Optimism' in the
>>>outlook.
>>>
>>>                                           Albert
>>
>>Only on test suites where you need to fail high to find the move, and not on
>>test suites where you need to fail low.
>>
>>I think that a possible way to test positional understanding is the following
>>test:
>>
>>1)Use unequal time controls so that the result between the two programs is 50%
>>2)Take all the games where the programs disagree about which side is better
>>(both programs evaluate the position as at least 0.25 pawns in their own
>>favour for at least 3 consecutive moves).
>>
>>3)Calculate the result in the relevant games
>>
>>The program that scores better in these games probably has the better
>>positional understanding.
>>
>>Uri
>
>I think that's complicated. Suppose in a position Rybka thinks it is better by
>0.40 pawns, and Fritz thinks IT is better by the same amount. In the next 3-4
>moves, Rybka's evaluation goes up, so that it is 0.60 ahead, and Fritz goes down
>to 0.25. The game is hard fought, with no clear blunders after this and ends in
>a draw. Who was right?

In this case the answer is that we do not know.
Maybe the position was equal and neither program was right.

The main question is which program wins more often when the programs disagree.

Of course, if one program is the stronger player, it is also going to win some
of the games in which it had the inferior position, thanks to blunders by the
opponent, and that may distort the result.

This is the reason I suggest making sure that the overall result is 50%.

I can add that it is clear we cannot learn from a single game which program
has the better evaluation, but if we play enough games there are also going to
be enough games in which the programs disagree about the evaluation.

Of course, even then we cannot decide whether the winner of a given game really
had the better position, because the winner may have won thanks to a blunder by
the opponent. But the sides are of equal strength, so I still expect the side
with the better position to win more than 50% of the games.

If one side is better in tactics, I expect it to do better in equal positions
thanks to tactical blunders by the opponent, while the side that has the better
evaluation is going to do better in positions where both sides evaluate
themselves as better.

There are more cases where programs disagree, including cases where one side
says draw and the other says it has an advantage, but to keep things simple I
only talk about cases where both sides claim an advantage.
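
A rough sketch of the basic test in Python, in case it helps. The Game record
and the function names are made up for illustration, and pairing the i-th
evaluation of one program with the i-th evaluation of the other is an
assumption about how "consecutive moves" would be counted:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Game:
    # Hypothetical record: evaluations in pawns, each from the program's OWN
    # point of view (positive = "I am better"), one entry per move pair.
    evals_a: List[float]
    evals_b: List[float]
    score_a: float  # 1.0 = A won, 0.5 = draw, 0.0 = A lost


def has_disagreement(game: Game, threshold: float = 0.25, run: int = 3) -> bool:
    """True if both programs claim at least `threshold` for `run` moves in a row."""
    streak = 0
    for ea, eb in zip(game.evals_a, game.evals_b):
        if ea >= threshold and eb >= threshold:
            streak += 1
            if streak >= run:
                return True
        else:
            streak = 0
    return False


def disagreement_score(games: List[Game]) -> Tuple[Optional[float], int]:
    """Return (A's average score in the disputed games, number of such games)."""
    relevant = [g for g in games if has_disagreement(g)]
    if not relevant:
        return None, 0
    return sum(g.score_a for g in relevant) / len(relevant), len(relevant)


games = [
    Game([0.3, 0.4, 0.5], [0.3, 0.3, 0.3], 1.0),            # disputed, A wins
    Game([0.3, 0.4, -0.1], [0.3, 0.3, 0.3], 0.0),           # only 2 moves in a row
    Game([0.3, 0.3, 0.3, 0.3], [0.1, 0.3, 0.3, 0.3], 0.5),  # disputed, draw
]
score, count = disagreement_score(games)
```

A score clearly above 0.5 over many disputed games would point to A as the
better evaluator, provided the overall match result was tuned to 50%.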

Of course, it may be possible to improve the calculation: treat a disagreement
of 0.6 pawns differently from a disagreement of 1 pawn, translate the
evaluations of both programs into winning probabilities based on all games,
then look only at the games with a disagreement in evaluation, measure the gap
between what happens (the real result) and the expected result for each side,
and decide that the better evaluator is the program with the smaller gap.
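
One possible way to sketch that refinement in Python. Everything here is an
assumption made for illustration: one evaluation snapshot per game, 0.25-pawn
buckets as the way to turn an evaluation into an expected score, and the gap
measured as the difference between the average real score and the average
expected score over the disputed games:

```python
import math
from collections import defaultdict


def bucket(ev, width=0.25):
    # Nearest multiple of `width`; the bucket size is an arbitrary choice.
    return math.floor(ev / width + 0.5)


def fit_expected(samples, width=0.25):
    """samples: (own evaluation, own score) pairs over ALL games.
    Returns a table: bucket -> average score observed for that bucket."""
    totals = defaultdict(lambda: [0.0, 0])
    for ev, score in samples:
        entry = totals[bucket(ev, width)]
        entry[0] += score
        entry[1] += 1
    return {b: total / n for b, (total, n) in totals.items()}


def gap(samples, table, width=0.25):
    """Absolute gap between the average real and average expected score."""
    actual = sum(score for _, score in samples) / len(samples)
    expected = sum(table[bucket(ev, width)] for ev, _ in samples) / len(samples)
    return abs(actual - expected)


def compare(snapshots, threshold=0.25):
    """snapshots: (eval_a, eval_b, score_a) per game, each evaluation from the
    program's own point of view. Returns (gap_a, gap_b) on disputed games."""
    all_a = [(ea, sa) for ea, _, sa in snapshots]
    all_b = [(eb, 1.0 - sa) for _, eb, sa in snapshots]
    table_a, table_b = fit_expected(all_a), fit_expected(all_b)
    disputed = [s for s in snapshots if s[0] >= threshold and s[1] >= threshold]
    if not disputed:
        return None
    gap_a = gap([(ea, sa) for ea, _, sa in disputed], table_a)
    gap_b = gap([(eb, 1.0 - sa) for _, eb, sa in disputed], table_b)
    return gap_a, gap_b


snapshots = [
    (0.5, 0.5, 1.0), (0.5, 0.5, 1.0),    # both claim an advantage, A wins
    (0.5, -0.5, 1.0), (0.5, -0.5, 1.0),  # both agree A is better, A wins
    (-0.5, 0.5, 0.0), (-0.5, 0.5, 0.0),  # both agree B is better, B wins
]
gap_a, gap_b = compare(snapshots)
```

In this toy data A's evaluation predicts the disputed games better, so its gap
is smaller. With real games the buckets would need enough samples each before
the numbers mean anything.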

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
