Computer Chess Club Archives



Subject: Re: CCRL 40/40 Rating list and stats updated

Author: Kirill Kryukov

Date: 18:47:28 02/20/06



On February 20, 2006 at 18:07:49, Uri Blass wrote:

>If the program plays enough games, it is possible to see the average result as
>a function of the evaluation.
>
>If you take all cases where it has an evaluation of 1.24 pawns for itself, then
>it has some average result in them, so the average result is a function of the
>evaluation.

I see what you mean. Yes, I think this method can solve the problem of
unreliable evaluations. I only suspect that such a computation will require many
more games to be reliable: you get an evaluation on every move, but the result
comes only once per game.
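
A minimal sketch of the idea, assuming the evaluations have already been
extracted from the game database as (game_id, eval_in_pawns, result) records,
with the result (1, 0.5 or 0) taken from the evaluating engine's side. The
record layout, the function name and the 0.25-pawn bin width are my own
assumptions for illustration. It also counts distinct games per bin, because
many positions from one game share a single result:

```python
import math
from collections import defaultdict

def result_by_eval(samples, bin_width=0.25):
    """Average game result as a function of the engine's own evaluation.

    samples: iterable of (game_id, eval_in_pawns, result), where result is
    1.0 / 0.5 / 0.0 from the point of view of the evaluating engine.
    Returns {bin_center: (mean_result, n_positions, n_games)}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    games = defaultdict(set)
    for game_id, ev, result in samples:
        # Round to the nearest bin (avoids Python's banker's rounding).
        b = math.floor(ev / bin_width + 0.5)
        sums[b] += result
        counts[b] += 1
        games[b].add(game_id)
    return {
        b * bin_width: (sums[b] / counts[b], counts[b], len(games[b]))
        for b in sorted(counts)
    }

# Hypothetical records, for illustration only.
samples = [
    ("g1", 1.24, 1.0),
    ("g1", 1.30, 1.0),
    ("g2", 1.20, 0.5),
    ("g3", -0.10, 0.0),
]
curve = result_by_eval(samples)
```

Here the bin around +1.25 holds three positions but only two games, which is
why the number of games, not the number of positions, should decide how much to
trust a bin.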


>The main problem with it is that the program may play against a stronger or
>weaker opponent, so the number may be misleading even if you have enough cases
>with an evaluation of 1.24 pawns. A possible idea that I can think of is to play
>a long match between programs of equal strength and use the games to decide
>about the expected result.

Yes, this is one difficulty. I think it should be possible to fix it like this:
when we want to compare the evaluation of engine A with that of engine B, we
first find all engines in the database that played against both A and B. Then we
extract all such games (between A or B and an engine that played against both)
and compute the result as a function of evaluation. Only after that do we
compare the evaluations of A and B. It should work, I think. I will give it a
try when I have time.
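
The selection step could look roughly like this. It is only a sketch: it
assumes each game is a dict with "white" and "black" engine names, which is my
own placeholder layout and not the actual database format, and the function
names are hypothetical.

```python
def common_opponents(games, a, b):
    """Engines that played against both a and b (a and b themselves excluded)."""
    def opponents(engine):
        found = set()
        for g in games:
            if g["white"] == engine:
                found.add(g["black"])
            elif g["black"] == engine:
                found.add(g["white"])
        return found
    return (opponents(a) & opponents(b)) - {a, b}

def games_vs_common(games, a, b):
    """Split out the games of a and of b that were played against a common opponent."""
    common = common_opponents(games, a, b)
    selected = {a: [], b: []}
    for g in games:
        players = (g["white"], g["black"])
        for engine in (a, b):
            if engine in players:
                other = players[1] if players[0] == engine else players[0]
                if other in common:
                    selected[engine].append(g)
    return selected

# Hypothetical game list, for illustration only.
games = [
    {"white": "A", "black": "X"},
    {"white": "X", "black": "B"},
    {"white": "A", "black": "Y"},   # Y never played B, so this game is dropped
    {"white": "A", "black": "B"},   # direct A-B games are not used
    {"white": "B", "black": "Z"},
    {"white": "Z", "black": "A"},
]
selected = games_vs_common(games, "A", "B")
```

The result-versus-evaluation curve would then be computed separately from
selected["A"] and selected["B"], so both engines are measured against the same
set of opponents.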

There is one more way this method can break. There is now a discussion in
another thread about an "asymmetric" approach: an engine may evaluate a position
differently depending on whom it is playing. In that case this method will
still not work reliably.


>Another idea is simply to give the program many games of players with similar
>strength (not of itself) to analyze, so it has enough cases where the evaluation
>says +1.24, and calculate the average result in these cases.

This is possible, but it will take much longer: first to play the games, then
to analyze them with each program. I will still focus on extracting the
information just from the game database.



>Uri


Thanks for the suggestions!

Best,
Kirill




Last modified: Thu, 15 Apr 21 08:11:13 -0700
