Computer Chess Club Archives



Subject: Re: General Objection Against CEGT Stats

Author: Michael Yee

Date: 06:08:01 12/07/05



On December 07, 2005 at 08:48:51, George Tsavdaris wrote:

>>
>>I thought it was clear that we were discussing chess strength and NOT the
>>stability of the engines over longer testing from a purely technical view.
>>I don't know how to make it clearer.
>
> I see your point and you may be right; perhaps testers should divide engines
>into more categories than the one they use now... In fact, I don't think that
>"perhaps" is needed here.

Well, given the existing database of played games, one could easily create
different sets of ratings:

(1) top engines amongst themselves

(2) individual top engine vs weaker engines

- i.e., to calculate Rybka's "rating-vs-amateurs", you would form the set of
games that includes Rybka and the weaker amateurs, but no other top engines
- and pick some stable amateur engine as the anchor with a fixed rating

(3) weaker engines among themselves
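
A minimal sketch of how the three sets could be pulled out of a game list, in Python. It assumes each game is a (white, black, white_score) tuple and that someone has already decided which engines count as "top"; the function name split_pools and its parameters are made up for illustration and are not part of any CEGT tool. It also reads set (2) as including the games the weaker engines played among themselves, since those are needed to rate the amateurs in that pool.

```python
def split_pools(games, top, target):
    """Split games into the three subsets described above.

    games  : iterable of (white, black, white_score) tuples
    top    : set of engine names treated as "top" (an assumption of this sketch)
    target : the top engine whose rating-vs-weaker-engines is wanted
    """
    top_vs_top = []        # set (1): both players are top engines
    target_vs_weaker = []  # set (2): target plus weaker engines, no other top engine
    weaker_vs_weaker = []  # set (3): neither player is a top engine

    for game in games:
        white, black = game[0], game[1]
        n_top = (white in top) + (black in top)
        if n_top == 2:
            top_vs_top.append(game)
        elif n_top == 0:
            weaker_vs_weaker.append(game)
            # Games among the weaker engines also belong to the pool for set (2).
            target_vs_weaker.append(game)
        elif target in (white, black):
            target_vs_weaker.append(game)
        # Games of any other top engine against a weaker engine fall in none of the sets.

    return top_vs_top, target_vs_weaker, weaker_vs_weaker
```

Each subset would then be fed to whatever rating program is in use, with one stable engine's rating held fixed in set (2) so the scale is anchored.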

----

Also, it shouldn't be too hard to check whether the current ratings (over the
whole pool) predict well. For example, let's take Rybka. For each opponent,
we could calculate the expected score given the current rating difference and
compare it to the actual score in the game database. Then we could see whether
there is some bias with respect to top engines, weaker engines, or whatever.
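
A rough sketch of that check in Python. It assumes the standard logistic Elo curve, E = 1 / (1 + 10^(-d/400)), where d is the rating difference; the rating program actually used for the list may use a different model, and this sketch ignores the advantage of the white pieces. Games are assumed to be (white, black, white_score) tuples, and the names expected_score and prediction_bias are hypothetical.

```python
from collections import defaultdict


def expected_score(rating_diff):
    """Expected score under the logistic Elo curve for a given rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def prediction_bias(games, ratings, engine):
    """Compare expected and actual score of `engine` against each opponent.

    games   : iterable of (white, black, white_score) tuples
    ratings : dict mapping engine name to its current rating
    Returns {opponent: (expected, actual, actual - expected)}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # opponent -> [points scored by engine, games played]
    for white, black, white_score in games:
        if white == engine:
            opponent, points = black, white_score
        elif black == engine:
            opponent, points = white, 1.0 - white_score
        else:
            continue  # game does not involve the engine under study
        totals[opponent][0] += points
        totals[opponent][1] += 1

    report = {}
    for opponent, (points, n_games) in totals.items():
        expected = expected_score(ratings[engine] - ratings[opponent])
        actual = points / n_games
        report[opponent] = (expected, actual, actual - expected)
    return report
```

If the differences come out mostly positive against one group of opponents (say, the weaker engines) and mostly negative against another, that would point to the kind of bias discussed here. With few games per opponent the differences will be noisy, so they should be read with the sample sizes in mind.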

Michael




Last modified: Thu, 15 Apr 21 08:11:13 -0700
