Author: Michael Yee
Date: 06:08:01 12/07/05
On December 07, 2005 at 08:48:51, George Tsavdaris wrote:

>>
>>I thought that it was clear that we discussed chess strength and NOT the
>>stability of the engines over a longer testing from the mere technical view.
>>I don't know how to make it clearer.
>
> I see your point and you may be right, and perhaps testers should divide engines
>into categories more than 1 they do now......In fact I don't think that
>"perhaps" is needed here.....

Well, given the existing database of played games, one could easily create
different sets of ratings:

(1) top engines amongst themselves

(2) individual top engine vs weaker engines
- i.e., to calculate Rybka's "rating-vs-amateurs", you would form the set of
  games that includes Rybka and weaker amateurs, but no other top engines
- and pick some stable amateur engine to fix a rating

(3) weaker engines among themselves

----

Also, it shouldn't be too hard to check whether the current ratings (over the
whole pool) predict well. For example, let's take Rybka. For each other player,
we could calculate the expected score given the current rating difference and
compare it to the actual score in the game database. Then we could see if there
was some bias with respect to top engines, weaker engines, or whatever.

Michael
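
For anyone who wants to try the prediction check, here is a rough sketch of how it might look. It assumes the standard logistic Elo curve (the rating lists may be computed with a different model), it ignores the advantage of the white pieces, and the engine names other than Rybka, the ratings, and the game records are all made-up placeholders rather than real data.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model (assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def prediction_bias(engine, ratings, games):
    """Compare actual and expected scores of `engine` against each opponent.

    games: iterable of (white, black, white_score), white_score in {1, 0.5, 0}.
    Returns {opponent: (n_games, actual_total, expected_total)}.
    """
    totals = defaultdict(lambda: [0, 0.0, 0.0])
    for white, black, white_score in games:
        if white == engine:
            opponent, score = black, white_score
        elif black == engine:
            opponent, score = white, 1.0 - white_score
        else:
            continue  # game does not involve the engine under study
        entry = totals[opponent]
        entry[0] += 1
        entry[1] += score
        entry[2] += expected_score(ratings[engine], ratings[opponent])
    return {opp: tuple(vals) for opp, vals in totals.items()}


# Hypothetical ratings and games, for illustration only.
ratings = {"Rybka": 2800, "TopB": 2800, "AmateurC": 2400}
games = [
    ("Rybka", "AmateurC", 1),
    ("AmateurC", "Rybka", 0.5),
    ("TopB", "AmateurC", 1),
]

for opp, (n, actual, expected) in prediction_bias("Rybka", ratings, games).items():
    print(f"{opp}: games={n} actual={actual:.2f} expected={expected:.2f}")
```

Grouping the opponents into "top" and "weaker" and summing actual minus expected within each group would then show whether the whole-pool ratings are biased for one group or the other.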
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.