Computer Chess Club Archives


Subject: Re: This Super Laptop with Fritz 8 would even beat Judith Polgar!

Author: Dann Corbit

Date: 13:03:02 06/16/04


On June 16, 2004 at 14:20:13, Peter Fendrich wrote:

>On June 15, 2004 at 19:37:15, Dann Corbit wrote:
>
>>On June 15, 2004 at 18:40:13, Peter Fendrich wrote:
>>
>>>On June 15, 2004 at 17:10:43, Dann Corbit wrote:
>>>
>>>>On June 15, 2004 at 16:53:16, Peter Fendrich wrote:
>>>>
>>>>>On June 15, 2004 at 16:39:20, Dann Corbit wrote:
>>>>>
>>>>>>On June 15, 2004 at 16:20:45, Peter Fendrich wrote:
>>>>>>
>>>>>- snip -
>>>>>
>>>>>>>We could in fact invent a much better rating system for chess engines. The ELO
>>>>>>>system is designed for humans with a sparse number of games and not for hundreds
>>>>>>>and thousands of games in long matches. But it works.
>>>>>>>IMHO, however, another rating system is not very practical when the ELO
>>>>>>>system is the chess rating standard.
>>>>>>
>>>>>>There used to be a nice web site by Royal C. Jones on alternative Elo
>>>>>>calculation methods.  I am no longer able to find it.
>>>>>>
>>>>>>Here is a C++ program that performs his alternate calculations in a simulation:
>>>>>>ftp://cap.connx.com/pub/tournament_software/prog10.cpp
>>>>>>
>>>>>>Here is the letter where I asked his permission to use the code:
>>>>>>ftp://cap.connx.com/pub/tournament_software/Re%20Your%20chess%20rating%20systems.txt
>>>>>
>>>>>Yes, I think I once got the link from you:
>>>>>http://ourworld.cs.com/royjones1999/index.htm
>>>>>I think with another system it could be done even better for chess engines:
>>>>>- they don't vary in strength over time as humans do
>>>>>- one can easily play a huge number of games
>>>>>
>>>>>with the use of Bayesian algorithms...
>>>>
>>>>I think the best thing about computer modelling is that we do not necessarily
>>>>need to assume a gaussian curve.  We could fit as many models as we like, and
>>>>then choose the one that turns out to be the best predictor.
>>>>
>>>>It is clear that when Elo figures are drastically different (e.g. 1000 Elo),
>>>>the model predicts poorly.
>>>>
>>>>Even with moderate difference levels (play an engine against a pool of peer
>>>>players, play the same engine against a pool of players 100 Elo below, play the
>>>>engine with both pools combined) you will see unexplained differences.
>>>
>>>How do you know that the pools are so different?
>>
>>Because it is not the first time the programs have played against each other.
>>I have a good idea of their Elo beforehand.
>>
>>>Another thing: two small pools could give strange results. The "A always loses
>>>vs B but has a higher rating" problem will give such effects with small pools.
>>
>>Here is the scenario:
>>The tournament is a round-robin with a very large number of opponents.  Each
>>phase of the round robin starts with one player who plays white and then black
>>against all the other opponents.  In the first few passes of the programs, there
>>were a large number of very strong programs.  Now, the strong programs will have
>>some sort of provisional rating after 25 sets of gauntlets have been run, since
>>they will have played 50 games against 25 different opponents.  But the average
>>Elo in this first set of programs is much higher than the average for the entire
>>pool.  What I am seeing is that each new strong program (which had yet to take
>>its turn against the entire pool) drops in Elo a bit when it faces all the
>>programs.  This indicates to me that playing against stronger opposition gives a
>>deflated view of the Elo (or conversely, that playing weaker opposition gives an
>>inflated one).  My notion is confirmed by what I see in other tournaments where
>>opposition strength comes in levels.  For instance, look at a program like
>>SlowChess as it marches through George Lyapko's tournament.  Against the early
>>opposition (of clearly known strength) it has a very high rating.  But as it
>>faces stronger and stronger opposition, the Elo rating drops.  So, it might be
>>that you can inflate your Elo rating by playing a group that is 100 Elo below
>>your level, as compared to playing a group that is your peer.
>>
>>Most of this is just heuristic guessing.  I have not done a careful study yet.
>
>It's hard for me to believe that this is a pattern, but no one is perfect...
>Maybe the distance from the worst in group B to the best in group A is huge?

The shift in average Elo is only a few Elo.  However, there are some very weak
programs (1000 Elo below) in the main group.

Average Elo of a 200+ games program's opposition: 2294
Average Elo of a 64 games program's opposition:   2351
Weakest program:                                  1711
Strongest program:                                2644

>How do you compute the ratings?

Elostat
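
For reference, the opposition figures above can be put through the textbook logistic Elo expectancy formula. This is only a sketch under stated assumptions: the 2500 engine rating is a made-up illustration, expected_score is a name chosen here, and Elostat's own calculation may differ in detail. It also treats a whole pool as a single opponent at the pool's average rating, which is itself an approximation.

```python
def expected_score(rating, opponent):
    """Expected score (0..1) under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


# Hypothetical engine rating, chosen only for illustration (not from the post).
ENGINE = 2500

# Average opposition figures quoted in the post.
for label, opposition in [("200+ games", 2294), ("64 games", 2351)]:
    score = expected_score(ENGINE, opposition)
    print(f"{label}: expected score vs average opposition {opposition} = {score:.3f}")

# Strongest vs weakest program in the pool: the curve is nearly flat here,
# so small prediction errors translate into large rating errors.
print(f"2644 vs 1711: expected score = {expected_score(2644, 1711):.4f}")
```

If the model were exact, the same engine would score less against the stronger pool but end up with the same rating from both; a rating that moves with the strength of the opposition suggests the real results do not follow this curve closely.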



