Author: Dann Corbit
Date: 16:37:15 06/15/04
On June 15, 2004 at 18:40:13, Peter Fendrich wrote:

>On June 15, 2004 at 17:10:43, Dann Corbit wrote:
>
>>On June 15, 2004 at 16:53:16, Peter Fendrich wrote:
>>
>>>On June 15, 2004 at 16:39:20, Dann Corbit wrote:
>>>
>>>>On June 15, 2004 at 16:20:45, Peter Fendrich wrote:
>>>>
>>>- snip -
>>>
>>>>>We could in fact invent a much better rating system for chess engines. The ELO
>>>>>system is designed for humans with a sparse number of games and not for hundreds
>>>>>and thousands of games in long matches. But it works.
>>>>>IMHO it's however not very practical with another rating system when the ELO
>>>>>system is the chess rating standard.
>>>>
>>>>There used to be a nice web site by Royal C. Jones on alternative Elo
>>>>calculation methods. I am no longer able to find it.
>>>>
>>>>Here is a C++ program that performs his alternate calculations in a simulation:
>>>>ftp://cap.connx.com/pub/tournament_software/prog10.cpp
>>>>
>>>>Here is the letter where I asked his permission to use the code:
>>>>ftp://cap.connx.com/pub/tournament_software/Re%20Your%20chess%20rating%20systems.txt
>>>
>>>Yes, I think I once got the link from you:
>>>http://ourworld.cs.com/royjones1999/index.htm
>>>I think with another system it could be done even better for chess engines:
>>>- they don't vary their strength during time like humans
>>>- one can easily play a huge number of games
>>>
>>>with use of Bayesian alg's...
>>
>>I think the best thing about computer modelling is that we do not necessarily
>>need to assume a gaussian curve. We could fit as many models as we like, and
>>then choose the one that turns out to be the best predictor.
>>
>>It is clear that when Elo figures are drastically different (e.g. 1000 Elo)
>>the model predicts poorly.
>>
>>Even with moderate difference levels (play an engine against a pool of peer
>>players, play the same engine against a pool of players 100 Elo below, play the
>>engine with both pools combined) you will see unexplained differences.
>
>How do you know that the pools are so different?

Because it is not the first time the programs have played against each other. I
have a good idea of their Elo beforehand.

>Another thing, two small pools could give strange results. "The A always loses
>vs B but have a higher rating" problem will give such effects with small pools.

Here is the scenario:
The tournament is a round-robin with a very large number of opponents. Each
phase of the round robin starts with one player who plays white and then black
against all the other opponents. In the first few passes, there were a large
number of very strong programs. Now, the strong programs will have some sort of
provisional rating after 25 sets of gauntlets have been run, since they will
have played 50 games against 25 different opponents. But the average Elo in this
first set of programs is much higher than the average for the entire pool.

What I am seeing is that each new strong program (which had yet to take its turn
against the entire pool) drops in Elo a bit when it faces all the programs. This
indicates to me that playing against stronger opposition gives a deflated view
of the Elo (or conversely, that playing weaker opposition gives an inflated
one).

My notion is confirmed by what I see in other tournaments where opposition
strength comes in levels. For instance, look at a program like SlowChess as it
marches through George Lyapko's tournament. Against the early opposition (of
clearly known strength) it has a very high rating. But as it faces stronger and
stronger opposition, the Elo rating drops.

So, it might be that you can inflate your Elo rating by playing a group that is
100 Elo below your level, as compared to playing a group that is your peer.

Most of this is just heuristic guessing. I have not done a careful study yet.

>>As an example, I am running a contest with about 6000 games played so far. When
>>a large number of games has been played by some engine (e.g. 200 games) then the
>>rating clearly is changed compared to when it had a smaller number of games
>>against tougher competition. The effect becomes pronounced when you see players
>>of very high Elo take on players of very low Elo. The stronger players
>>basically cannot earn any points, no matter what happens.

>This is a typical chess engine problem. The Elo system is explicitly designed
>for people playing in the same class with about 200 points spread. IIRC that
>problem arises already with 400 points differences and many games. Something
>that never happens for people.

What about a prodigy on the way up?

>Allowing decimals will make it handle wider difference to a certain level.
>
>/Peter
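
The "cannot earn any points" effect and the remark about allowing decimals can be sketched roughly as follows. This is only an illustration. It assumes the common logistic Elo expectation curve and a K-factor of 16, neither of which is stated in the thread, and the function names are made up for the example.

```python
# Sketch only: standard logistic Elo formulas, not taken from the thread.
# expected_score, rating_gain, k=16 and whole_points are illustrative assumptions.

def expected_score(rating: float, opponent: float) -> float:
    """Expected score (between 0 and 1) under the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def rating_gain(rating: float, opponent: float, score: float,
                k: float = 16.0, whole_points: bool = True) -> float:
    """Rating change for one game: k * (actual score - expected score).

    With whole_points=True the change is rounded to an integer, which is
    how a win against a much weaker opponent can be worth nothing.
    """
    delta = k * (score - expected_score(rating, opponent))
    return float(round(delta)) if whole_points else delta


if __name__ == "__main__":
    # Gain for a win by the stronger side at increasing rating gaps.
    for gap in (0, 100, 400, 800):
        strong, weak = 2000 + gap, 2000
        print(gap,
              round(expected_score(strong, weak), 3),
              rating_gain(strong, weak, 1.0),
              round(rating_gain(strong, weak, 1.0, whole_points=False), 3))
```

Under these assumptions, a win across an 800-point gap is worth about 0.16 points, which rounds to 0 when only whole points are awarded, while a loss in the same pairing costs close to the full K. Keeping decimals lets the small gains accumulate, which is one reading of the "allowing decimals" suggestion.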
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.