Computer Chess Club Archives

Subject: Re: This Super Laptop with Fritz 8 would even beat Judith Polgar!

Author: Dann Corbit

Date: 16:37:15 06/15/04

On June 15, 2004 at 18:40:13, Peter Fendrich wrote:

>On June 15, 2004 at 17:10:43, Dann Corbit wrote:
>
>>On June 15, 2004 at 16:53:16, Peter Fendrich wrote:
>>
>>>On June 15, 2004 at 16:39:20, Dann Corbit wrote:
>>>
>>>>On June 15, 2004 at 16:20:45, Peter Fendrich wrote:
>>>>
>>>- snip -
>>>
>>>>>We could in fact invent a much better rating system for chess engines. The ELO
>>>>>system is designed for humans with a sparse number of games and not for hundreds
>>>>>and thousands of games in long matches. But it works.
>>>>>IMHO, however, switching to another rating system is not very practical
>>>>>when the ELO system is the chess rating standard.
>>>>
>>>>There used to be a nice web site by Royal C. Jones on alternative Elo
>>>>calculation methods.  I am no longer able to find it.
>>>>
>>>>Here is a C++ program that performs his alternate calculations in a simulation:
>>>>ftp://cap.connx.com/pub/tournament_software/prog10.cpp
>>>>
>>>>Here is the letter where I asked his permission to use the code:
>>>>ftp://cap.connx.com/pub/tournament_software/Re%20Your%20chess%20rating%20systems.txt
>>>
>>>Yes, I think I once got the link from you:
>>>http://ourworld.cs.com/royjones1999/index.htm
>>>I think with another system it could be done even better for chess engines:
>>>- they don't vary their strength over time as humans do
>>>- one can easily play a huge number of games
>>>
>>>with use of Bayesian algorithms...
>>
>>I think the best thing about computer modelling is that we do not necessarily
>>need to assume a gaussian curve.  We could fit as many models as we like, and
>>then choose the one that turns out to be the best predictor.
>>
>>It is clear that when Elo figures are drastically different (e.g. 1000 Elo),
>>the model predicts poorly.
>>
>>Even with moderate difference levels (play an engine against a pool of peer
>>players, play the same engine against a pool of players 100 Elo below, play the
>>engine with both pools combined) you will see unexplained differences.
>
>How do you know that the pools are so different?

Because it is not the first time the programs have played against each other.
I have a good idea of their Elo beforehand.

>Another thing, two small pools could give strange results. The "A always loses
>vs B but has a higher rating" problem will give such effects with small pools.

Here is the scenario:
The tournament is a round-robin with a very large number of opponents.  Each
phase of the round robin starts with one player who plays white and then black
against all the other opponents.  In the first few passes of the programs, there
were a large number of very strong programs.  Now, the strong programs will have
some sort of provisional rating after 25 sets of gauntlets have been run, since
they will have played 50 games against 25 different opponents.  But the average
Elo in this first set of programs is much higher than the average for the entire
pool.  What I am seeing is that each new strong program (which had yet to take
its turn against the entire pool) drops in Elo a bit when it faces all the
programs.  This indicates to me that playing against stronger opposition gives a
deflated view of the Elo (or conversely, that playing weaker opposition gives an
inflated one).  My notion is confirmed by what I see in other tournaments where
opposition strength comes in levels.  For instance, look at a program like
SlowChess as it marches through George Lyapko's tournament.  Against the early
opposition (of clearly known strength) it has a very high rating.  But as it
faces stronger and stronger opposition, the Elo rating drops.  So, it might be
that you can inflate your Elo rating by playing a group that is 100 Elo below
your level, as compared to playing a group that is your peer.

Most of this is just heuristic guessing.  I have not done a careful study yet.
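
A minimal sketch of how such a check could be set up, assuming the commonly
used logistic form of the Elo expectation curve (the original Elo model is
based on a normal curve, so this is a stand-in). The function names and all of
the pool ratings and scores are hypothetical, chosen only for illustration;
they are not results from the tournament described above.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectation for `rating` against `opponent` (assumed curve)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def performance_rating(opponents, score, lo=0.0, hi=4000.0):
    """Rating whose expected total against `opponents` equals `score`.

    Solved by bisection on [lo, hi]; needs 0 < score < len(opponents).
    """
    if not 0.0 < score < len(opponents):
        raise ValueError("score must be strictly between 0 and the number of games")
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if sum(expected_score(mid, o) for o in opponents) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical numbers, not taken from any real tournament:
peer_pool = [2500.0] * 50   # opponents at the engine's own level
weak_pool = [2400.0] * 50   # opponents 100 Elo below

# Scoring 50% against peers and 64% against the weaker pool is what the
# logistic curve predicts for a 2500 engine, so both estimates agree ...
peer_estimate = performance_rating(peer_pool, 25.0)
weak_estimate = performance_rating(weak_pool, 32.0)
# ... while scoring 68% against the weaker pool reads as a higher rating.
inflated_estimate = performance_rating(weak_pool, 34.0)
print(peer_estimate, weak_estimate, inflated_estimate)
```

If the curve fit real engine results, the per-pool estimates would agree to
within statistical noise. A consistent gap between the estimate from the weak
pool and the estimate from the peer pool would point at the curve rather than
at the engine.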

>>As an example, I am running a contest with about 6000 games played so far.  When
>>a large number of games has been played by some engine (e.g. 200 games) then the
>>rating clearly is changed compared to when it had a smaller number of games
>>against tougher competition.  The effect becomes pronounced when you see players
>>of very high Elo take on players of very low Elo.  The stronger players
>>basically cannot earn any points, no matter what happens.
>
>This is a typical chess engine problem. The Elo system is explicitly designed
>for people playing in the same class with about 200 points spread. IIRC that
>problem already arises with 400-point differences and many games, something
>that never happens for people.

What about a prodigy on the way up?

>Allowing decimals will make it handle wider difference to a certain level.
>
>/Peter
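
A rough sketch of the decimals point, assuming the standard update rule
new = old + K * (score - expected) with the logistic expectation curve. The
K-factor of 16, the two ratings, and the function names are assumptions for
illustration, and the opponent's rating is held fixed to keep it short.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectation for `rating` against `opponent` (assumed curve)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def play_wins(rating, opponent, games, k=16.0, integer_ratings=False):
    """Apply `games` consecutive wins and return the final rating.

    With integer_ratings=True the rating is rounded to a whole number
    after every game, as a rating list without decimals would do.
    """
    for _ in range(games):
        rating += k * (1.0 - expected_score(rating, opponent))
        if integer_ratings:
            rating = float(round(rating))
    return rating


# Hypothetical ratings 800 Elo apart:
start, weak = 2800.0, 2000.0
gain_rounded = play_wins(start, weak, 100, integer_ratings=True) - start
gain_decimal = play_wins(start, weak, 100) - start
print(gain_rounded, gain_decimal)
```

At an 800-point gap each win is worth well under half a point with K = 16, so
rounding after every game discards it and the stronger player stays put, while
the decimal version keeps accumulating the small gains.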

Re: This Super Laptop with Fritz 8 would even beat Judith Polgar! Peter Fendrich 11:20:13 06/16/04
- Re: This Super Laptop with Fritz 8 would even beat Judith Polgar! Dann Corbit 13:03:02 06/16/04
  - Re: This Super Laptop with Fritz 8 would even beat Judith Polgar! Peter Fendrich 13:01:33 06/17/04
    - Re: This Super Laptop with Fritz 8 would even beat Judith Polgar! Dann Corbit 18:07:24 06/17/04


Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.