Computer Chess Club Archives

Subject: Re: How Should SSDF Recalibrate

Author: Bertil Eklund

Date: 14:18:38 05/23/00


On May 23, 2000 at 15:38:19, Ratko V Tomic wrote:

>> Clearly SSDF needs to recalibrate their ratings downward
>> by 200 or so points.
>
>While I agree that SSDF could use recalibration to human
>ELO, the problem is not that the SSDF ELO is always too high.
>For the top programs their rating is too high, for the bottom
>ones it is too low when compared to human players. So the
>problem is that it exaggerates the ELO difference between
>programs (relative to human players).
>
>The reason for this magnified ELO difference between the programs
>is that they all play using the algorithms with essentially
>the same primary strengths (short range tactics, tuned opening
>books, table-bases) and weaknesses (strategy, long range tactics).
>In effect the programs are competing in a reduced dimensionality
>space (the dimensions being various aspects of play) compared
>to humans who individually vary in more dimensions.
>
>Continuing with this picturesque analogy (with random walks),
>for the number N of "improvement steps" of program A vs B
>(assuming "steps" in 1-Dimension, i.e. 1 aspect, speed), the
>program A will drift N units of strength distance away from B.
>But if the N improvement steps are in 2-Dimensions, and say N=2,
>then the average distance in 2 steps is not 2 strength units
>any more but (2+sqrt(2)+2)/3=1.80, i.e. less than for the 2
>improvement steps in 1 Dimension. Similarly, in 3-D the 2
>improvement steps yield strength distance of 1.707
>[i.e. (2+2+2+3*sqrt(2))/6]. So, for a given number of
>"improvement steps" the more different strength aspects
>one varies between players the smaller will be the average
>strength distance between the players.
>
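The averages quoted above can be checked with a short enumeration. This is only a sketch: the function name average_distance is made up for illustration, and it assumes every unordered choice of axes for the two unit steps is weighted equally, since that is the counting that reproduces the 1.80 and 1.707 figures.

```python
from itertools import combinations_with_replacement
from math import sqrt


def average_distance(dims, steps=2):
    """Mean Euclidean distance after `steps` unit improvements over `dims` axes.

    Assumption (not stated in the post): each unordered choice of axes
    counts once, e.g. in 2-D the outcomes are xx, xy and yy.
    """
    total = 0.0
    count = 0
    for axes in combinations_with_replacement(range(dims), steps):
        # How many of the steps landed on each axis, e.g. (0, 1) -> [1, 1].
        per_axis = [axes.count(a) for a in range(dims)]
        total += sqrt(sum(n * n for n in per_axis))
        count += 1
    return total / count


for d in (1, 2, 3):
    print(d, round(average_distance(d), 3))
```

Under this counting the loop gives 2.0, 1.805 and 1.707 for one, two and three dimensions. Weighting ordered step sequences equally instead changes the individual numbers but not the trend: the average distance still shrinks as dimensions are added.
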
>Perhaps a simple fix SSDF could try is to recalculate
>their ratings from their existing database of comp-comp
>results, but using a factor smaller than 400 (200-300 may
>work better) when calculating the rating difference from
>the comp-comp results. A better factor (than 400) for
>comp-comp games could then be extracted (via least squares)
>from the known human-computer results at both ends
>of the program strength spectrum.
>
>Once a good factor for comp-comp scores is found, the SSDF
>would stay in sync with the human ELO for much longer than
>their current scheme.
>
>Of course, a drastically new kind of a chess program or
>a development of a more systematic anti-computer strategy
>by human players would require further branching in the
>correct ELO factors.
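
To make the "factor smaller than 400" proposal concrete, here is a rough sketch. It assumes the common logistic form of the Elo expectancy, which may differ in detail from what SSDF actually computes. The names rating_difference, expected_score and fit_factor are hypothetical, and fit_factor is a plain least-squares fit through the origin that expects (score, rating gap) pairs the caller must supply.

```python
from math import log10


def rating_difference(score, factor=400.0):
    """Rating gap implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return factor * log10(score / (1.0 - score))


def expected_score(diff, factor=400.0):
    """Expected score for the stronger side given a rating gap `diff`."""
    return 1.0 / (1.0 + 10.0 ** (-diff / factor))


def fit_factor(pairs):
    """Least-squares factor K for gap = K * log10(s / (1 - s)).

    `pairs` holds (comp-comp score, rating gap on the human scale).
    Both the pairing and the data are assumptions of this sketch;
    no real SSDF results are used here.
    """
    xs = [log10(s / (1.0 - s)) for s, _ in pairs]
    ys = [gap for _, gap in pairs]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# The same 75% match score maps to a smaller gap with a smaller factor.
for k in (400.0, 250.0):
    print(k, round(rating_difference(0.75, k), 1))
```

With factor 400 a 75% score corresponds to a gap of about 191 points; with factor 250 the same score corresponds to about 119 points, which is the compression of comp-comp differences the quoted message is asking for.
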
Hi!

Out on deep water again... trying to find a statistical or mathematical
explanation for this.

Everyone who has followed this understands the reason.
Today almost every player plays and analyzes with computers all day long.
Today people understand how to play against them. It is as simple as that.
The program that probably suffers most from this is Fritz, because
almost every chess player uses Fritz, not necessarily because it is better than
other programs, but because it is very strong tactically and comes with
ChessBase's excellent database functions.

Bertil SSDF


Last modified: Thu, 15 Apr 21 08:11:13 -0700