Computer Chess Club Archives


Subject: Re: Test suites - can they reliably predict ELO?

Author: Peter Fendrich

Date: 12:16:00 12/12/99


On December 11, 1999 at 17:52:56, Tom King wrote:

>Which of the well known test suites predicts the strength of chess programs most
>accurately?
>
>I ask this, because I recently made some *slight* mods. to the evaluation
>function in my program, Francesca. I ran the LCT-2 suite, and the results
>indicated that it was a wash - the modification gave me about 5 ELO points,
>apparently.
>
>I then ran a series of fast games against another amateur program. I realize
>it's important to play a large number of games, to reduce the margin of error,
>so I ran two matches of 65 games. The result was this:
>
>MATCH 1
>"Normal" Francesca scored 37% against the amateur program.
>
>MATCH 2
>"Modified" Francesca scored 45% against the amateur program.
>
>Quite a difference! It implies that the modification is worth over 50 ELO. I
>guess I need to play more games, against a variety of programs to verify whether
>this improvement is real, or imaginary.
>
>Anyhow, beware of reading too much into ELO predictions of test suites..
>
>Cheers All,
>Tom

Hi Tom.
As most of the posts have already said, no test suite computes ELO within
reasonable limits. I think it's almost impossible to create one, and if it is
possible, it would need several thousand positions.

If you played only one program and computed ELO from that, the result is as
uncertain as any test suite. If your program is especially badly or especially
well suited to that opponent, it doesn't help to play more games, which will
only reinforce the warped difference. You need different opponents with
different styles, and a lot of games altogether, to get a reliable estimate of
ELO differences. So far, 5 ELO and 50 ELO are equally bad estimates.
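
To put rough numbers on the two 65-game matches quoted above, here is a minimal sketch, not anything taken from the thread. It assumes the usual logistic Elo model and treats every game as a plain win or loss, which overstates the variance a little when there are draws. The function names elo_diff and score_interval are made up for illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1).

    Assumes the standard logistic Elo curve; a negative result means the
    program is rated below its opponent.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_interval(score, games, z=1.96):
    """Approximate 95% interval for the true score fraction.

    Assumption: each game is an independent win/loss trial (normal
    approximation to the binomial). Draws would narrow this somewhat.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    return score - z * se, score + z * se


if __name__ == "__main__":
    # Scores and game count are the ones reported in the quoted post.
    for label, score in (("normal", 0.37), ("modified", 0.45)):
        low, high = score_interval(score, 65)
        print(f"{label:9s} score {score:.2f}  "
              f"Elo {elo_diff(score):7.1f}  "
              f"95% range {elo_diff(low):7.1f} .. {elo_diff(high):7.1f}")
    print(f"point estimate of the gain: "
          f"{elo_diff(0.45) - elo_diff(0.37):.1f} Elo")
```

Under these assumptions the point estimate of the gain comes out a little under 60 ELO, but the two 95% ranges overlap heavily, so 65 games per match cannot separate the two versions. And this only covers sampling error against one opponent; it says nothing about the bias from that opponent's style.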
//Peter



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.