Author: Mark Young
Date: 15:17:49 12/11/99
On December 11, 1999 at 17:52:56, Tom King wrote:

>Which of the well known test suites predicts the strength of chess programs most
>accurately?

I don't know of any test suites that are reliable. They are fun to run, but you cannot read much into them. In most test positions there are several moves that also win or draw, but you only get credit for picking the position's designated best move, and most tests do not take points away for picking a really bad move. As you know, your program can play 40 great moves in a game, and it all counts for nothing if it then plays one bad move.

>
>I ask this, because I recently made some *slight* mods. to the evaluation
>function in my program, Francesca. I ran the LCT-2 suite, and the results
>indicated that it was a wash - the modification gave me about 5 ELO points,
>apparently.
>
>I then ran a series of fast games against another amateur program. I realize
>it's important to play a large number of games, to reduce the margin of error,
>so I ran two matches of 65 games. The result was this:
>
>MATCH 1
>"Normal" Francesca scored 37% against the amateur program.
>
>MATCH 2
>"Modified" Francesca scored 45% against the amateur program.
>
>Quite a difference! It implies that the modification is worth over 50 ELO. I
>guess I need to play more games, against a variety of programs to verify whether
>this improvement is real, or imaginary.
>
>Anyhow, beware of reading too much into ELO predictions of test suites..
>
>Cheers All,
>Tom
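
For anyone who wants to check the arithmetic behind the two 65-game matches quoted above, here is a rough sketch. It is not from either poster. It assumes the usual logistic Elo formula and a simple binomial approximation for the margin of error, which ignores draws and so slightly overstates the uncertainty. The function names elo_diff and score_interval are made up for this illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a match score (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_interval(score, games, z=1.96):
    """Approximate 95% interval for a match score.

    Assumption: each game is treated as a win/loss trial (binomial),
    which is a conservative bound when draws are possible.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    low = max(score - z * se, 1e-9)
    high = min(score + z * se, 1.0 - 1e-9)
    return low, high


if __name__ == "__main__":
    games = 65
    for label, score in (("Normal", 0.37), ("Modified", 0.45)):
        low, high = score_interval(score, games)
        print(f"{label}: score {score:.0%}, "
              f"interval {low:.1%} to {high:.1%}, "
              f"Elo vs opponent {elo_diff(score):+.0f}")
    print(f"Implied gain: {elo_diff(0.45) - elo_diff(0.37):.0f} Elo")
```

Under these assumptions the gap between 37% and 45% works out to a bit under 60 Elo, which matches the "over 50" estimate, but each 65-game score carries a margin of roughly plus or minus 12 percentage points, so the two intervals overlap heavily. That supports the point about needing more games before believing the improvement.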
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.