Author: Tom King
Date: 13:33:18 12/12/99
On December 11, 1999 at 18:15:46, Bruce Moreland wrote:

>On December 11, 1999 at 17:52:56, Tom King wrote:
>
>>Which of the well known test suites predicts the strength of chess programs most
>>accurately?
>>
>>I ask this, because I recently made some *slight* mods. to the evaluation
>>function in my program, Francesca. I ran the LCT-2 suite, and the results
>>indicated that it was a wash - the modification gave me about 5 ELO points,
>>apparently.
>>
>>I then ran a series of fast games against another amateur program. I realize
>>it's important to play a large number of games, to reduce the margin of error,
>>so I ran two matches of 65 games. The result was this:
>>
>>MATCH 1
>>"Normal" Francesca scored 37% against the amateur program.
>>
>>MATCH 2
>>"Modified" Francesca scored 45% against the amateur program.
>>
>>Quite a difference! It implies that the modification is worth over 50 ELO. I
>>guess I need to play more games, against a variety of programs to verify whether
>>this improvement is real, or imaginary.
>>
>>Anyhow, beware of reading too much into ELO predictions of test suites..
>
>I don't think that even good suites such as LCT2 predict anything, especially
>when you talk about 5 Elo point differentials. You can have differentials much
>larger than that just because of random search differences.
>
>I don't know how to do the statistics but somehow I think that running two
>65-game matches twice and trying to make sense out of an 8% result difference
>won't work.
>
>bruce

ok. more games against a variety of opponents at differing time levels, then. Right?

Rgds,
Tom
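
For anyone wanting to put rough numbers on the "statistics" question in the quoted exchange, here is a minimal sketch. It is not from either poster. It assumes the usual logistic Elo formula, and it treats every game as a plain win or loss; draws can only shrink the real spread, so the error estimate is on the pessimistic side. The function names are made up for illustration.

```python
import math

def elo_diff(score):
    """Elo difference implied by a match score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_std_error(score, games):
    """Standard error of a match score, assuming win/loss games only.

    This is an upper bound: draws reduce the per-game variance.
    """
    return math.sqrt(score * (1.0 - score) / games)

# Figures quoted in the post: two 65-game matches, 37% and 45%.
normal, modified, games = 0.37, 0.45, 65

# Elo gain implied by taking the two scores at face value.
gain = elo_diff(modified) - elo_diff(normal)

# Standard error of the difference between two independent match scores.
se_diff = math.hypot(score_std_error(normal, games),
                     score_std_error(modified, games))

# How many standard errors apart the two results are.
z = (modified - normal) / se_diff

print(f"implied gain: {gain:.1f} Elo")
print(f"std error of score difference: {se_diff:.3f}")
print(f"difference in standard errors: {z:.2f}")
```

Under these assumptions the implied gain comes out a little under 60 Elo, which matches the "over 50 ELO" reading, but the 8% gap is less than one standard error of the difference. That would be consistent with the suspicion that two 65-game matches cannot separate the two versions.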
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.