Computer Chess Club Archives


Subject: Re: Junior's long lines: more data about this....

Author: Christophe Theron

Date: 16:00:58 12/31/97



On December 28, 1997 at 23:38:12, Don Dailey wrote:

>I did a really interesting study once several years ago.  I took
>a small problem set and adjusted the weights to predict the Swedish
>ratings of several programs.  You can use various methods to do
>this, I used a genetic algorithm.  I was able to come up with a
>formula which was very accurate, within about 10 points for ANY
>program that was involved in the test.
>
>This test should probably be repeated.  It should involve as many
>accurately rated programs as possible.  The opening book hacks that
>some programmers may be using could hurt the accuracy of this
>test though since there is a possibility the book is the main
>source of the program's strength.
>
>To be really accurate I think it's a mistake to only count total
>problems solved.  Time of solution should be a factor, because
>this is meaningful information that should not be thrown away.
>
>But it turns out this is the simplest thing to do.  It's much harder
>to construct a good scoring function for problem sets that takes
>time into account and allows you to not solve some problems.
>
>-- Don

Have you ever heard of the Louguet Chess Test 2 (LCT-II) by the French
journalist/programmer Frederic Louguet? It is a set of 36 positions,
covering "positional", "tactical" and "endgame" problems, that you can
use to measure the strength of any program. It takes the time used to
solve each problem (max 10 minutes), uses a simple rating table, and gives
you the "SSDF" ELO of the program. Surprisingly, it works very well, and
generally gives the right ELO for each known program (on each known
platform) within a 20-point margin.
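
To make the scoring idea concrete, here is a minimal sketch in Python of a
time-based rating table of this kind. The post does not give the actual
table, so the time bands, the points per band, the base rating and the
function names below are all assumptions chosen for illustration; only the
10-minute cap per position comes from the description above.

```python
# Illustrative time bands: (upper bound in seconds, points awarded).
# ASSUMPTION: these values are placeholders, not quoted from the post.
TIME_BANDS = [(9, 30), (29, 25), (89, 20), (209, 15), (389, 10), (600, 5)]

BASE_RATING = 1900   # ASSUMPTION: offset added to the point total
MAX_SECONDS = 600    # 10-minute cap per position, as described in the post


def points_for(solve_seconds):
    """Points for one position; None means it was not solved within the cap."""
    if solve_seconds is None or solve_seconds > MAX_SECONDS:
        return 0
    for upper, points in TIME_BANDS:
        if solve_seconds <= upper:
            return points
    return 0


def estimated_rating(solve_times):
    """Base rating plus the points earned on every position."""
    return BASE_RATING + sum(points_for(t) for t in solve_times)
```

Calling estimated_rating with one entry per position (a solve time in
seconds, or None for an unsolved position) gives the estimate. Unsolved
positions simply score zero, so the function handles both solution time
and failures.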

The French magazine "La Puce Echiquéenne" has published results for many
well-known programs for several years. The main advantage of this
test is that you can get a pretty good idea of the strength of any
program in less than 3 hours. New programs have been rated by the French
magazine months before the SSDF list mentioned them.

I suppose Louguet used a statistical method to build the test (that is,
to pick, from a very large set of positions, the subset that gives the
closest match to the SSDF ranking).
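
How the test was really built is not known here, so the following is only a
sketch of one way such a selection could be done: try every subset of a
given size and keep the one whose predicted ratings are closest (in
squared error) to the known ones. The function names, the data layout and
the base rating are hypothetical, and an exhaustive search like this is
only practical for a small pool of positions; a very large pool would need
some heuristic instead.

```python
from itertools import combinations


def subset_error(subset, scores, ratings, base):
    """Sum of squared differences between predicted and known ratings."""
    err = 0.0
    for program, known in ratings.items():
        predicted = base + sum(scores[program][i] for i in subset)
        err += (predicted - known) ** 2
    return err


def best_subset(scores, ratings, k, base):
    """Exhaustively pick the k positions that best reproduce the ratings.

    scores:  {program: [points on position 0, 1, 2, ...]}  (hypothetical layout)
    ratings: {program: known rating}
    """
    n = len(next(iter(scores.values())))
    return min(combinations(range(n), k),
               key=lambda s: subset_error(s, scores, ratings, base))
```

The result is a tuple of position indexes. Fitting against the ratings of
only a few programs would overfit badly, which is why as many accurately
rated programs as possible should be included.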

LCT is accurate only as long as you don't use it to tune your program. For
example, I have found some changes that gave Tiger a near 2600 "ELO" (on
a PII-300MHz). But in games, this version is very weak.


    Christophe



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.