Computer Chess Club Archives



Subject: Re: Junior's long lines: more data about this....

Author: Don Dailey

Date: 23:24:09 12/31/97


On December 31, 1997 at 19:00:58, Christophe Theron wrote:

>
>On December 28, 1997 at 23:38:12, Don Dailey wrote:
>
>>I did a really interesting study once several years ago.  I took
>>a small problem set and adjusted the weights to predict the Swedish
>>ratings of several programs.  You can use various methods to do
>>this; I used a genetic algorithm.  I was able to come up with a
>>formula which was very accurate, within about 10 points for ANY
>>program that was involved in the test.
>>
>>This test should probably be repeated, and it should involve as many
>>accurately rated programs as possible.  The opening book hacks that
>>some programmers may be using could hurt the accuracy of this
>>test, though, since there is a possibility that the book is the main
>>source of a program's strength.
>>
>>To be really accurate, I think it's a mistake to count only the total
>>number of problems solved.  Time to solution should be a factor too,
>>because it is meaningful information that should not be thrown away.
>>
>>But counting solves turns out to be the simplest thing to do; it's
>>much harder to construct a good scoring function for problem sets that
>>takes time into account and still allows some problems to go unsolved.
>>
>>-- Don
>
>Did you ever hear of the Louguet Chess Test 2 (LCT-II) by the French
>journalist/programmer Frederic Louguet?  It is a set of 36 positions,
>mixing "positional", "tactical", and "endgame" problems, that you can
>use to measure the strength of any program.  It takes into account the
>time used to solve each problem (max 10 minutes), uses a simple rating
>table, and gives you the "SSDF" ELO of the program.  Surprisingly, it
>works very well and generally gives the right ELO for each known
>program (on each known platform) within a 20-point margin.
>
>The French magazine "La Puce Echiquéenne" has published results for
>many well-known programs for several years.  The main advantage of this
>test is that you can get a pretty good idea of the strength of any
>program in less than 3 hours.  New programs have been rated by the
>French magazine months before the SSDF list mentioned them.
>
>I suppose Louguet used a statistical method to build the test (in fact,
>to select from a very large set of positions the subset that gives the
>closest match to the SSDF ranking).
>
>LCT is accurate only if you don't use it to improve your program.  For
>example, I have found some changes that gave Tiger a near-2600 "ELO"
>(on a PII-300MHz), but in real games that version is very weak.
>
>
>    Christophe

I'm very interested in getting this set.  If I do, I will not pay any
attention to the positions or try to understand them; I'll just run
them!

Can you tell me where to get them?


-- Don
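
For illustration, here is a minimal sketch of the kind of weight
fitting described in the quoted study above.  Everything in it is made
up: the toy solve data, the ratings, and the crude mutate-and-keep loop
standing in for the real genetic algorithm.

    import random

    # solved[p][j] = 1 if program p solved problem j (toy data)
    solved  = [[1, 1, 0, 1],
               [1, 0, 0, 1],
               [1, 1, 1, 1]]
    ratings = [2350.0, 2280.0, 2420.0]   # known ratings to fit against

    def predict(weights, base, row):
        return base + sum(w * s for w, s in zip(weights, row))

    def error(weights, base):
        return sum((predict(weights, base, row) - r) ** 2
                   for row, r in zip(solved, ratings))

    # crude evolutionary loop: mutate, keep the mutant if it fits better
    best_w, best_b = [0.0] * 4, 2300.0
    best_e = error(best_w, best_b)
    for _ in range(20000):
        w = [x + random.gauss(0, 5) for x in best_w]
        b = best_b + random.gauss(0, 5)
        e = error(w, b)
        if e < best_e:
            best_w, best_b, best_e = w, b, e

    print(best_w, best_b, best_e)

With three programs and five free parameters this toy version will
overfit trivially, of course; the real test needs many more rated
programs than weights.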
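
And here is a sketch of an LCT-II-style scoring function that folds
solution time in, as discussed above.  The base rating, point values,
and time brackets below are invented for illustration; they are not
Louguet's published table.

    # Each solved position earns points that shrink with solution time;
    # unsolved positions earn nothing, and base + points estimates an ELO.
    BASE = 1900   # assumed base rating, not necessarily the real one

    def points(seconds):
        # invented time brackets -- the real LCT-II table differs
        for limit, pts in ((10, 30), (30, 25), (90, 20),
                           (180, 15), (600, 5)):
            if seconds <= limit:
                return pts
        return 0

    # solve times in seconds, None = not solved within 10 minutes
    times = [4, 55, None, 130, 600, None]
    elo = BASE + sum(points(t) for t in times if t is not None)
    print(elo)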


I did another interesting test once.  I took a randomized database of
positions from master games and noted the master's move in each, a huge
sample of about 20 thousand positions.  I tested at 2, 3, 4, 5, etc.
plies, just to see how often Socrates matched the master move.  I found
a very nice, smooth improvement with depth.  I thought: finally, maybe
this is a decent way to measure improvement!  I would get hundreds more
matches with each jump in depth.

So then I decided to turn off all the big pawn-structure terms and try
the test again.  I had self-tested thoroughly to verify that pawn
structure was indeed a MAJOR source of strength in Socrates; it was
worth perhaps 100 rating points or more.

The results at a given depth came out virtually the same!  I was
completely baffled.  I didn't look into it much further, but my
hypothesis now is that the test has no concept of "weighting": failing
to play the master move by a near-miss is not the same as making a
horrible pawn-structure error, yet the test scores both identically.
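
One possible fix, sketched here with python-chess (purely a sketch,
not part of the original test): charge each mismatch by the evaluation
gap between the master move and the move actually played, so a harmless
alternative costs almost nothing while a structural blunder costs a
lot.

    import chess
    import chess.engine

    def mismatch_cost(engine, board, master, played, depth=8):
        # returns centipawns lost relative to the master move
        if played == master:
            return 0
        limit = chess.engine.Limit(depth=depth)
        s_master = engine.analyse(board, limit,
                                  root_moves=[master])["score"]
        s_played = engine.analyse(board, limit,
                                  root_moves=[played])["score"]
        gap = (s_master.relative.score(mate_score=10000)
               - s_played.relative.score(mate_score=10000))
        return max(gap, 0)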


-- Don


