Computer Chess Club Archives



Subject: Re: CM6 and the relevance of LCTII

Author: Bernhard Bauer

Date: 07:44:01 05/18/99



On May 18, 1999 at 10:11:45, Chris Carson wrote:

>On May 17, 1999 at 13:32:57, Goette Patrick wrote:
>
>>Hello,
>>
>>As a new user of Chess Master 6000 (I never had a previous version of CM),
>>I would like to know, from your experience with this software, whether CM
>>has a reputation for being weak in positional play, as opposed to its
>>tactical play and its strength in endgames.
>>Let me explain: I just tested 3 programs with the LCT II test (by Louguet,
>>a French author). Here are the results I obtained on my 166 MHz Pentium
>>with 32 MB RAM (hash table sizes set identically):
>>
>>                      Virtual Chess 1.02   Hiarcs 6.0       CM6000
>>
>>Positional play :     200/420 = 47.6%      210/420 = 50%    145/420 = 34.5%
>>
>>Tactical play :       195/360 = 54%        235/360 = 65.3%  280/360 = 77.8%
>>
>>Endgames :            60/270  = 22%        95/270  = 35.2%  120/270 = 44.4%
>>
>>TOTAL (LCT II Elo):   2355                 2440             2445 (!)
>>
>>
>>Well, my questions are these:
>>
>>How do you explain the relatively weak positional-play performance of CM6000
>>in comparison with the two other programs?
>>
>>How relevant is the LCT II test for evaluating the strength of programs, when
>>we know the true ELO rating of CM6000 is at least 100 points higher?
>>
>>Which tests are the most relevant for estimating the true ELO rating of a
>>chess program? (I mean tests whose results come close to the ELO estimated
>>from many, many games played as usual.)
>>
>>Every remark, opinion, explanation welcome.
>>Patrick Goette
>>patrick.goette@smile.ch
>
>Most test suites are good for the machine they were calibrated to.  The problem
>is that when you move away from that machine, the time limits may not provide
>an accurate rating (although useful information about evaluation and search can
>still be obtained).
>
>I see these problems with most test suites:
>
>1.  Some problems are too easy to solve (< 1 min, PII 300).  Small
>    solution times make scaling to faster machines difficult
>    and make it hard to distinguish between programs.
>
>2.  Some problems are too hard to solve (> 3 min, PII 300).  Large
>    solution times make scaling to slower machines difficult and
>    make it hard to distinguish between programs.
>
>3.  A wide range of suitable problems is difficult to identify.  Each needs a
>    clear-cut best solution that is hidden several plies deep, and different
>    types are needed: quiet positions (no clear tactics), clear combinations,
>    and endings.
>
>My test suite requirements look like this:
>
>1.  36 epd positions.
>2.  12 quiet positions, 12 combinations, 12 endings
>    with clear best moves or clear alternate
>    moves.
>3.  1 to 3 min average solution time on a PII 300 for the top 10
>    commercial programs.  This would make the total test
>    time 36 to 108 minutes on a PII 300 machine; it would
>    increase for slower machines and decrease for faster
>    machines.  The positions should produce wide variation among
>    different search and evaluation styles.
>4.  Known PV's for 6 ply.
>5.  Let the program/hardware run to the solution time (hold for 3 plies).
>    Use a log formula to calculate a rating (Kaufman proposed
>    2930 - 200*log(T), where T is the total solution time).  Something like
>    this would produce good rating estimates on slower and faster machines
>    (the equation would need to be calibrated across 3 different machines;
>    a DX-66, P90 and P200MMX could be used to calibrate against the SSDF
>    list and verify against PII 300 results).

A formula of this type is not a good idea because T may become infinite: with
2930 - 200*log(T), a single position that is never solved drives T, and with it
the rating, without bound.  Otherwise you have to cap the solution time.
So I would propose a formula of this type instead:

    Melo = BaseElo + EloRange * (1 / NumberOfTestcases) *
                     sum_i { 1 / (1 + time_i / ReferenceTime) }
Where
    Melo         : meaningless elo number
    BaseElo      : lowest elo number that can be achieved by this test
    EloRange     : the range of this test
    sum_i        : sums up {} for each position
    time_i       : the time needed to solve position i
    ReferenceTime: a reference value for this test, in the same units as time_i.
                   For example, ReferenceTime = 300 (seconds) would add 0.5 to
                   the sum for a problem solved in 5 min.
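
As an illustration, here is a small sketch in C of this formula.  The constant
values for BaseElo, EloRange and ReferenceTime below are only assumptions for
the example, not part of any proposal; the point is that an unsolved position
simply contributes 0 to the sum, so Melo always stays finite.

    #include <stdio.h>

    #define NUM_TESTCASES   36      /* requirement 1 in the quoted list above  */
    #define BASE_ELO      1800.0    /* assumed value, just for the example     */
    #define ELO_RANGE      900.0    /* assumed value, just for the example     */
    #define REFERENCE_TIME 300.0    /* 300 s = 5 min, as in the example above  */

    /* Melo = BaseElo + EloRange * (1/N) * sum_i 1/(1 + time_i/ReferenceTime).
       A position that is never solved is marked with a negative time and
       adds 0 to the sum, so the result is always finite.                      */
    double melo(const double time_s[], int n)
    {
        double sum = 0.0;
        int i;

        for (i = 0; i < n; i++)
            if (time_s[i] >= 0.0)                      /* position was solved  */
                sum += 1.0 / (1.0 + time_s[i] / REFERENCE_TIME);
        return BASE_ELO + ELO_RANGE * sum / n;
    }

    int main(void)
    {
        double times[NUM_TESTCASES];
        int i;

        for (i = 0; i < NUM_TESTCASES; i++)
            times[i] = 300.0;               /* every position solved in 5 min  */
        /* each position adds 0.5, so Melo = 1800 + 900 * 0.5 = 2250           */
        printf("Melo = %.0f\n", melo(times, NUM_TESTCASES));
        return 0;
    }

With these example numbers a program that solves every position in exactly
ReferenceTime seconds lands in the middle of the range, and one that solves
nothing stays at BaseElo.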


>6.  You should always buy based on actual games played; that gives the
>    true rating.  The test is just an estimate and provides information
>    about search/evaluation weaknesses and speed.
>
>This is just my opinion; I am working on a test suite for my own program
>that has these characteristics.
>
>Best Regards,
>Chris Carson

Kind regards

Bernhard


