Computer Chess Club Archives


Subject: Re: CM6 and the relevance of LCTII

Author: Chris Carson

Date: 07:11:45 05/18/99

On May 17, 1999 at 13:32:57, Goette Patrick wrote:

>Hello,
>
>As I am a new user of Chess Master 6000 (I never had a previous version of CM
>before), I would like to know, based on your experience with this software,
>whether CM has a reputation of being weak in positional play, as opposed to its
>tactical play and its strength in the endgame.
>Let me explain: I just tested 3 programs with the LCT II chess test (by Louguet,
>a French author). Here are the results I obtained on my 166 MHz Pentium with
>32 MB RAM (hash tables set to the same size):
>
>                      Virtual Chess 1.02   Hiarcs 6.0        CM6000
>
>Positional play :     200/420 = 47.6%      210/420 = 50%     145/420 = 34.5%
>
>Tactical play :       195/360 = 54%        235/360 = 65.3%   280/360 = 77.8%
>
>Endgames :            60/270 = 22%         95/270 = 35.2%    120/270 = 44.4%
>
>TOTAL (LCT II ELO) :  2355                 2440              2445 (!)
>
>
>Well, my questions are these:
>
>How do you explain the relatively weak positional-play performance of CM6000 in
>comparison with the 2 other programs?
>
>What is the relevance of the LCT II test for evaluating the strength of
>programs, when we know the true ELO rating of CM6000 is at least 100 points
>higher?
>
>Which tests are the most relevant for estimating the true ELO rating of a chess
>program? (I mean tests whose results come close to the ELO estimated over many,
>many games played as usual.)
>
>Every remark, opinion, explanation welcome.
>Patrick Goette
>patrick.goette@smile.ch

Most test suites are good for the machine they were calibrated on.  The
problem is that when you move away from that machine, the time limits may not
produce an accurate rating (although useful information about evaluation and
searching can still be gathered).

I see these problems with most test suites:

1.  Some problems are too easy to solve (< 1 min on a PII-300).  Small
    solution times make scalability to faster machines difficult and
    make it difficult to distinguish between programs.

2.  Some problems are too hard to solve (> 3 min on a PII-300).  Large
    solution times make scalability to slower machines difficult and
    make it difficult to distinguish between programs.

3.  A wide range of problems is difficult to put together.  Each position
    needs a clear-cut best solution, but one hidden several plies deep.
    Different types are needed: quiet positions (no clear tactics), clear
    combinations, and endings.

My test suite requirements look like this:

1.  36 EPD positions.
2.  12 quiet positions, 12 combinations, and 12 endings,
    each with a clear best move or clear alternate moves.
3.  1 to 3 min average solution times on a PII-300 for the top 10
    commercial programs.  This would make the total test
    time 36 to 108 minutes on a PII-300 machine; it would
    increase for slower machines and decrease for faster
    machines.  This should produce wide variation among different
    searching and evaluation styles.
4.  Known PVs for 6 plies.
5.  Let the program/hardware run to the solution time (the solution must
    be held for 3 plies).  Use a log formula to calculate a rating
    (Kaufman proposed 2930 - 200*log(T), where T is the total solution
    time); something like this would produce good rating estimates on
    slower and faster machines.  The equation would need to be calibrated
    across 3 different machines; a DX-66, a P90, and a P200 MMX could be
    used to calibrate against the SSDF list and verify against PII-300
    results.  (A rough sketch of this rating calculation, with assumed
    units, follows the list below.)
6.  You should always buy based on actual games played; this gives a
    true rating.  The test is just an estimate and provides information
    about searching/evaluation weaknesses and speed.
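
Here is a minimal sketch of how point 5 might be wired together, in C, under
several assumptions of mine: solve_position() is only a placeholder for the
real engine call (no interface is specified above), the log in
2930 - 200*log(T) is taken as base 10, and T is taken as total minutes, since
neither the base nor the units are given.

    #include <math.h>
    #include <stdio.h>

    #define NUM_POSITIONS 36   /* 12 quiet, 12 combinations, 12 endings */

    /* Placeholder for the engine: would search the EPD position and return
       the time in minutes at which the best move was found and then held
       for 3 plies.  Here it just returns a dummy value. */
    static double solve_position(const char *epd) {
        (void)epd;
        return 2.0;
    }

    int main(void) {
        static const char *suite[NUM_POSITIONS] = { 0 };  /* the 36 EPD strings go here */
        double total_minutes = 0.0;

        for (int i = 0; i < NUM_POSITIONS; i++)
            total_minutes += solve_position(suite[i]);

        /* Rating estimate from total solution time; log base 10 is an assumption. */
        double rating = 2930.0 - 200.0 * log10(total_minutes);
        printf("Total time: %.1f min, estimated rating: %.0f\n", total_minutes, rating);
        return 0;
    }

With the dummy 2 minutes per position, T = 72 and the estimate comes out near
2560; the constants 2930 and 200 would of course still have to be calibrated
across machines as described in point 5.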

This is just my opinion; I am working on a test suite for my own program
that has these characteristics.

Best Regards,
Chris Carson


