Author: Chris Carson
Date: 07:11:45 05/18/99
On May 17, 1999 at 13:32:57, Goette Patrick wrote:
>Hello,
>
>As I am a new user of Chess Master 6000 (I never had an earlier version of CM
>before), I would like to know, from your experience with this software,
>whether CM has a particular reputation for being weak in positional play, as
>opposed to its tactical play and its strength in endgames.
>Let me explain: I just tested 3 programs with the LCTII chess test (by
>Louguet, a French author). Here are the results I obtained on my 166 MHz
>Pentium with 32 MB RAM (hash tables equalized):
>
>                     Virtual Chess 1.02      Hiarcs 6.0        CM6000
>
>Positional play :    200/420 = 47.6%         210/420 = 50%     145/420 = 34.5%
>
>Tactical play :      195/360 = 54%           235/360 = 65.3%   280/360 = 77.8%
>
>Endgames :           60/270 = 22%            95/270 = 35.2%    120/270 = 44.4%
>
>TOTAL :              2355 ELO (per LCTII)    2440 ELO          2445 ELO (!)
>
>
>Well my questions are these :
>
>How do you explain the relatively weak performance of CM6000 in positional
>play compared with the 2 other programs?
>
>How relevant is the LCTII test for evaluating the strength of programs, when
>we know the true ELO rating of CM6000 is at least 100 points higher?
>
>Which tests are the most relevant for evaluating the true ELO rating of a
>chess program? (I mean tests whose results come close to the ELO estimated
>through many, many games played as usual.)
>
>Every remark, opinion, explanation welcome.
>Patrick Goette
>patrick.goette@smile.ch
Most test suites are only good for the machine they were calibrated to. The
problem is that when you move away from that machine, the time limits may no
longer produce an accurate rating (although the results still provide useful
information about evaluation and search).
I see these problems with most test suites:
1. Some problems are too easy to solve (< 1 min on a PII 300). Short
   solution times make scaling to faster machines difficult and make it
   difficult to distinguish between programs.
2. Some problems are too hard to solve (> 3 min on a PII 300). Long
   solution times make scaling to slower machines difficult and make it
   difficult to distinguish between programs.
3. A wide range of problem types is difficult to assemble. Each position
   needs a clear-cut best solution, but one hidden several ply deep. You
   need different types: quiet positions (no clear tactics), clear
   combinations, and endings.
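The timing constraints in points 1 and 2 can be sketched as a simple filter over candidate positions. This is a hypothetical helper, not anything from an actual suite; the names and measured times are illustrative, and the 1-3 minute window matches the PII 300 thresholds above:

```python
# Keep only candidate positions whose measured solution time on the
# reference machine (a PII 300 here) falls inside the useful window.

def in_time_window(solution_time_s, min_s=60, max_s=180):
    """True if the position is neither too easy (< 1 min)
    nor too hard (> 3 min) on the reference machine."""
    return min_s <= solution_time_s <= max_s

# Hypothetical measured average solution times in seconds:
candidates = {"pos1": 30, "pos2": 95, "pos3": 240, "pos4": 178}
keep = [name for name, t in candidates.items() if in_time_window(t)]
# pos1 is too easy and pos3 too hard, so only pos2 and pos4 survive.
```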
My test suite requirements look like this:
1. 36 EPD positions.
2. 12 quiet positions, 12 combinations, and 12 endings, each with a clear
   best move or clear alternate moves.
3. Average solution times of 1 to 3 min on a PII 300 for the top 10
   commercial programs. This would make the total test time 36 to 108
   minutes on a PII 300 machine; it would increase for slower machines
   and decrease for faster machines. The positions should produce wide
   variation among different searching and evaluation styles.
4. Known PVs to 6 ply.
5. Let the program/hardware run until it finds the solution (and holds it
   for 3 ply). Use a log formula to calculate a rating (Kaufman proposed
   2930 - 200*log(T), where T is the total solution time); something like
   this would produce good rating estimates on both slower and faster
   machines. The equation would need to be calibrated across 3 different
   machines; a DX-66, P90, and P200MMX could be used to calibrate against
   the SSDF list, verified against PII 300 results.
6. You should always buy based on actual games played; that gives a true
   rating. The test is just an estimate, and it provides information
   about search/evaluation weaknesses and speed.
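As a sketch of point 5, the proposed formula is easy to compute directly. Note the original quote does not specify the log base or the units of T, so base 10 and minutes are my assumptions here; as noted above, the constants would need recalibration across machines anyway:

```python
import math

def rating_estimate(total_minutes):
    """Kaufman's proposed formula as quoted: 2930 - 200*log(T).
    Log base 10 and T in minutes are assumptions, not from the post."""
    return 2930 - 200 * math.log10(total_minutes)

# Under these assumptions, a program finishing the full suite in
# 36 minutes would estimate at roughly 2930 - 200*1.56, about 2619.
```

The appeal of a log formula is that halving the hardware speed (doubling T) costs a fixed number of rating points rather than a fixed percentage, which is roughly how Elo behaves with search depth.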
This is just my opinion; I am working on a test suite with these
characteristics for my own program.
Best Regards,
Chris Carson