Computer Chess Club Archives


Subject: Re: Contrast in playing strength.

Author: Mark Young

Date: 18:33:14 08/31/98


On August 31, 1998 at 19:46:39, Robert Hyatt wrote:

>On August 31, 1998 at 17:57:02, Mark Young wrote:
>
>>Is computer Vs computer testing now useless in gauging a chess program's
>>strength playing humans? Crafty gets killed playing Junior 5 by a wide
>>margin, and Fritz 5 draws a match with Rebel 10 even when Rebel 10 has a 2x
>>hardware advantage. Is it time to abandon computer Vs computer testing
>>altogether? Or are we going to have two standards to judge chess programs: one
>>chess program being the best playing other chess programs, and one chess program
>>being the best playing humans? And if so, what is the best standard to judge a
>>program's overall strength? Is it better marketing to show you can destroy all
>>other programs, like Junior 5 and Fritz 5 can do, or is it better to show you can
>>beat a top grandmaster, like Rebel 10 can do?
>
>
>As I've said many times, you are talking about *two* different games at
>present.  As an example, take CSTal, which might do very well against a
>human with its speculative/complicated style, but which does very badly
>against fast searchers.  If you were to measure CSTal's worth by only
>playing against fast programs, you might toss it out.  If you only measured
>it by playing against humans, you might decide it is the best there is.  In
>reality, both answers (or neither answer) could be right...

I know, but the gulf between computer Vs computer results and computer Vs human
results has now reached the point of absurdity. Take your program, for example.
Crafty went -16, =4, +0 playing Junior 5. If you add the games I played with
Fritz 5, also a fast searcher, the results are -21, =4, +0 for Crafty. That would
suggest a rating difference of 392 points, which would make Crafty only an
expert-rated chess player. Crafty is not the only program that suffers from this.
M-Chess Pro has taken a big hit in Ed's testing. Rebel has too, but to a lesser
extent. The point is that the results are now so skewed. Is computer Vs computer
testing more harmful than helpful to chess programmers and the buying public?
Or is the goal of chess programmers now just to try to beat each other, and
to hell with the consumer and how the program performs when playing people?
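
For anyone who wants to check how a lopsided match score turns into a rating
gap, here is a minimal sketch. It assumes the standard logistic Elo expectation
formula; the post does not say which formula or table produced the 392 figure,
so the function name and the choice of formula are illustrative assumptions,
not the method used above.

```python
import math


def elo_difference(wins: int, draws: int, losses: int) -> float:
    """Estimate the rating gap implied by a match result.

    Assumes the logistic Elo model: expected score = 1 / (1 + 10 ** (-d / 400)).
    A negative return value means the player scored below 50%.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games  # a draw counts as half a point
    if score <= 0.0 or score >= 1.0:
        # A 0% or 100% score implies an unbounded gap under this model.
        raise ValueError("rating difference is undefined for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Crafty's combined result from the post: +0, =4, -21 over 25 games.
    print(round(elo_difference(wins=0, draws=4, losses=21)))
```

Under this assumed formula the combined result (2 points from 25 games) works
out to a gap of roughly 424 points, somewhat larger than the 392 quoted above;
a different formula or lookup table would give a different number. With only 25
games, any such estimate carries a wide margin of error.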


