Computer Chess Club Archives



Subject: Re: Engine v engine tests not reliable

Author: Phil Dixon

Date: 12:14:02 06/15/99


On June 15, 1999 at 13:53:22, Robert Hyatt wrote:

>On June 15, 1999 at 13:02:48, Dann Corbit wrote:
>
>>On June 15, 1999 at 12:31:28, eric guttenberg wrote:
>>
>>>Today I read a post here by Jim Walker in which he reports 80 games at g/5min
>>>between Fritz 5.32 and Hiarcs 7.32 with virtually equal results.  Significantly
>>>this was not engine v engine, and suggests that all these engine v engine
>>>results where H7.32 wins about 80% of the points are not valid. Again and
>>>again there are posts suggesting that the engine v engine results are not
>>>reliable and again and again these cautions are flung down and danced upon
>>>by repeated postings claiming a huge superiority by H7.32 over F5.32 based on
>>>yet another engine v engine match.
>>
>>But when we use the native interface, it is nothing but a shiny cover over the
>>engine.  If the engine is superior, then how is changing the interface going to
>>alter anything?  If it does then there is some kind of bug, or the human running
>>the programs and entering the moves manually is doing something wrong.
>>
>>>The non-engine v engine results I have seen suggest that Hiarcs' superiority
>>>over Fritz is significant but not overwhelming and in the case of g/5 min
>>>blitz games, F5.32 may be as good or better.
>>
>>A carefully controlled scientific experiment can confirm or deny the hypothesis.
>
>
>The "interface" isn't always "just a shiny cover".  The interface has to switch
>between the two engines to let them play, and this can be very disruptive if not
>done right...
>
>Best approach is two computers, two interfaces, one operator between them... or
>something like auto232.  Testing on the _same_ computer is simply not a valid
>test...

I agree completely, Bob.  The games I have posted are there simply for viewing
pleasure, and people may draw their own conclusions.  Is any one program better
than another at the top levels?  IMO, there is little difference between them,
and I include Crafty among the top programs as well.

Regards,
Phil
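
For anyone who wants to check whether two match results really disagree, or
whether the gap could be sampling noise, here is a minimal sketch in Python.
It computes the score fraction of a match, an approximate 95% interval using a
normal approximation, and the Elo difference implied by the standard logistic
rating formula. The thread gives only totals ("80 games", "virtually equal",
"about 80% of the points"), so the win/draw/loss splits below are hypothetical
and chosen only to match those totals. The function names are mine, not from
any chess tool.

```python
import math


def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1), logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def match_summary(wins, draws, losses, z=1.96):
    """Return (score, score interval, Elo interval) for one match.

    Uses a normal approximation, which is rough for short matches.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    half = z * math.sqrt(var / n)
    # Clamp so the Elo formula stays defined at the extremes.
    lo = max(score - half, 1e-9)
    hi = min(score + half, 1.0 - 1e-9)
    return score, (lo, hi), (elo_from_score(lo), elo_from_score(hi))


# Hypothetical splits (assumptions, not data from the thread):
# an 80-game match that ends level, and one scored at 80%.
even = match_summary(30, 20, 30)
lopsided = match_summary(58, 12, 10)

if __name__ == "__main__":
    for name, (score, s_iv, e_iv) in (("even", even), ("lopsided", lopsided)):
        print(f"{name}: score={score:.3f} "
              f"interval=({s_iv[0]:.3f}, {s_iv[1]:.3f}) "
              f"elo=({e_iv[0]:.0f}, {e_iv[1]:.0f})")
```

With splits like these the two intervals do not overlap, which would suggest
that chance alone is unlikely to explain an even result in one setup and an
80% result in another. That fits the point made above that the test conditions
(interface, same machine or two machines) deserve scrutiny. Real splits would
be needed before drawing any conclusion.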



Last modified: Thu, 15 Apr 21 08:11:13 -0700
