Computer Chess Club Archives


Subject: Re: Junior - Crafty NPS Challenge - a user experiment

Author: Robert Hyatt

Date: 07:10:08 11/25/03


On November 25, 2003 at 06:57:05, Sune Fischer wrote:

>On November 24, 2003 at 23:19:27, Robert Hyatt wrote:
>>
>>
>>Yeah, but when comparing humans, there is no decimal point in my
>>comparisons.  RE: FIDE ratings.
>>
>>:)
>
>There are two questions here:
>
>A) how strong are the two engines in relation to each other?
>B) which of the two engines is stronger?
>
>You want to know A, you always want to know A, and therefore you care very much
>about #draws.
>Knowing A is much better than knowing B, by knowing A we also know B!
>
>Knowing B however is 'enough' and much easier to answer in general.

There we don't agree.  Perhaps in the case of computers, you play 1000
draws and then lose a single game.  Later you discover that you lost
because the hardware was screwing up on an ADD for a specific combination
of 2's complement negative numbers, producing a positive result (this
happened to me on a Cray, once).  That one loss makes you look worse, when
the 1000 draws suggest equality.
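
To put rough numbers on that example, here is a minimal sketch in Python.
It assumes the usual logistic Elo model (expected score
1 / (1 + 10^(-d/400))); the function names elo_diff and stronger_side are
made up for illustration and are not taken from any engine or rating tool.
It contrasts question A (how far apart?) with question B (who is stronger?)
for 0 wins, 1000 draws, 1 loss.

```python
import math


def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score (question A).

    Assumes the standard logistic Elo model; a draw counts as half a point.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("a 0% or 100% score gives an unbounded Elo difference")
    return -400.0 * math.log10(1.0 / score - 1.0)


def stronger_side(wins, losses):
    """Answer question B by looking only at decisive games; draws are ignored."""
    if wins > losses:
        return "us"
    if losses > wins:
        return "opponent"
    return "equal"


if __name__ == "__main__":
    # The scenario from the post: 1000 draws, then a single loss.
    print("Elo difference: %.2f" % elo_diff(0, 1000, 1))
    print("Stronger side:  %s" % stronger_side(0, 1))
```

Under those assumptions the implied difference is a fraction of one Elo
point, while the B-only answer flatly says "opponent" on the strength of a
single game.  The draws are what keep the estimate near equality.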

>
>The only thing that keeps this from being really interesting in practice, IMO,
>is that selfplay matches do not necessarily give a valid picture of strength
>relations, which means answering B isn't enough to see if there is progress.
>By playing against a third alien engine we are going to need an answer to A to
>see if there is any progress.
>

I don't think that is good enough.  The "alien engine" might not hit on a
weakness in the new version that the old version does not have.  I.e., if you
play against an engine with weak king safety, your endgame bug might never
cause a problem since you mop him up tactically due to the exposed king.




>-S.



Last modified: Thu, 15 Apr 21 08:11:13 -0700
