Computer Chess Club Archives


Subject: Re: Diep and Falcon #2 and 3

Author: Dann Corbit

Date: 21:28:48 04/30/04


On April 30, 2004 at 22:44:40, Chessfun wrote:

>
>Diep is now in the #3 programs
>http://www.talkchess.com/forums/1/message.html?362447
>
>And Falcon is a Grandmaster strength program about 2700 ELO.
>
>And assuming "Shredder 8 is the only engine that consistently scores above 50%
>against Falcon in my tests"
>http://www.talkchess.com/forums/1/message.html?362348 we can therefore assume
>it's #2
>
>That leaves Shredder 8 at #1.
>
>Lucky both the #2 and #3 program are neither for sale or available else some may
>even report they are #1 ;-)
>
>I would suggest to both programmers that they get a good team of beta testers
>and start posting game scores and results that would be deemed realistic.

I don't think either post makes that claim, though I do understand that people
will extrapolate it that way.

The tests Joe runs in his basement will have a very high degree of uncertainty.
It is not at all unusual for someone to run ten or twenty games and then make
judgements from that.  If the result is very lopsided, it is not even a terrible
idea to do it.
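
To put rough numbers on that uncertainty, here is a minimal Python sketch.  It
assumes the usual logistic Elo model and a normal approximation for the match
score, which is itself crude at ten or twenty games.  The function names and
the example result (+9 =6 -5) are made up for illustration and are not taken
from anyone's actual tests.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_interval(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) Elo difference for one match.

    Uses a normal approximation on the mean score per game, so treat the
    bounds as rough, especially for short or very lopsided matches.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)

    def clamp(s):
        # Keep scores strictly inside (0, 1) so the logarithm stays finite.
        return min(max(s, 0.001), 0.999)

    return (elo_from_score(clamp(score)),
            elo_from_score(clamp(score - z * se)),
            elo_from_score(clamp(score + z * se)))

# Hypothetical 20-game match: 9 wins, 6 draws, 5 losses.
est, low, high = match_interval(9, 6, 5)
print(f"estimate {est:+.0f} Elo, rough 95% range {low:+.0f} to {high:+.0f}")
```

Under these assumptions, a 12/20 result is consistent with anything from the
"winner" actually being somewhat weaker to it being a couple of hundred points
stronger.  Only a very lopsided score pushes the whole range to one side of
zero.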

My point is that there have been private engines that were extraordinarily
strong.  Ferret springs to mind.  At one time, it may have been one of the top
three engines in the world (no idea now, since it does not even seem to play any
more).

How strong are Diep and Falcon?  Quite frankly, nobody knows.  I do not believe
that the experimenters are telling stories.  I do believe that chances are good
that the data volume is low and the uncertainty is high.

Since you run so many extensive tests, I am very sure that you have seen a
number of interesting reversals.

We also do not know what kind of tests are being performed.  Perhaps (for
instance) an engine does incredibly well in middle game results.  If EPD test
suites are run against middle game positions, an appearance of sizeable
superiority could emerge.  But with a buggy book, will the engine get there?
And will the engine squander it in the endgame?

Until there are a large number of controlled tests by neutral parties for which
there is public access to the data, we can do nothing more than speculate.

And there is always the possibility that Diep and Falcon _are_ the second and
third strongest engines.  The only way to know is to accumulate the data.
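
As a back-of-the-envelope view of how much data "accumulate" means, the sketch
below estimates how many games are needed before the rough 95% error bar
shrinks to a given Elo margin.  It assumes two engines of nearly equal
strength, a per-game standard deviation of 0.4 (a guess that depends on the
draw rate), and the same logistic Elo model and normal approximation as any
simple rating calculation.  The function name and defaults are hypothetical.

```python
import math

def games_needed(elo_margin, per_game_sd=0.4, z=1.96):
    """Approximate games needed for a +/- elo_margin error bar near 50%.

    per_game_sd is an assumed standard deviation of a single game result
    (1, 0.5 or 0); it is lower when draws are common.
    """
    # Slope of the Elo curve at a 50% score: d(Elo)/d(score) = 400 / (ln 10 * 0.25)
    elo_per_score = 400.0 / (math.log(10.0) * 0.25)
    score_margin = elo_margin / elo_per_score
    return math.ceil((z * per_game_sd / score_margin) ** 2)

for margin in (50, 20, 10):
    print(f"+/-{margin} Elo: about {games_needed(margin)} games")
```

The required number of games grows with the inverse square of the margin, so
halving the error bar takes roughly four times as many games.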




Last modified: Thu, 15 Apr 21 08:11:13 -0700
