Computer Chess Club Archives



Subject: Re: Diep and Falcon #2 and 3

Author: Chessfun

Date: 22:09:54 04/30/04


On May 01, 2004 at 00:28:48, Dann Corbit wrote:

>On April 30, 2004 at 22:44:40, Chessfun wrote:
>
>>
>>Diep is now in the #3 programs
>>http://www.talkchess.com/forums/1/message.html?362447
>>
>>And Falcon is a Grandmaster strength program about 2700 ELO.
>>
>>And assuming "Shredder 8 is the only engine that consistently scores above 50%
>>against Falcon in my tests"
>>http://www.talkchess.com/forums/1/message.html?362348 we can therefore assume
>>it's #2
>>
>>That leaves Shredder 8 at #1.
>>
>>Lucky that both the #2 and #3 programs are neither for sale nor available, else
>>some may even report they are #1 ;-)
>>
>>I would suggest to both programmers that they get a good team of beta testers
>>and start posting game scores and results that would be deemed realistic.
>
>I don't think either post makes that claim, while at the same time I do
>understand that people will extrapolate it in that way.
>
>The tests Joe runs in his basement will have a very high degree of uncertainty.
>It is not at all unusual for someone to run ten or twenty games and then make
>judgements from that.  If the result is very lopsided, it is not even a terrible
>idea to do it.
>
>My point is that there have been private engines that were extraordinarily
>strong.  Ferret springs to mind.  At one time, it may have been one of the top
>three engines in the world (no idea now, since it does not even seem to play any
>more).
>
>How strong are Diep and Falcon?  Quite frankly, nobody knows.  I do not believe
>that the experimenters are telling stories.  I do believe that chances are good
>that the data volume is low and the uncertainty is high.

Vincent seems to know ;-)

>Since you run so many extensive tests, I am very sure that you have seen a
>number of interesting reversals.
>
>We also do not know what kind of tests are being performed.  Perhaps (for
>instance) an engine does incredibly well in middle game results.  If EPD test
>suites are run against middle game positions, an appearance of sizeable
>superiority could emerge.  But with a buggy book, will the engine get there?
>And will the engine squander it in the endgame?
>
>Until there are a large number of controlled tests by neutral parties for which
>there is public access to the data, we can do nothing more than speculate.
>
>And there is always the possibility that Diep and Falcon _are_ the second and
>third strongest engines.  The only way to know is to accumulate the data.

I think that is about what I said in the last paragraph. Though honestly... I've
heard it all before.

Sarah.
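
The point quoted above about ten or twenty games carrying a very high degree of uncertainty can be made concrete with a rough calculation. The sketch below is not from either poster: it assumes the standard logistic Elo formula and a normal approximation to the match score, and the function names (elo_diff, match_elo_interval) are made up for illustration.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_elo_interval(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) Elo difference for a match result.

    Assumption: a normal approximation to the mean score, with z=1.96
    for a roughly 95% interval. This is crude for very short matches.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    eps = 1e-6  # keep scores away from 0 and 1, where the Elo formula diverges

    def clamp(s):
        return min(max(s, eps), 1.0 - eps)

    return (elo_diff(clamp(score)),
            elo_diff(clamp(score - z * se)),
            elo_diff(clamp(score + z * se)))

if __name__ == "__main__":
    for w, d, l in [(10, 0, 10), (1000, 0, 1000)]:
        est, low, high = match_elo_interval(w, d, l)
        print(f"+{w} ={d} -{l}: {est:.0f} Elo, interval [{low:.0f}, {high:.0f}]")
```

Under these assumptions, an even 20-game match with no draws leaves an interval of well over 100 Elo on either side, while 2000 games at the same ratio narrows it to under 20. Draws reduce the per-game variance, so real matches will be somewhat tighter than the no-draw case.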



Last modified: Thu, 15 Apr 21 08:11:13 -0700
