Computer Chess Club Archives


Subject: Re: Is the SSDF taking a break from testing?

Author: Sune Fischer

Date: 16:27:56 07/15/05


On July 15, 2005 at 18:29:28, Dann Corbit wrote:
>
>The time difference is really about 4:1

Yes, but they use faster machines.

Anyway the point is simply that what we call quality today is what we will call
crap in 3 years. Just like what we called quality 3 years ago is what we call
crap today.

Am I the only one who can see how ridiculous that is?

>>How many people actually go over these tons and tons of automated games anyway..
>
>Not many.  I also find the games between the best programs very useful for book
>building.  I would not trust the CEGT games for that purpose.

I would prefer to use GM games, still.
Needs another few years for "the quality" to be there ;)

>>In order to construct a usable and interesting rating list priority number 1 is
>>to have enough games for a reliable rating, otherwise it _is_ going to be
>>statistical garbage.
>
>Controlling the environment of the test so that it is reproducible is probably
>in the same range of importance as the large number of games.

I guess saving the logs should be enough, who is going to reproduce a long
tournament anyway? :)

But actually I agree with you, which is why I _don't_ like that the SSDF use
books and learning. Fixed start positions give full control every single time.
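
To put a rough number on the "enough games" point above, here is a minimal sketch of how the uncertainty of a measured Elo difference shrinks with the number of games. It assumes the usual logistic Elo model and a normal approximation for the score; the function names and the win/draw/loss counts are made up for illustration and are not taken from any rating list.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo difference for a match result.

    Uses a normal approximation on the mean score (an assumption);
    z=1.96 corresponds to roughly a 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp so the logarithm stays defined for lopsided results.
    low = max(score - z * se, 1e-6)
    high = min(score + z * se, 1.0 - 1e-6)
    return elo_diff(low), elo_diff(score), elo_diff(high)

if __name__ == "__main__":
    # Hypothetical results with the same score but different sample sizes.
    for w, d, l in [(40, 20, 40), (400, 200, 400)]:
        lo, mid, hi = elo_interval(w, d, l)
        print(f"{w + d + l:5d} games: {mid:+.1f} Elo, interval {lo:+.1f} .. {hi:+.1f}")
```

The width of the interval falls off approximately with the square root of the number of games, so ten times as many games gives an interval about three times narrower.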

>The AEGT and CEGT games do not seem to be held at a consistent time control.

That's not so good obviously, but probably the price you have to pay when making
a rating list in a big distributed manner.

>The AEGT and CEGT contests assume the NUNN positions as openings, and so they do
>not exercise the opening book of the program being tested.  That is fine to
>measure engine strength, but it will not tell you about book+program and it will
>not help you to prepare for that opponent (if it is a goal).

People tend to make their own books and in tournaments the authors always use
special (handcrafted) books, so in general it's a good idea to keep engine and
book separated when measuring.

>The older programs have more games against them and therefore are more accurate
>as measuring tools.  But a lot of people get hot under the collar about running
>games on 450 MHz computers when they do that.

It doesn't matter if you have stable engines. What you do is you run elostat on
the whole database every time, so ratings will automatically rescale.

At least it seems foolish to play with an old engine if a newer version has
been released. Remember you will still be playing with the old engine indirectly
when you play against other engines that have played against it.
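
As an illustration of recomputing ratings over the whole database, here is a small sketch. It is not elostat's actual algorithm. It is a simple damped performance-rating iteration that assumes the logistic Elo model, and the engine names, the `fit_ratings` function and the 2500 anchor are all hypothetical.

```python
import math
from collections import defaultdict

def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def fit_ratings(games, iterations=200, anchor=2500.0):
    """Fit ratings to a full game list by iterating performance ratings.

    games: list of (player_a, player_b, result), result is player_a's
    score (1, 0.5 or 0). The list mean is pinned to `anchor`, an
    arbitrary choice, since only rating differences are determined.
    """
    points = defaultdict(float)
    opponents = defaultdict(list)
    for a, b, result in games:
        points[a] += result
        points[b] += 1.0 - result
        opponents[a].append(b)
        opponents[b].append(a)

    ratings = {p: anchor for p in opponents}
    for _ in range(iterations):
        updated = {}
        for p, opps in opponents.items():
            # Clamp the score so 100% or 0% results stay finite.
            score = min(max(points[p] / len(opps), 0.01), 0.99)
            avg_opp = sum(ratings[o] for o in opps) / len(opps)
            performance = avg_opp + elo_diff(score)
            # Damping: a plain update can oscillate between two states.
            updated[p] = 0.5 * ratings[p] + 0.5 * performance
        # Recenter so the average rating stays at the anchor.
        shift = anchor - sum(updated.values()) / len(updated)
        ratings = {p: r + shift for p, r in updated.items()}
    return ratings

if __name__ == "__main__":
    # Hypothetical database: "New" never meets "Old" directly, but both
    # have played "Mid", so they are still placed on a common scale.
    games = ([("Mid", "Old", 1)] * 3 + [("Mid", "Old", 0)]
             + [("New", "Mid", 1)] * 3 + [("New", "Mid", 0)])
    for name, rating in sorted(fit_ratings(games).items(), key=lambda x: -x[1]):
        print(f"{name:4s} {rating:7.1f}")
```

Because every rating is refit from all games on each run, adding new games shifts the whole list consistently, and an engine that is no longer played still influences the scale through the opponents it has met.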

>There is some inconsistent naming of the program names in CEGT and AEGT,

Such as?

>only some professional programs have a very large number of games.

You must be looking at a different list, I see
Amy, AnMon, Delfi, Zappa and many others have 1000+ games.

>There are also George and Leo's lists, all of which impart useful information.
>
>But I think it is not accurate to say that CEGT or AEGT can replace the SSDF.

SSDF is just another list. They test the same handful of Chessbase engines again
and again. Great if you're a big chessbase fan, boring if you're not.

I want more engines tested and faster. When the SSDF is out 6 months later it's
old news anyway.

-S



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.