Computer Chess Club Archives



Subject: Re: Why SSDF list is the best

Author: Sandro Necchi

Date: 04:02:34 07/17/05



On July 17, 2005 at 06:25:13, Uri Blass wrote:

>On July 17, 2005 at 06:15:23, Uri Blass wrote:
>
>>On July 17, 2005 at 05:22:47, Sandro Necchi wrote:
>>
>>>I have been laughing a lot (maybe crying over the ignorance would have been
>>>more appropriate?) reading many wrong statements about testing and Elo lists.
>>>
>>>So, for those who are new and do not know, the SSDF list is the best for the
>>>following reasons:
>>>
>>>1. They use 2 computers and the complete program, with its own book and ETG,
>>>its own GUI and the best settings as suggested by the programmer.
>>>2. They use long time controls (40/2h 20/1h; international level) only.
>>>3. They use the same hardware for all programs.
>>>4. They use a very wide range of programs and not only the new ones to get more
>>>reliable results.
>>>5. Ponder on and learning are activated.
>>>
>>>Now, even if some people do not agree, using each program's own book is best
>>>because that book has been developed specifically for that engine, and in some
>>>cases the engine has been developed specifically around that book too.
>>
>>I think that in order to prove it you also need to test programs with a
>>different book.
>>
>>I do not know of this type of testing.
>>
>>I do not know if Shredder with your book performs better than Shredder with
>>Fritz8.ctg.
>>I do not know if Fritz with Fritz8.ctg performs better than Fritz with your
>>book.
>>
>>
>>>This means that the use
>>>of a different book and the same for all programs would damage or favor some
>>>programs over others.
>>
>>
>>I think that a lot of people are interested in a different question.
>>Which program is stronger in an SSDF match is one question, and which program
>>to use to analyze positions is a different question.
>>
>>Using books and learning in tests does not help to decide which program is
>>better to use in overnight analysis.
>>
>>I prefer the CEGT rating list to the SSDF rating list when deciding which
>>programs to use in correspondence games.
>>
>>The only advantage of the SSDF is the longer time control, but the CEGT has
>>other advantages, like having more programs (Fruit was not tested in the SSDF
>>list).
>>
>>Uri

Uri,

>
>I can add that book learning also causes problems for statistical analysis of
>the results because games are not independent events.

Since the goal of the SSDF list is to tell the user whether the new version is
better, you cannot remove the learning, as that option will be used by the user
too. This is why it needs to be included; otherwise you will give data different
from what the user will find.

>
>I hope that games in the CEGT are independent events (I am not sure, because it
>is possible that some programs still learn there from position learning).
>
>When games are not independent events because of learning, it is not clear
>what the standard error of the result is, and not having learning from previous
>games can produce a smaller error with the same number of games.

This is true, but I know the SSDF people have enough experience (and suggestions
from the programmers too) to do things in the correct way.
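
For reference, this is the usual calculation when games ARE independent, which is the assumption Uri says learning breaks. It is only a sketch: the function names are made up for illustration, and nothing here is taken from how the SSDF or the CEGT actually compute their error bars.

```python
import math

def score_standard_error(wins, draws, losses):
    """Mean score per game and its standard error, assuming independent games."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Variance of a single game result, where a result is 1, 0.5 or 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return mean, math.sqrt(var / n)

def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Hypothetical 40-game match: 10 wins, 20 draws, 10 losses.
mean, se = score_standard_error(10, 20, 10)
print(mean, se, elo_diff(mean))
```

The division by n in the standard error is exactly the step that needs independence. If later games repeat earlier ones, the effective number of games is smaller than n and the real error is larger than this formula says.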

>
>Imagine that there is program A and program B, where B is deterministic without
>learning.
>If A is lucky enough to win the first 2 games, it may win an SSDF match 40-0 by
>repeating the first 2 games.
>If A is unlucky, then the result may be 21-19 for B, because B may win the first
>2 games and draw games 3 and 4, at which point A learns that B is probably
>stronger and so repeats the draw lines again and again in the rest of the games.

I know, but this will happen against the human owner too, so it must be included
in the test as otherwise again a different picture of the program would be
given.
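
Uri's example can be played with in a toy simulation. Everything in it is an assumption made for illustration: the win and draw probabilities are invented, the "learning" is just "once A has won with a colour, it repeats that game", and it does not model any real engine or the SSDF procedure.

```python
import random
import statistics

def play_match(rng, games=40, p_win=0.1, p_draw=0.5, learning=True):
    """A's score in one match against a deterministic, non-learning B."""
    known_win = [False, False]  # per colour: has A already found a won game?
    score = 0.0
    for g in range(games):
        colour = g % 2  # colours alternate from game to game
        if learning and known_win[colour]:
            score += 1.0  # A repeats the won game and B loses it again
            continue
        r = rng.random()
        if r < p_win:
            score += 1.0
            known_win[colour] = True
        elif r < p_win + p_draw:
            score += 0.5
    return score

def score_spread(learning, trials=2000, seed=1):
    """Mean and standard deviation of A's match score over many matches."""
    rng = random.Random(seed)
    scores = [play_match(rng, learning=learning) for _ in range(trials)]
    return statistics.mean(scores), statistics.pstdev(scores)

print("no learning:", score_spread(False))
print("learning:   ", score_spread(True))
```

In this toy model the spread of match scores is clearly wider with learning than without, because one early win decides many later games. That is the noise Uri is talking about, and it is also the behaviour a user playing the program would see.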

>
>Of course this is an extreme condition that does not happen, but the point is
>that I have no idea how to analyze the possible error in the SSDF list because
>of the noise of learning.

Unfortunately learning is necessary, as without it the program would simply be
too stupid and would lose the same games again and again... Shall we go back 10
or more years?

>
>Uri

Sandro




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.