Computer Chess Club Archives


Subject: Re: Why SSDF list is the best

Author: Sandro Necchi

Date: 12:15:04 07/17/05


On July 17, 2005 at 14:19:12, Kurt Utzinger wrote:

>On July 17, 2005 at 05:22:47, Sandro Necchi wrote:
>
>>I have been laughing a lot (maybe crying over the ignorance would have been more
>>appropriate?) reading many wrong statements about testing and Elo lists.
>>
>>So, for those who are new and do not know, the SSDF list is the best for the
>>following reasons:
>>
>>1. They use 2 computers and the complete program with its own book and ETG, its
>>own GUI and the best settings as suggested by the programmer.
>>2. They use long time controls (40/2h 20/1h; international level) only.
>>3. They use the same hardware for all programs.
>>4. They use a very wide range of programs and not only the new ones to get more
>>reliable results.
>>5. Ponder on and learning are activated.
>
>      Nothing against the SSDF list and the people who are
>      testing. They are doing a great job. Nevertheless, I
>      would never go so far saying it's the only reliable
>      and therefore the best list.
>      Kurt

I said that they are the best, not that other systems are not reliable at all.

To me they are the most reliable.

I have been saying this since 1987...

>
>>Now, even if some people do not agree, the use of own book is the best because
>>that book has been developed specifically for that engine and in some cases the
>>engine has been developed specifically on that book too. This means that the use
>>of a different book and the same for all programs would damage or favor some
>>programs over others. Even "neutral" books would do the same as they may include
>>variations which are not "compatible" with some sophisticated programs and be OK
>>for others.
>>I know some people do not agree on this, but this is their problem...
>
>      Here I can't agree. I am no book expert but think that the
>      original books have only by way of exception been developed
>      specifically for that engine. Most of these books contain too
>      many lines with either holes in them or just including lines
>      simply based on performance of (top) GM's without considering
>      if the engine does understand the position.

Only some of them do and some do it just a little.

>      A strong program
>      will not be harmed by a "neutral book" and Shredder is a very
>      good example winning all matches using 5moves.ctg, remis.ctg
>      or pre-defined openings like Nunn2, Noomen select and so on.
>      Kurt


I know that Shredder is doing very well with other systems and books too, but my
statement is not based on Shredder's results, but on the method for getting
reliable data. Of course my statement is general and concerns the value of the
data, not Shredder's results.

This (interest in computer chess and chess programs) is a hobby for me, so I am
doing things I like and not to get money or other things...

>
>>The use of a "std" gui again would favor some and damage others as well, so it
>>is not advisable...
>
>      Is there any data about this?
>      Kurt

If I make a statement it is because I have made tests.
Do the same and you will agree with me.
Of course I am referring to the possibility of using own books and learning.

>
>>
>>The use of long time controls is the best to really check the max potentiality
>>of a program. It is true that the hardware used by SSDF is not updated, but 2 or
>>3 times faster hardware would not change much even if some programs may benefit
>>a little more than others (a small Elo difference).
>
>      I agree although we should accept that testing with
>      different time controls is the only way to get an
>      overall impression about strength and weakness of
>      a chess program.
>      Kurt

Yes, but then it would be something like "at blitz"..."at 30-minute games" etc...

>
>>
>>Some people claim that some programs are better against humans than against
>>computers. These are pure lies, as if you play better you play better against
>>anybody. These are more "commercial" statements than true ones... Of course
>>there is no relationship between the Elo figures on the SSDF list and those
>>against humans, but a stronger program here would do better against humans too.
>>The problem is that in order to achieve reliable results there is a need of
>>very many games. A few games may be confusing.
>
>      Whilst it may indeed be very difficult to say that program X
>      is doing better vs humans than program Y - due to insufficient
>      data, say too few games - I nevertheless think that there are
>      differences. I have tried in many, many games to get draws vs
>      the best engines with my well known and boring style and I
>      had always (and still have) the feeling that it is much easier
>      to achieve this vs Fritz and ChessTiger than against Hiarcs or
>      Gandalf. But as already said: too little data to "prove" anything.
>      Kurt

Yes, this should be based on many games and different players too...
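
To give a rough idea of why a few games can be confusing, here is a minimal
sketch of the error margin on an Elo difference measured from a match. It is
not the SSDF's method; it assumes the standard logistic Elo formula and a
normal approximation for the 95% interval, and the function names are made up
for illustration.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (logistic Elo model)."""
    # Clamp to avoid log(0) or division by zero on 0% / 100% scores.
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high): point estimate and approximate 95% interval."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


if __name__ == "__main__":
    for games_each in (10, 100, 1000):
        elo, low, high = elo_interval(games_each, games_each, games_each)
        print(f"{3 * games_each:5d} games: {elo:+.0f} Elo ({low:+.0f} .. {high:+.0f})")
```

Under these assumptions, an even match of 30 games (10 wins, 10 draws, 10
losses) leaves a margin of roughly 100 Elo either way, while 3000 games with
the same proportions narrow it to about 10 Elo.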

>>
>>Thanks to the use of 2 computers one can also test against old programs. This
>>may seem useless, but it is not.
>>
>>Since the goal of the SSDF list is to tell how strong a new program is, using
>>the best settings and learning is a must too, because the user can use the same
>>and would like to know how strong that program is with the best settings etc...
>>If some programs do not have learning features and/or good ones it is their
>>problem so they have to be penalized on that. The use of these options would do
>>this.
>>
>>So, anybody can test in a different way as they wish, but to claim that their
>>system is better than or replaces the SSDF system is pure nonsense!
>
>      There are not many who claim that their system would be
>      better than SSDF. On the other hand we should also accept
>      other testing methods.

I am accepting other test methods, but I am claiming SSDF is the best.

>      Personally I do not like the use
>      of (original) big books and learning. I am more interested
>      in the naked engine strength under neutral conditions and
>      as analysing tool.

I know this.

>      And for me one thing is clear: if you always
>      use the same neutral books/positions for testing you will
>      get comparable results much more easily and quickly between old/new
>      versions of a program.

I do not agree on this. My tests and data do not let me agree on this.

>      And finally: it's anyway more than
>      boring to see an engine start thinking at move 29.
>      Kurt

Everything has advantages and disadvantages:

It may be boring, but it is boring to me to wait for the computer to spend 5 or
more minutes to play an obvious move.
Playing the first moves quickly leaves more time for the middlegame or the
endgame and therefore gets better play from the program.

More importantly, I do not get upset at seeing wrong moves and stupid position
analysis!
>
>>
>>Sandro

Ciao
Sandro




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.