Computer Chess Club Archives


Subject: Re: New SSDF list

Author: Robert Hyatt

Date: 07:02:01 11/28/99


On November 28, 1999 at 05:27:09, Ed Schröder wrote:

>>Posted by Robert Hyatt on November 27, 1999 at 22:28:32:
>>
>>In Reply to: Re: New SSDF list posted by Fernando Villegas on November 27,
>>1999 at 19:46:52:
>>
>>On November 27, 1999 at 19:46:52, Fernando Villegas wrote:
>>
>>>On November 27, 1999 at 17:33:13, Robert Hyatt wrote:
>>>
>>>>On November 27, 1999 at 12:18:07, Fernando Villegas wrote:
>>>>
>>>>>I do not understand your point, Bob. This is not a match between two
>>>>>computers, but many. How could a program do well just tuning against Tiger?
>>>>>Maybe that could mean to un-tune against any other of the competition. Maybe
>>>>>some opening preparations, but...
>>>>>Fernando
>>>>
>>>>
>>>>This is easy.  A year ago, due to some unusual new eval features I added, I
>>>>ended up with a version that had very little trouble with Fritz 5 at any time
>>>>control.  It won so many games that Lonnie accused me of using a Cray to play
>>>>against him.  If I sent _that_ version to the SSDF for testing, it would have
>>>>done very well against Fritz, because Fritz would be totally unprepared.  But
>>>>once they saw what was happening, some adjusting on their end (king safety and
>>>>passed pawns in particular) and this advantage would have evaporated.
>>>>
>>>>Almost always the _last released_ program goes to the top of the SSDF.  In this
>>>>case, it is an _unreleased_ version, which means _nobody_ had a chance to look
>>>>at the book, at the depth, at the evals, and find out what it is doing....
>>>>
>>>>Sort of an "element of surprise"...
>>>
>>>
>>>Please let me clear this issue a little more.
>>>a) SSDF testing is not done by the programmers, so they couldn't tune their
>>>programs according to every new opponent.
>>
>>I didn't say (or imply) that they did.  I am simply saying that Tiger has
>>had ample opportunity to play against other programs in private...  and that
>>once it becomes public, other programs will have ample opportunity to play
>>against it.  And as usually happens, things will then change in surprising
>>ways...
>>
>>It has _nothing_ to do with the SSDF...  just that the program has not been
>>'seen' by anyone else.  You will be surprised what you can learn about a
>>program after watching its analysis for a while...  So far, that hasn't
>>happened.  But it will...
>>
>>
>>
>>
>>
>>>b) That being so, if, let us say, F6 is delivered to the Swedish people after
>>>being tuned against Tiger 12, my question is, what would happen to F6 against
>>>other programs? Why does tuning against program X necessarily mean a greater
>>>likelihood of getting more points in a pool made up of many opponents other
>>>than X?
>>
>>It happens all the time.  At one point Ed had 8 machines running auto232 matches
>>against his 'competition'.  If one program beats you consistently, you can find
>>out what it is seeing that you are not, and fix that without breaking yourself
>>vs other opponents...
>
>
>I hardly test at the 40/120 level these days. The last time I did was exactly
>one year ago (November '98) to see how well Tiger (version 11.5 at that
>time) would do against the then current top programs, for an impression
>only. And that was on relatively slow PII-266 machines, not comparable
>with the hardware the SSDF is using.
>
>The scenario you describe above does not exist; see also Christophe's
>comments about this: no tuning against other chess programs.
>
>Ed
>
>


Then that begs the question:  how is he 'testing'?  He isn't playing in any
public human events for any significant number of games.  If you don't enter a
lot of human tournaments, and you don't play against computers, then what do
you test against?  Certainly _not_ against yourself...  That won't fly...

However, in my case, I really don't give a swat...  I have more than enough to
work on.. :)





>
>>>c) That is the core of my question, and the only way I found to understand your
>>>point is to suppose that Tiger 12 brings not only some specific features that
>>>could be annulled one way or another, but general, universal improvements, so
>>>if you tune against them, you improve your own program "in general".
>>
>>That is possible as well.  Several years ago I added "outside passed pawn"
>>code to Crafty.  At that time, hardly any commercial programs did anything
>>with this and as a result, Crafty won many an endgame due to this.  It wasn't
>>long before it worked less frequently.  Ditto for the trapped bishop on a7.
>>A few 'fixed it'.  A few fixed it before I started evaluating it myself.  But
>>not everybody...  a couple of programs _still_ fall for it.
>>
>>Another good example is king safety.  I don't know of any program (so far) that
>>is very good at handling king safety.  Once someone does a program that is
>>really a strong attacker, everyone will either fix it finally, or get rolled
>>into small balls over and over.  (Older versions of Genius suffered badly from
>>this, although I don't know about the newest one).
>>
>>
>>
>>
>>
>>
>>>d) In this way, tuning against Tiger 12 would mean tuning against any other
>>>program.
>>
>>And quite possibly tuning to do _worse_ against human players.  Which is not
>>something I am ready to do yet, myself.  But for those driven by SSDF rankings,
>>anything goes..
>>
>>
>>
>>>e) If that is not the case, I still believe that in a tournament where you face
>>>many different opponents, preparing against just one of them could mean a
>>>worse general result, unless you can do the same thing before every game,
>>>which we already know you cannot do.
>>>Sorry if this seems a little bit confused. It is :-)
>>>Fernando


