Computer Chess Club Archives



Subject: Re: ATTN Vasik: Rybka SP?

Author: Vasik Rajlich

Date: 00:16:19 12/16/05



On December 16, 2005 at 00:36:50, Bahram Namjou wrote:

>Thanks for the input...There is more evidence, especially from slow games that I
>played against other engines...I want him to know, for a "possible"(?) benefit in
>his final 1.2 version, but as you mentioned more evidence is still
>necessary...thanks...bn
>

We haven't decided which setting will be the default in Rybka 1.2. The pruning
scheme in question might even change slightly.

Please note two things:

1) The difference between the weakest and strongest of the four settings
(whichever those are) cannot be more than 20 rating points or so. (According to
my understanding of search.) Resolving a difference that small takes evidence
somewhere on the order of a thousand games.

2) This is one setting where indeed (as Ed points out) the behavior in self-play
may be different from the behavior against other engines. More data is needed.
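
A rough illustration of why a 20-point difference needs that much data. This is only a sketch under stated assumptions: the standard logistic Elo model, a normal approximation, a 95% threshold (z = 1.96), and a 40% draw rate that is a guess, not a measured figure. The function names are made up for this example.

```python
import math

def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, draw_rate=0.4, z=1.96):
    """Approximate number of games before an edge of elo_diff stands
    z standard errors above a 50% score.

    draw_rate and z are assumptions chosen for illustration.
    """
    p = expected_score(elo_diff)
    # Per-game variance of the score (win=1, draw=0.5, loss=0):
    # with P(draw)=d and mean p, the variance is p*(1-p) - d/4.
    variance = p * (1.0 - p) - draw_rate / 4.0
    edge = p - 0.5
    return math.ceil((z * math.sqrt(variance) / edge) ** 2)

if __name__ == "__main__":
    for diff in (20, 50, 100):
        print(diff, "Elo:", games_needed(diff), "games")
```

With these assumptions a 20-point edge comes out in the high hundreds of games, and above a thousand if no games are drawn, so a 15-game match cannot separate the settings.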

The good news is that CEGT is testing this so there will be very good data soon.

Vas

>
>On December 16, 2005 at 00:19:10, Uri Blass wrote:
>
>>On December 15, 2005 at 23:41:50, Ed Murak wrote:
>>
>>>On December 15, 2005 at 19:53:30, Bahram Namjou wrote:
>>>
>>>>I noticed that in slow games Rybka preview "slightly positional" (SP) seems to be
>>>>even stronger? ...It needs more games of course, but I include a set of 15 games
>>>>here...Any comments appreciated...
>>>>best regards
>>>>bn
>>>
>>>>(book: shredder8, alternate color), P4, 3.2 GHz, 128 Hash
>>>>
>>>>HOME-, 10'/40+10'/40+10'40  0
>>>>
>>>>1   Rybka 1.0 Preview 32-bit (SP)    1½1½1½½0½1½½½½0    8.5/15
>>>>2   Rybka 1.0 Beta 32-bit (def)      0½0½0½½1½0½½½½1    6.5/15
>>>
>>>Hi Bahram,
>>>
>>>Vas hasn't yet replied, but I am sure he will see your results.
>>>
>>>Can I (~ beginner) take up your offer?
>>>
>>>Testing one version of an engine against another of the same engine can produce
>>>results which distort actual strength differences. This subject has been debated
>>>in many places for many years; I give you the majority (not consensus) opinion.
>>>It is usually better to test both against one or more unrelated opponents.
>>
>>I am in the minority here.
>>
>>>
>>>Testing on one computer (single core here) is also viewed as a little
>>>sub-optimal; I imagine you had Ponder = Permanent Brain off (also debatable, but
>>>generally thought best off even if CPU cycles can be shared evenly).  So a
>>>setting which was better at guessing the opponent's move does not get the same
>>>benefit as it would have if the two engines were running on separate computers.
>>
>>I think that there is no problem with ponder off.
>>Ponder off may not be the best choice when you want to compare different engines in
>>real games, because one engine may have bad time management with ponder off, but
>>when you compare 2 personalities it is logical to assume that both use the
>>same time management.
>>
>>Usually there is no big difference between personalities in guessing the
>>opponent's moves.
>>
>>Uri
>>
>>>
>>>Also, for 15 games, the margin of error is very large (as you noted).  We can
>>>assess this by making the usual normal-distribution assumptions, computing confidence
>>>levels etc.  Another way to look at this: if just one game had swung the
>>>other way (possible <.001 second extra thought somewhere?), the end result would
>>>have been 7.5-7.5, even.  In a 15-game match even 9-6 or 10-5 is not at all
>>>conclusive proof of superiority.  If running under ChessBase, the GUI tells you
>>>the relative Elo bounds at different confidence %s.
>>>
>>>So, no real conclusion possible at all, just a little evidence.
>>
>>Albert Silver also played games with both versions and his results also give
>>a little evidence that slightly positional is better.
>>
>>I also played a few games and slightly positional won 3.5-2.5 so we have more
>>evidence.
>>
>>Uri
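
For the 15-game crosstable quoted above, the "usual normal-distribution assumptions" can be sketched as follows. This is only an illustration: it assumes the logistic Elo model, treats the games as independent, and uses a 95% interval (z = 1.96); the function names are hypothetical, and with so few games the normal approximation is itself crude.

```python
import math

# SP's game-by-game results, copied from the crosstable (1 = win, ½ = draw, 0 = loss).
RESULTS_SP = "1½1½1½½0½1½½½½0"

def to_scores(results):
    """Convert a crosstable result string into a list of per-game scores."""
    value = {"1": 1.0, "½": 0.5, "0": 0.0}
    return [value[ch] for ch in results]

def elo_from_score(p):
    """Elo difference implied by a score fraction p (0 < p < 1)."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_interval(results, z=1.96):
    """Return (estimate, lower, upper) Elo difference for one side.

    Uses the sample standard error of the per-game scores, so draws
    narrow the interval compared with a pure win/loss model.
    """
    scores = to_scores(results)
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    stderr = math.sqrt(variance / n)
    return (elo_from_score(mean),
            elo_from_score(mean - z * stderr),
            elo_from_score(mean + z * stderr))

if __name__ == "__main__":
    est, low, high = elo_interval(RESULTS_SP)
    print(f"estimate {est:+.0f} Elo, interval {low:+.0f} to {high:+.0f}")
```

The interval for 8.5/15 spans zero by a wide margin on both sides, which matches the point that a single game swinging the other way would have made the match even.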




Last modified: Thu, 15 Apr 21 08:11:13 -0700
