Computer Chess Club Archives



Subject: Re: ATTN Vasik: Rybka SP?

Author: Uri Blass

Date: 21:19:10 12/15/05


On December 15, 2005 at 23:41:50, Ed Murak wrote:

>On December 15, 2005 at 19:53:30, Bahram Namjou wrote:
>
>>I noticed that in slow games Rybka preview "slightly positional"(SP) seems to be
>>even stronger? ...It needs more games of course, but I include a set of 15 games
>>here...Any comment appreciate it...
>>best regards
>>bn
>
>>(book: shredder8, alternate color), P4,3.2 GhZ,128 Hash
>>
>>HOME-, 10'/40+10'/40+10'40  0
>>
>>1   Rybka 1.0 Preview 32-bit (SP)    1½1½1½½0½1½½½½0    8.5/15
>>2   Rybka 1.0 Beta 32-bit (def)      0½0½0½½1½0½½½½1    6.5/15
>
>Hi Bahram,
>
>Vas hasn't yet replied, but I am sure he will see your results.
>
>Can I (~ beginner) take up your offer?
>
>Testing one version of an engine against another of the same engine can produce
>results which distort actual strength differences. This subject has been debated
>in many places for many years; I give you the majority (not consensus) opinion.
>It is usually better to test both against one or more unrelated opponents.

I am in the minority here.

>
>Testing on one computer (single core here) is also viewed as a little
>sub-optimal; I imagine you had Ponder = Permanent Brain off (also debatable, but
>generally thought best off even if CPU cycles can be shared evenly).  So a
>setting which was better at guessing the opponent's move does not get the same
>benefit as it would have if the two engines were running in separate computers.

I think there is no problem with ponder off.
Ponder off may not be the best choice when you want to compare different engines
in real games, because one engine may have bad time management with ponder off.
But when you compare two personalities of the same engine, it is logical to
assume that both use the same time management.

Usually there is no big difference between personalities in guessing the
opponent's moves.

Uri

>
>Also, for 15 games, the margin of error is very large (as you noted).  We can
>assess this making the usual normal-distribution assumptions, compute confidence
>levels etc.  Another way to look at this: if just one game had swung the
>other way (possible <.001 second extra thought somewhere?), the end result would
>have been 7.5-7.5, even.  In a 15 game match even 9-6 or 10-5 is not at all
>conclusive proof of superiority.  If running under ChessBase, the GUI tells you
>the relative ELO bounds at different confidence %s.
>
>So, no real conclusion possible at all, just a little evidence.

Albert Silver also played games with both versions, and his result also gives
a little evidence that slightly positional is better.

I also played a few games and slightly positional won 3.5-2.5, so we have more
evidence.
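
To put a rough number on the margin of error discussed in the quoted text, here is a minimal sketch in Python. It assumes the usual normal approximation (which is crude for only 15 games) and the standard logistic Elo formula; the function names are my own, and the win/draw/loss counts are read off the 15-game result string quoted above (4 wins, 9 draws, 2 losses for SP).

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (logistic Elo model)."""
    # Clamp to avoid log of zero or a negative number at the extremes.
    p = min(max(p, 1e-6), 1 - 1e-6)
    return -400.0 * math.log10(1.0 / p - 1.0)

def match_interval(wins, draws, losses, z=1.96):
    """Mean score and an approximate confidence interval for it.

    Uses the per-game sample variance (draws count as 0.5) and a normal
    approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)
    return mean, mean - z * se, mean + z * se

# SP's result from the quoted match: 8.5/15
mean, lo, hi = match_interval(wins=4, draws=9, losses=2)
print("score %.3f, interval %.3f to %.3f" % (mean, lo, hi))
print("Elo %.0f, interval %.0f to %.0f"
      % (elo_from_score(mean), elo_from_score(lo), elo_from_score(hi)))
```

With these counts the interval for the score includes 50%, which is another way of saying that an 8.5-6.5 result is only a little evidence and not a conclusion.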

Uri




Last modified: Thu, 15 Apr 21 08:11:13 -0700
