Computer Chess Club Archives


Subject: Re: ATTN Vasik: Rybka SP?

Author: Ed Murak

Date: 20:41:50 12/15/05


On December 15, 2005 at 19:53:30, Bahram Namjou wrote:

>I noticed that in slow games Rybka preview "slightly positional"(SP) seems to be
>even stronger? ...It needs more games of course, but I include a set of 15 games
>here...Any comment appreciate it...
>best regards
>bn

>(book: shredder8, alternate color), P4,3.2 GhZ,128 Hash
>
>HOME-, 10'/40+10'/40+10'40  0
>
>1   Rybka 1.0 Preview 32-bit (SP)    1½1½1½½0½1½½½½0    8.5/15
>2   Rybka 1.0 Beta 32-bit (def)      0½0½0½½1½0½½½½1    6.5/15

Hi Bahram,

Vas hasn't yet replied, but I am sure he will see your results.

Can I (~ beginner) take up your offer?

Testing one version of an engine against another version of the same engine can
produce results that distort the actual strength difference. This subject has
been debated in many places for many years; I give you the majority (not
consensus) opinion: it is usually better to test both versions against one or
more unrelated opponents.

Testing on one computer (single core here) is also viewed as a little
sub-optimal. I imagine you had Ponder (Permanent Brain) off; this is also
debatable, but it is generally thought best off even if CPU cycles can be shared
evenly. With pondering off, a setting that is better at guessing the opponent's
move does not get the benefit it would have had if the two engines were running
on separate computers.

Also, for 15 games, the margin of error is very large (as you noted).  We can
assess this by making the usual normal-distribution assumptions and computing
confidence levels etc.  Another way to look at this: if just one game had swung
the other way (possibly <.001 second of extra thought somewhere?), the end
result would have been 7.5-7.5, even.  In a 15-game match even 9-6 or 10-5 is
not at all conclusive proof of superiority.  If you are running under ChessBase,
the GUI tells you the relative Elo bounds at different confidence levels.
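
For anyone who wants to check the margin of error by hand, here is a rough
sketch of the normal-approximation calculation for the 15 games quoted above
(4 wins, 9 draws, 2 losses for SP). The function names, the 1.96 multiplier for
a ~95% interval and the standard logistic Elo formula are my own choices for
illustration; this is not necessarily how ChessBase computes its bounds.

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1),
    using the standard logistic Elo model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def match_summary(wins, draws, losses, z=1.96):
    """Return (mean score, Elo estimate, lower Elo bound, upper Elo bound).

    Assumes games are independent and the mean score is roughly normal,
    which is a crude assumption for a match this short.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    lo, hi = mean - z * se, mean + z * se
    return mean, elo_from_score(mean), elo_from_score(lo), elo_from_score(hi)

# SP's results from the quoted table: 1½1½1½½0½1½½½½0
mean, elo, elo_lo, elo_hi = match_summary(wins=4, draws=9, losses=2)
print(f"score {mean:.3f}, Elo {elo:+.0f}, ~95% interval {elo_lo:+.0f} to {elo_hi:+.0f}")
```

Under these assumptions the interval comfortably straddles zero, so the 8.5-6.5
result is compatible with either setting being the stronger one.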

So no real conclusion is possible at all, just a little evidence. Please play
more games, and against other strong engines.  After the endgame knowledge and
Nalimov support are added, you may need to redo the tests. :)

IMO more can be learned if the games are also analyzed individually, maybe
annoFritzed first (of course do not use RYBKA itself) but then by hand as well,
not only at the places Fritz/Shredder thinks are interesting.



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.