Author: Ed Murak
Date: 20:41:50 12/15/05
On December 15, 2005 at 19:53:30, Bahram Namjou wrote:

>I noticed that in slow games Rybka preview "slightly positional"(SP) seems to be
>even stronger? ...It needs more games of course, but I include a set of 15 games
>here...Any comment appreciate it...
>best regards
>bn
>(book: shredder8, alternate color), P4,3.2 GhZ,128 Hash
>
>HOME-, 10'/40+10'/40+10'40 0
>
>1 Rybka 1.0 Preview 32-bit (SP) 1½1½1½½0½1½½½½0 8.5/15
>2 Rybka 1.0 Beta 32-bit (def) 0½0½0½½1½0½½½½1 6.5/15

Hi Bahram,

Vas hasn't yet replied, but I am sure he will see your results. Can I (~ beginner) take up your offer?

Testing one version of an engine against another version of the same engine can produce results which distort the actual strength difference. This subject has been debated in many places for many years; I give you the majority (not consensus) opinion. It is usually better to test both versions against one or more unrelated opponents.

Testing on one computer (single core here) is also viewed as a little sub-optimal. I imagine you had Ponder = Permanent Brain off (also debatable, but generally thought best off, even if CPU cycles can be shared evenly). So a setting which was better at guessing the opponent's move does not get the same benefit as it would have if the two engines were running on separate computers.

Also, for 15 games the margin of error is very large (as you noted). We can assess this by making the usual normal-distribution assumptions, computing confidence levels etc. Another way to look at this: if just one game had swung the other way (perhaps less than 0.001 second of extra thought somewhere?), the end result would have been 7.5-7.5, even. In a 15 game match even 9-6 or 10-5 is not at all conclusive proof of superiority. If you are running under ChessBase, the GUI tells you the relative Elo bounds at different confidence levels.

So, no real conclusion is possible at all, just a little evidence. Please play more games, and against other strong engines.
After the endgame knowledge and Nalimov support are added, you may need to redo the tests. :) IMO more can be learned if the games are also analyzed individually, perhaps annoFritzed first (of course do not use Rybka itself) and then by hand, not only at the places Fritz/Shredder thinks are interesting.
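
To put a rough number on the margin-of-error point above, here is a minimal sketch in Python that applies the usual normal approximation to the 15 results quoted for the SP setting. It is only an illustration under stated assumptions: games are treated as independent, the standard logistic Elo formula is used, and the helper names (`parse_results`, `score_interval`, `elo_from_score`) are made up for this example. It is not how ChessBase or any other GUI computes its bounds.

```python
import math

def parse_results(results):
    """Turn a result string such as '1½0' into a list of points per game."""
    values = {"1": 1.0, "½": 0.5, "0": 0.0}
    return [values[ch] for ch in results]

def score_interval(points, z=1.96):
    """Mean score and an approximate confidence interval for it.

    Assumption: games are independent and the mean is roughly normal,
    which is a crude approximation for only 15 games.
    z = 1.96 corresponds to about 95% confidence.
    """
    n = len(points)
    mean = sum(points) / n
    variance = sum((x - mean) ** 2 for x in points) / (n - 1)  # sample variance
    half_width = z * math.sqrt(variance / n)
    return mean, mean - half_width, mean + half_width

def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1),
    using the standard logistic Elo model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

# Results of the SP setting, copied from the quoted match table.
sp_points = parse_results("1½1½1½½0½1½½½½0")
mean, low, high = score_interval(sp_points)

print(f"score: {sum(sp_points)}/{len(sp_points)} ({mean:.3f})")
print(f"approx. 95% score interval: {low:.3f} to {high:.3f}")
print(f"Elo estimate: {elo_from_score(mean):+.0f}")
print(f"approx. 95% Elo interval: {elo_from_score(low):+.0f} to {elo_from_score(high):+.0f}")
```

Under these assumptions the interval for the score extends to both sides of 50%, so the Elo interval includes zero: the 8.5-6.5 result is compatible with either setting being the stronger one.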
Last modified: Thu, 15 Apr 21 08:11:13 -0700