Author: James T. Walker
Date: 04:49:32 12/08/05
On December 07, 2005 at 20:40:46, Uri Blass wrote:

>On December 07, 2005 at 20:29:06, James T. Walker wrote:
>
>>On December 07, 2005 at 19:09:43, Uri Blass wrote:
>>
>>>On December 07, 2005 at 18:51:01, James T. Walker wrote:
>>>
>>>>This information only makes me suspicious of CEGT testing. I'm only running
>>>>blitz tests but this engine is truly strange. Against Shredder 9UCI it scored
>>>>about 145 Elo higher. Against Fritz 8 it scored 50 Elo lower. I only have 266
>>>>games so far but Rybka's rating now is about 33 Elo below Fruit 2.2.1 and about
>>>>3 Elo below Toga II 1.0. I'm playing Rybka without an opening book since it
>>>>comes without one. I don't know which book/books are being used by different
>>>>testers here but it does seem to indicate that almost any book is better than
>>>>nothing.
>>>>Jim
>>>I am not surprised that you get different results when you test different
>>>things.
>>>
>>>differences:
>>>
>>>1)CEGT do not test with original opening book
>>>2)CEGT test with ponder off and you test with ponder on
>>>3)CEGT test x minutes/40 moves when you test blitz games
>>>
>>>Uri
>>
>>Hello Uri,
>>You are right of course. The above are the reasons I don't like CEGT ratings.
>>At the same time I don't like testing Rybka without an opening book. When I get
>>results like +145 Elo over Shredder UCI and -50 Elo under Fritz 8 I know
>>something is not right. Maybe Rybka was "tuned" against Shredder 9 UCI? I'm
>>still running games without a book until one becomes available though it seems
>>like a big disadvantage for Rybka.
>>Jim
>
>I see nothing wrong with CEGT testing.
>
>They do not test playing strength of programs but they test strength of the
>engine that is part of the program.
>
>I do not see something wrong with it.
>
>Uri

Well, I see something wrong. I guess we will just have to disagree. When you test with a different book than the one that came with the engine, you are not testing the entire program. When you test with ponder off, you are not testing the entire engine.
If that is your goal, then I guess there is nothing wrong with it. But for me there is something wrong with it: it does not satisfy my personal curiosity about the program. I guess we all have different agendas.
Jim
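
For readers weighing the +145 and -50 Elo figures quoted above, here is a rough sketch of how an Elo difference maps to a score fraction under the standard logistic Elo formula, and how wide the uncertainty can be for a small sample. The thread does not say how the 266 games were split between opponents, so the 50 games per opponent below is purely a hypothetical figure, and the function names are illustrative. The interval treats each game as an independent win/loss trial and ignores draws, so it likely overstates the margin somewhat.

```python
import math

def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Inverse of expected_score: Elo difference implied by a score fraction."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(score, games, z=1.96):
    """Rough ~95% interval for the Elo difference.

    Assumption: each game is an independent win/loss trial (draws ignored),
    so this is a crude binomial approximation, not a rating-tool result.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    low = max(score - z * se, 1e-6)        # clamp to keep the log finite
    high = min(score + z * se, 1.0 - 1e-6)
    return elo_from_score(low), elo_from_score(high)

if __name__ == "__main__":
    hypothetical_games = 50  # assumption: per-opponent game count is not given
    for name, diff in [("Shredder 9 UCI", 145), ("Fritz 8", -50)]:
        score = expected_score(diff)
        low, high = elo_interval(score, hypothetical_games)
        print(f"{name}: score {score:.3f}, Elo {diff:+d}, "
              f"rough interval {low:+.0f} to {high:+.0f}")
```

With so few games per opponent the intervals come out wide, which is one thing to keep in mind before reading much into a single head-to-head result.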
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.