Author: Stephen Ham
Date: 13:11:16 12/07/05
On December 07, 2005 at 13:09:43, Paul Jacobean Sacral wrote:

>On December 07, 2005 at 11:50:10, Stephen Ham wrote:
>
>>result was that Toga II won (scoring 100%!), Rybka scored 50% and Junior 9 came
>>last.
>
>Does that mean that each engine has played only three games?? You really
>shouldn't draw any conclusions from three (or six) games each, only. Even less,
>if you didn't use the same book for all engines.

Hello Paul,

Yes, the game sample was very small. However, I've run several tests of Rybka on correspondence chess positions that I've played in the past and am intimately familiar with as a human player. I've invested many hours in these positions in my games and hence feel qualified to judge the engine's output. I have done this with all my best engines and can thus cross-compare results (e.g. move selection, evaluation, speed of candidate selection, etc.). Please see my reviews of engines at Chesscafe.com for examples.

Coupling my tests of these correspondence chess positions with various analyses of positions found in those three games does indeed give me a feel for Rybka's performance characteristics relative to other engines, Paul.

As I made clear many times, I'm working with minimal data. Nonetheless, my analyses are qualitative, while I see the work of CEGT as quantitative. Both are meaningful, though. And as I stated, I plan to continue my analyses, so those were merely my initial experiences (as stated in my subject line).

Paul, you made a comment regarding the use of different books. Maybe there's something to your complaint, although I respectfully disagree with you. I think that in quantitative tests, there's probably value in sharing the same book among engines. Still, some lines within one book are good, while others are less than good, or outright bad. So it's still a "crap shoot" which line the engine is given.
Instead, in qualitative tests, given the very small quantity of games played so far (because I'm interested exclusively in performance at long time controls), I think it best to try to match the perceived characteristics of the engine to the most appropriate book. Rybka's been touted to have positional skills exceeding those of other engines. In fact, the default setting is "positional." So it only makes sense to give it positions where it will most likely display its best features. Therefore in qualitative analysis, it seems to me there's value in not sharing the same book. After all, some books may be more suitable for the tuning of one engine and detrimental to others, producing skewed results if they all share it.

Paul, I appreciate the efforts of CEGT, but I consider 40/40m to be "speed chess". Yes, speed chess will produce a larger quantity of games per unit of time. For those who play speed chess with their engines, there's definite value in those tests. Instead, I'm interested exclusively in engine performance at classical time controls, or longer. We already have evidence that engines that dominate at blitz chess are not necessarily dominant at classical time controls, or vice versa.

All the best,

Steve

>
>CEGT has included 540 games played at 40/40m, in their rating list:
>
>http://kd.lab.nig.ac.jp/chess/cegt/rating-table-shifted.shtml
>
>Yours truly Paul J. Sacral
Last modified: Thu, 15 Apr 21 08:11:13 -0700