Author: F. Jermann
Date: 05:17:49 01/28/00
On January 28, 2000 at 06:38:46, Jouni Uski wrote:

>On January 28, 2000 at 05:28:11, Matthias Wuellenweber wrote:
>
>>If you take a whole game as one probabilistic event, the number of games needed
>>to ascertain playing strength rankings seems depressing and Christophe's program
>>pointedly illustrates this. The error margin goes down only with roughly
>>1/sqrt(N).
>>
>>However from practical experience this doesn't feel right, the result
>>fluctuation seems narrower than expected from statistical distributions.
>>
>>As my old buddy Thorsten Czubics, an eminent critic of statistics, always used
>>to say: "Pah, I only need to look at one game to see whether a program is good".
>>I think there is a grain of truth in this.
>>
>>A computer chess game is not a single random event but a string of them. There
>>are N crucial turning points in a game where finding "the better move" could
>>strongly influence or even decide the outcome of the game. For each of those N
>>crucial points the stronger program has a certain chance to succeed, the weaker
>>program a chance to stumble.
>>
>>N could be quite high, not much lower than the game length in full moves.
>>
>>This means that one needs much less games to measure relative playing strength
>>than expected from the "one result = one chance event" angle.
>>
>>A better way to undermine the overconfidence in result counting could be the
>>disturbing influence of hardware and time controls. Hiarcs 7.32 seems to get
>>problems against the brand new programs on fast machines at long time controls.
>>However it always shines brilliantly in Blitz on, say, 500Mhz.
>>
>>Matthias Wüllenweber
>
>I have seen many calculation about how many games is needed to find out better
>program. E.g. in SSDF list You need at least 200 games to get good indication
>about strength. But among human players much less games is needed. Why?
>With only about 10 games per tournament usually the best player (=Kasparov)
>wins! What's the difference?
>
>Jouni

I think the difference is the gap in playing strength between Kasparov and the
others! If you take Fritz 5.32 for example and let it play several blitz
tournaments with 10 or 12 rounds against public domain winboard engines, you
will see it wins almost every tournament (like Kasparov).
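To put rough numbers on this, here is a minimal sketch of how the number of games needed shrinks as the strength gap grows. It is my own back-of-the-envelope model, not anything from SSDF: it assumes the standard logistic Elo formula for the expected score, treats every game as an independent win/loss event with no draws (draws would lower the variance, so this overestimates somewhat), and asks how many games it takes before the expected lead over 50% is about two standard errors. The function names and the two-sigma threshold are illustrative choices.

```python
import math


def expected_score(elo_gap: float) -> float:
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


def games_needed(elo_gap: float, z: float = 2.0) -> int:
    """Games needed until the expected lead over 50% equals z standard errors.

    Assumption: each game is an independent win/loss trial (no draws), so the
    per-game standard deviation is sqrt(p * (1 - p)) and the standard error of
    the average score after N games is that value divided by sqrt(N).
    """
    if elo_gap <= 0:
        raise ValueError("elo_gap must be positive")
    p = expected_score(elo_gap)
    lead = p - 0.5                      # how far above 50% the stronger side scores
    sigma = math.sqrt(p * (1.0 - p))    # per-game standard deviation
    return math.ceil((z * sigma / lead) ** 2)


if __name__ == "__main__":
    # Compare a small gap (two closely matched programs) with large gaps.
    for gap in (20, 50, 100, 200, 400):
        print(f"Elo gap {gap:3d}: expected score {expected_score(gap):.3f}, "
              f"about {games_needed(gap)} games")
```

Under these assumptions a gap of a few hundred points shows up within a handful of games, while two programs a few dozen points apart need hundreds of games or more, which matches the 1/sqrt(N) error margin mentioned in the quoted post.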
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.