Author: Ratko V Tomic
Date: 11:27:08 10/06/99
Good points about learning. For a reproducible and fair test with learning on, one would need to symmetrize the sequence of matchups by reinstalling the programs from scratch and permuting their order. That would take too much time.

> Obviously to say "20 games are enough, i don't care about
> statistic-theory" shows that you are kind of, uhm, "special".

I think this is a bit of a harsh judgment. A typical game may have over 50 moves where the program's judgment is on the line, so you really have a test suite of over 1000 positions on which to look at the quality/strength of the program's "thinking".

Most of us here, when we get a new chess program, play several games against it right away (usually after running it through a handful of our favorite test positions, the ones we have tested other programs with), and within a couple of games we have a pretty good idea of how strong the program is. After all, by that point we have thought through several hundred positions in depth along with the program, with immediate comparison & feedback on our judgments.

In real-life situations calling for evaluation (be it hiring someone for a job, marriage, or buying a car), you usually have a much smaller sample of situations after which you need to issue a judgment. Technically, small samples of games are still OK if examined intelligently. Only a mechanistic/blind point-counting method of testing needs large samples, since out of a whole game it keeps under 2 bits of information (1.5849.., i.e. log2 of the 3 possible outcomes) from a game which has a thousand or more bits of information.
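For concreteness, here is a minimal back-of-the-envelope sketch of that last comparison in Python. The position count per game and the branching factor are assumed illustrative numbers, not figures from the post, so treat the output as an order-of-magnitude check only:

    import math

    # The bare game result has at most 3 outcomes (win/draw/loss),
    # so a single result carries at most log2(3) ~ 1.585 bits.
    result_bits = math.log2(3)

    # Assumed figures: ~50 positions per game where the program's judgment
    # is on the line, with roughly 30 plausible moves to choose from in each,
    # i.e. about log2(30) bits per decision.
    positions_per_game = 50      # assumption
    choices_per_position = 30    # assumed average branching factor
    move_bits = positions_per_game * math.log2(choices_per_position)

    games = 20
    print(f"one game result:      {result_bits:.3f} bits")
    print(f"one game's decisions: {move_bits:.0f} bits (under these assumptions)")
    print(f"{games} game results:       {games * result_bits:.1f} bits")
    print(f"{games} games' decisions:   {games * move_bits:.0f} bits")

Even with these rough numbers, 20 results by themselves carry only about 32 bits, while the decisions behind them carry thousands; that is the gap the blind point-counting approach throws away.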