Author: Enrique Irazoqui
Date: 15:51:13 02/12/99
On February 12, 1999 at 14:06:49, KarinsDad wrote:

>Could a large test suite (200+ positions) with random opening, middlegame, and
>endgame positions be created that could then be compared against programs?
>
>Would this make more sense as compared to the contrived test suites which
>attempt to have weird or difficult positions to analyze?

I don't think it is just a matter of quantity. I know that some programmers use test suites of a few hundred positions, even one thousand in one case, and they are still wise enough to mistrust them completely when they have to decide which beta version is best. You can keep adding positions in this sort of brute-force approach without knowing whether these positions are valid samples of what a program will have to deal with in real-life games.

Tiger comes to mind. In the endings it has no idea about bad bishops, Philidor and Lucena endings, draws in KP vs. KQ when the pawn is on the seventh rank on the a, c, f, or h files, etc. Still, Tiger is better than most in the endgames I have seen it play. Yet it does badly in endgame tests, because these tests are full of Lucenas and Philidors and bad bishops, and they don't reflect real life.

I mean: how many positions does it take to check the playing ability of a program in specific rook endings like the Philidor? How many, and of which kind, about passed-pawn evaluation? Etc. Go brute force and you will end up with a result that won't work as a reflection of reality. And this is only the endgame, which is so much more systematically structured and studied than the middlegame. How do you approach the middlegame test sets? What kind of tactics do we examine, what kind of positional knowledge, and in what proportion?

Imagine the disaster of an IQ test consisting of 200 or 1000 questions put together with the brute-force approach, without prior and systematic knowledge about the weight of each question in relation to what the tester is looking for. And after all, we are talking about an IQ test for programs.
We just don't have this kind of systematic, explanatory knowledge of chess. So we'd better rely on "intuition" than on test sets, no? It is a more scientific approach. :)

Enrique
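The weighting problem raised above can be made concrete. Here is a minimal sketch (all positions, themes, weights, and moves below are invented for illustration, not taken from any real suite): instead of counting every solved position equally, each test position carries a weight meant to reflect how often its theme actually occurs in real games. An engine that aces the rare textbook endings but misses a common theme then scores worse than a raw count would suggest.

```python
# Hypothetical illustration: weighted scoring of a test suite.
# Each entry: (position id, theme, weight, expected best move).
# The weights are invented; a real suite would have to derive them
# from the frequency of each theme in actual games.
SUITE = [
    ("pos1", "philidor",    0.05, "Ra6"),
    ("pos2", "lucena",      0.05, "Rc1+"),
    ("pos3", "passed_pawn", 0.40, "b6"),
    ("pos4", "bad_bishop",  0.10, "f5"),
    ("pos5", "king_safety", 0.40, "g3"),
]

def weighted_score(engine_moves):
    """Fraction of total weight earned by matching the expected moves."""
    total = sum(w for _, _, w, _ in SUITE)
    earned = sum(w for pid, _, w, best in SUITE
                 if engine_moves.get(pid) == best)
    return earned / total

# An engine that solves both textbook rook endings but misses the
# (heavily weighted) passed-pawn position:
moves = {"pos1": "Ra6", "pos2": "Rc1+", "pos3": "a4",
         "pos4": "f5", "pos5": "g3"}
raw = sum(1 for pid, _, _, best in SUITE if moves.get(pid) == best)
print(f"raw score: {raw}/{len(SUITE)}")          # 4/5 looks strong
print(f"weighted score: {weighted_score(moves):.2f}")  # 0.60 does not
```

The point of the sketch is exactly Enrique's objection: without prior knowledge of the weights, a brute-force pile of positions silently assumes every theme matters equally.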
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.