Author: Robert Hyatt
Date: 09:03:20 12/29/97
On December 29, 1997 at 11:42:08, Don Dailey wrote:

>On December 29, 1997 at 01:34:35, Bruce Moreland wrote:
>
>>
>>On December 28, 1997 at 23:38:12, Don Dailey wrote:
>>
>>>I did a really interesting study once several years ago. I took
>>>a small problem set and adjusted the weights to predict the Swedish
>>>ratings of several programs. You can use various methods to do
>>>this, I used a genetic algorithm. I was able to come up with a
>>>formula which was very accurate, within about 10 points for ANY
>>>program that was involved in the test.
>>
>>If you produced a formula that would accurately predict the Elo of 75%
>>of the known programs, would it accurately predict the Elo of the
>>remaining 25% without tweaking it?
>>
>>bruce
>
>That's a great question. This one could be tested without too much
>trouble if I were to repeat the test.
>
>I have a feeling the important thing is to start with as many programs
>as possible. If I tuned to 2 or 3 programs it would not predict
>well because it could take too many liberties to get those ratings
>just right. But if I started with many it might be "forced" to come
>up with realistic weights that reflected some kind of reality.
>
>But of course with more programs the procedure would probably not
>do as well with the worst case, it's unlikely I would get within
>10 rating points with all my initial testee's.
>
>-- Don

As I read this I chuckled internally, thinking of the test Larry Kaufman wanted to try on Cray Blitz in Indianapolis at the ACM event. He had a formula he was convinced was *very* accurate in matching a program's results to its "SSDF" equivalent rating. You probably remember the humorous result, where Cray Blitz solved almost all of the test positions in under a second, which totally blew his formula out the window, because CB was *not* a 2600+ program.

I don't trust any formula that was "fit" to a known set of programs.
That's a simple least-squares solution to fitting a polynomial to a known set of data points, and obviously it is accurate for the points along the curve, since that's how the curve was derived in the first place. But when you toss in a new program that is *not* similar to the others (CSTal comes to mind) then this "formula" is not just wrong, but *badly* wrong. Ditto for any program that is somehow different from the programs used to produce the formula... The basic flaw here is called "statistical inbreeding"...
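To make the curve-fitting point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the test-suite scores and ratings are invented numbers, not SSDF data, and the names `lagrange_fit` and `linear_fit` are hypothetical helpers. With six programs and a degree-5 polynomial, the least-squares fit passes through every point exactly, so Lagrange interpolation is used as a stand-in for that over-flexible fit. A plain straight-line least-squares fit is included for comparison. The "new program" scores far outside the range the formula was tuned on, much like a program that solves nearly every test position.

```python
def lagrange_fit(xs, ys):
    """Polynomial of degree len(xs)-1 through every (x, y) point.

    With as many coefficients as data points, this is what a
    least-squares polynomial fit reduces to: zero error on the
    programs used for tuning.
    """
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = float(yi)
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def linear_fit(xs, ys):
    """Ordinary least-squares straight line: rating = a + b * score."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


# INVENTED data: (test-suite score, rating) for six "known" programs.
train_x = [10, 12, 14, 16, 18, 20]
train_y = [2108, 2144, 2210, 2241, 2307, 2345]

# INVENTED new program: very high test score, modest real rating.
new_x, new_y = 30, 2400

interp = lagrange_fit(train_x, train_y)
line = linear_fit(train_x, train_y)

# Error on the programs the formula was tuned to.
worst_train_error = max(abs(interp(x) - y) for x, y in zip(train_x, train_y))
print("worst error on tuned programs:", worst_train_error)

# Error on the program that was not part of the tuning set.
print("exact-fit polynomial predicts:", round(interp(new_x)))
print("straight line predicts:       ", round(line(new_x)))
print("invented 'true' rating:       ", new_y)
```

The exact-fit polynomial is perfect on the six tuned programs and useless on the seventh, while the straight line is merely off. Neither is right, which is the point: agreement with the tuning set says nothing about a program unlike those in it. Holding some programs out of the tuning set, as Bruce suggests, only tests the formula on programs similar to the ones that were held out.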
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.