Author: Dann Corbit
Date: 18:56:26 08/28/00
On August 28, 2000 at 21:38:27, Peter Skinner wrote:

>>Every little bit helps, of course, but a doubling in CPU power is only worth
>>40-80 ELO (depending on who you ask). At any rate, assuming a 60 ELO
>>difference, the win expectancy would only be about 60% for the double speed box,
>>so it would take a huge number of games to even be able to clearly identify
>>which of the two engines+CPU combinations was the strongest with "black box"
>>testing.
>
>So really, 60 elo isn't that much, so the smaller hardware computer should do
>just fine?

The win expectancy for an ELO difference of 60 points is:
0.414501321328191 for the weaker program/machine combination.

Which means that over a very large number of games, you would expect the stronger program to win 58.55% of the points. As you can see, that is a small edge (only about 8.55 percentage points above an even score) between two very evenly matched opponents.

Now, someone hands you a coin and says, "I think this coin is biased!" You could flip it twenty times, notice that you did not get exactly ten heads and ten tails, and conclude that they were right. But if you take a fair coin and try it, you are very unlikely to get an exact 10/10 split in one trial of twenty flips.

Hence, we must ask ourselves: "How many trials must I make to find out if the coin is fair or not?" The answer will depend on the degree of the bias and our desired level of certainty.

For programs that are very closely matched (and 60 ELO *is* very closely matched) it will be necessary to run HUNDREDS of games (at a _bare minimum_) to come up with any sort of reasonable conclusion.

Testing against a single opponent (such as yourself at a different clock rate) is also a very bad procedure for finding a true ELO measure. You really have to run hundreds of games against many different and well-known opponents in order to get a solid figure.
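For anyone who wants to check the arithmetic, here is a rough Python sketch. It assumes the usual logistic ELO formula, which reproduces the 0.4145 figure above. The function names are made up for illustration, and the games estimate is only a crude normal approximation that assumes independent games with no draws, so treat it as a floor rather than a recommendation.

```python
import math

def expected_score(elo_diff):
    """Expected score for a side rated elo_diff points above its opponent
    (logistic ELO model; pass a negative value for the weaker side)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def exact_split_probability(flips):
    """Chance that a fair coin gives exactly flips/2 heads in an even number of flips."""
    return math.comb(flips, flips // 2) / 2.0 ** flips

def games_needed(elo_diff, z=1.96):
    """Crude game count at which the stronger side's expected score sits
    z standard errors above 50%. Assumption: independent games, no draws,
    so each game has variance p * (1 - p)."""
    p = expected_score(elo_diff)
    return math.ceil(z * z * p * (1.0 - p) / (p - 0.5) ** 2)

if __name__ == "__main__":
    print("weaker side, 60 ELO down:", expected_score(-60))
    print("stronger side, 60 ELO up:", expected_score(60))
    print("fair coin, exact 10/10 split:", exact_split_probability(20))
    print("rough minimum games at 60 ELO:", games_needed(60))
```

Even with these generous assumptions the estimate comes out above a hundred games, and at that count a real 60 ELO edge would still be missed about half the time. Asking for a more reliable verdict, or testing a smaller rating gap, pushes the number up quickly.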
Last modified: Thu, 15 Apr 21 08:11:13 -0700