Author: Michael Neish
Date: 19:55:51 01/27/00
Hi,

Before I get flamed: by "dummy" I mean fake. I'm not calling anyone stupid. :) Anyone could do this in a few minutes.

I ran a Cadaques-style tournament between seven fictitious computer programs, i.e., every program plays a 20-game match against each of the other six. That makes 120 games per program and 420 games in total for the whole tournament (21 pairings x 20 games).

I made the following assumptions:

1) The computers are all of equal strength.
2) In each game, the probabilities of a win, a draw and a loss are one-third each.

These assumptions are for simplicity's sake only. If anyone can suggest better win/draw/loss probabilities please let me know, although for the sake of this post I don't think they make much difference.

Okay, onto the results. To get the full flavour of things one would have to run many thousands of "Cadaques" tournaments and look at the gross results. Here I will reproduce only the first ten tournament results that popped out of my program. The programs are all of equal strength, remember. Apologies if your monitor doesn't line up the figures very well.
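For anyone who wants to try this at home, here is a minimal sketch of the setup described above. It is not the program that produced the results in this post; the function names, the seed and the use of Python's random module are all assumptions made for illustration. Only the rules (seven equal programs, 20-game matches, win/draw/loss at one-third each) come from the post.

```python
import random
from itertools import combinations

# Program names as they appear in the post; everything else is hypothetical.
PROGRAMS = ["HiSparks", "Grits 6a", "Terebul Mouse", "Petunia 6",
            "Terebul Century", "Toddler 4", "Bimbo 7.32"]
GAMES_PER_MATCH = 20


def play_tournament(rng):
    """Play one round-robin of 20-game matches between equal programs.

    Returns a list of (name, score) pairs sorted from winner to last place.
    """
    scores = {name: 0.0 for name in PROGRAMS}
    for a, b in combinations(PROGRAMS, 2):       # 21 pairings
        for _ in range(GAMES_PER_MATCH):
            # Win, draw or loss for program a, each with probability 1/3.
            result = rng.choice((1.0, 0.5, 0.0))
            scores[a] += result
            scores[b] += 1.0 - result
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def spread(table):
    """Return (winner - last place, winner - runner-up) for a sorted table."""
    points = [score for _, score in table]
    return points[0] - points[-1], points[0] - points[1]


if __name__ == "__main__":
    rng = random.Random(2000)                    # arbitrary seed, for repeatability
    for number in range(1, 11):
        table = play_tournament(rng)
        top_to_bottom, top_to_second = spread(table)
        print(f"Tournament {number}")
        for name, score in table:
            print(f"  {name:<16} {score:5.1f}")
        print(f"  winner - loser     = {top_to_bottom}")
        print(f"  winner - runner-up = {top_to_second}")
```

A different seed gives different tables, of course, so the numbers will not match the ones in this post. Raising the number of tournaments from ten to a few thousand is the obvious next step for looking at the gross results.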
Tournament 1

HiSparks          68.5
Grits 6a          63
Terebul Mouse     61
Petunia 6         60.5
Terebul Century   57
Toddler 4         56
Bimbo 7.32        54

winner's score - loser's score     = 14.5
winner's score - runner-up's score = 5.5

Tournament 2

Terebul Mouse     68.5
Toddler 4         64.5
Petunia 6         63
Bimbo 7.32        61
HiSparks          56
Terebul Century   56
Grits 6a          51

winner's score - loser's score     = 17.5
winner's score - runner-up's score = 4

Tournament 3

HiSparks          72
Terebul Mouse     67
Grits 6a          62
Petunia 6         56
Bimbo 7.32        55
Toddler 4         54
Terebul Century   54

winner's score - loser's score     = 18
winner's score - runner-up's score = 5

Tournament 4

Terebul Century   65.5
Grits 6a          62.5
Bimbo 7.32        61
Terebul Mouse     59.5
Toddler 4         59
Petunia 6         57.5
HiSparks          55

winner's score - loser's score     = 10.5
winner's score - runner-up's score = 3

Tournament 5

Terebul Mouse     64
Grits 6a          63.5
Bimbo 7.32        61
Toddler 4         61
HiSparks          57.5
Petunia 6         57
Terebul Century   56

winner's score - loser's score     = 8
winner's score - runner-up's score = 0.5

Tournament 6

Bimbo 7.32        63
Terebul Century   62.5
Terebul Mouse     62
Toddler 4         61.5
Petunia 6         60
Grits 6a          57
HiSparks          54

winner's score - loser's score     = 9
winner's score - runner-up's score = 0.5

Tournament 7

Bimbo 7.32        69
Grits 6a          64
Toddler 4         62.5
Terebul Century   60.5
Petunia 6         58
HiSparks          57.5
Terebul Mouse     48.5

winner's score - loser's score     = 20.5
winner's score - runner-up's score = 5

Tournament 8

HiSparks          64.5
Toddler 4         64
Terebul Century   61
Petunia 6         59.5
Terebul Mouse     59.5
Grits 6a          58.5
Bimbo 7.32        53

winner's score - loser's score     = 11.5
winner's score - runner-up's score = 0.5

Tournament 9

HiSparks          63
Bimbo 7.32        63
Grits 6a          60.5
Petunia 6         59.5
Terebul Mouse     59.5
Toddler 4         57.5
Terebul Century   57

winner's score - loser's score     = 6
winner's score - runner-up's score = 0

Tournament 10

Terebul Mouse     69
Bimbo 7.32        61
HiSparks          60.5
Terebul Century   59.5
Petunia 6         58
Toddler 4         57
Grits 6a          55

winner's score - loser's score     = 14
winner's score - runner-up's score = 8
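The spreads in these ten tournaments can be averaged with a few lines of code. The sketch below simply retypes the first, second and last scores from the results listed above; the variable names and the layout of the data are my own assumptions, and 120 is the number of games each program plays (six matches of 20 games).

```python
# (winner, runner-up, last place) scores, copied from the ten tournaments above.
RESULTS = [
    (68.5, 63.0, 54.0),   # Tournament 1
    (68.5, 64.5, 51.0),   # Tournament 2
    (72.0, 67.0, 54.0),   # Tournament 3
    (65.5, 62.5, 55.0),   # Tournament 4
    (64.0, 63.5, 56.0),   # Tournament 5
    (63.0, 62.5, 54.0),   # Tournament 6
    (69.0, 64.0, 48.5),   # Tournament 7
    (64.5, 64.0, 53.0),   # Tournament 8
    (63.0, 63.0, 57.0),   # Tournament 9
    (69.0, 61.0, 55.0),   # Tournament 10
]
GAMES_PER_PROGRAM = 120   # 6 opponents x 20 games


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


avg_first_minus_second = mean(first - second for first, second, _ in RESULTS)
avg_first_minus_last = mean(first - last for first, _, last in RESULTS)
avg_winning_score = mean(first for first, _, _ in RESULTS)
avg_winning_percent = 100.0 * avg_winning_score / GAMES_PER_PROGRAM

print(f"First - Second program = {avg_first_minus_second:.2f} points")
print(f"First - Last program   = {avg_first_minus_last:.2f} points")
print(f"Winning score          = {avg_winning_score:.1f} points "
      f"(= {avg_winning_percent:.1f}% score)")
```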
--------------------------------------------------

If you've got this far in the message, what does this prove? Well, I'm not sure! These are only ten simulations. But it does show that a spread is expected on statistical grounds alone.

In Tourney 10 there is an 8-point difference between the first and second programs. In Tourney 7 there is a 20.5-point gap between top and bottom, and also a 9-point gap between last place and next-to-last place. I wonder how many football managers would be pressurised into resigning for such a pitiful score as the one in Tourney 7. Poor man: his team is just as good as the others.

But on average:

First - Second program = 3.2 points
First - Last program   = 12.95 points
Winning score          = 66.7 points (= 55.6% score)

Again, these are very few simulations. I didn't look at the scores for each individual match, but I'm sure there is an even greater relative variation within individual matches, which is then evened out a little because a program that does badly in one match can make up for it in another. If anyone is interested I will give the actual breakdown of the results for these same ten tournaments.

It will be interesting to compare these results with the real Cadaques results once the tournament is over, although it seems that there will be a larger gap between the programs there. But of course, they are not of equal strength, and I've read that there are also some problems when the Rebel programs are run on Autoplay.

I hope this was interesting. It's not easy to see who is best, even in a 420-game tournament.

Cheers,

Mike.
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.