Author: Garry Evans
Date: 21:46:02 01/11/01
On January 11, 2001 at 21:29:54, Dann Corbit wrote:

>On January 11, 2001 at 20:48:44, James T. Walker wrote:
>>You seem to be the one who is emotional and irrational now. Why go off on the
>>deep end on something which is really simple? I was simply suggesting to take
>>the games played by top programs in the last year or so and consider them all as
>>one player.
>
>And I responded that this has no mathematical basis. There are many reasons
>why. Let me give one model to explain why.
>Fred writes faulty chess programs. All of them have a flaw that will be exposed
>over time. But he writes a new program each day. If you play Fred's programs
>once, you will be unlikely to find the flaw. If we play 365 games with Fred's
>programs against rated opponents, we will get a rating. But if we play just one
>of the programs against the same opponents we will get a wildly different
>rating.
>
>This model sounds silly. But if you are a computer programmer, you know that it
>actually models the true situation very well.
>
>Now, allow me to give a reasoning point. Some program such as Rebel or Hiarcs
>has tendencies. These tendencies could be studied and exploited. If I play a
>thousand games against one program I may learn a way to beat it. If I play a
>thousand games against a thousand programs, I am far less likely to learn a way
>to beat it.
>
>>It is perfectly logical to assume that if only one program is of GM
>>strength which many people claim is not, and you add the results of other
>>programs to the statistics, you are taking a worst case scenario. This is true
>>because the other programs surely are not GM strength if even 1 is not GM
>>strength. This might give you enough games combined to determine the "average"
>>strength of top programs today vs humans. Your main contention seems to be that
>>there is not enough data to determine what the strength of Rebel is but you
>>don't suggest how many games vs humans it would take to establish the fact one
>>way or the other.
>
>You will never prove it conclusively, but after a few hundred games you can
>offer a statistical argument. In the case of a super GM (e.g. 2600+ ELO) you
>could prove with a 2/3 probability that they were of GM (2500 ELO) strength
>after only one hundred games or so. The error bar would be about 100 and hence
>the odds that the center point was below 2500 would be established.
>
>>How many games does it take for a human to establish
>>himself/herself as equal to a GM in strength?
>
>I think that there are two questions here.
>1. What are the qualifications of a GM?
>This is answered by the bylaws of FIDE [or other governing body]
>2. How can we prove that someone is of GM strength?
>The second is answered when we can mathematically demonstrate within an agreed
>error bound that the ELO rating of a player must be at least 2500.
>
>Note that these are two different questions with two different answers.
>
>>What is GM strength? Maybe you
>>can come up with a number which would satisfy most people or at least yourself.
>>It's kind of like fuzzy logic.
>
>Let's use the definition of 2500 ELO against the same category of talent that is
>necessary to obtain a GM norm. The games must be at 40/2 and the games must be
>under tournament conditions. Indeed, a precise definition of what we are trying
>to prove is crucial to being able to prove it.
>
>>It becomes an easier and simpler way to arrive
>>at the answer without demanding you go exactly where you want to go on the first
>>try. It's obvious that computers will never hold a GM title because has made
>>this much more difficult for computers than humans. So the only thing I know to
>>do is to come up with some figures which most people agree is equal to a GM. If
>>you can't do this then you may never agree that computers are at last equal to a
>>GM even when computers are beating the pants off of GMs.
>>So what I was suggesting was to take the last X number of games by computers vs
>>GMs and treat them as one player.
>
>This is invalid.
>
>> If this "Average" computer is of GM strength
>>then seems to me we have some GM strength computers.
>
>How does one quantify "it seems to me" mathematically?
>
>>If they don't measure up
>>now then we have not proven that there are no GM computers but at least we prove
>>that as a whole they are not there yet. Of course you would want to choose the
>>best few computers which will give you enough games vs humans to establish yes
>>or no. (Not a C64) Say if it takes 40 or 50 games to satisfy you that computers
>>have reached GM strength then use as many of the top computer vs human games you
>>need to get the 40 or 50 games. So the bottom line is if you can't decide how
>>many games it takes and what rating is equal to a GM then you will never answer
>>the question.
>
>The number of games is easily decidable, but is also a function of the
>competition. The better known the ELO of the competition, the more accurate
>will the rating be for the new player to be evaluated. If they have played
>thousands of rated games, then they will be supremely useful tools for that
>evaluation. If you look at the output of ELOSTAT (for instance) you will see a
>+ and a - figure for ELO value. That represents the error bar of the
>calculation for one standard deviation. That means that there is a 2/3
>probability that the actual mean lies between those two values, and about a 95%
>chance that it lies within a bar of double that width.
>
>> But if you can do that then maybe you can have the answer
>>already.
>
>Knowing how to formulate the question properly does not mean that we already
>have the answer, but it is a crucial first step.
>
>>Or maybe you're not interested in the answer but just like to argue.
>
>Passing judgement on someone's intent is always a sure sign that you have run
>out of useful arguments. I don't particularly like to argue, but if I think
>that someone is wrong, then I will say that I think they are wrong and I will
>tell the reasons why.
>
>I don't see anything particularly onerous or evil in that.

Actually, you come off as very argumentative to me. I think you're more interested in showing everyone how "intelligent" you are than in what the truth is in this matter. I think that it is rather juvenile to behave this way. How old did you say you were? You remind me of a first year college student who is overly anxious to let everyone know how much he knows.
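
For anyone who wants to check the error-bar reasoning quoted above, here is a rough Python sketch. It is not ELOSTAT's actual algorithm, just a simple normal approximation under assumptions that are mine: the logistic Elo curve, independent games, and opponent ratings treated as exact and replaced by their average. The function names and the sample record (40 wins, 40 draws, 20 losses against 2500-average opposition) are made up for illustration.

```python
import math


def elo_diff(score):
    """Rating difference implied by a score fraction (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def performance_with_error(wins, draws, losses, opp_avg):
    """Return (performance rating, one-standard-deviation error in Elo).

    Assumptions (not from the thread): games are independent, opponent
    ratings are exact, and the score is strictly between 0% and 100%.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se_score = math.sqrt(var / n)
    # Delta method: slope of the Elo curve at the observed score.
    slope = 400.0 / (math.log(10.0) * mean * (1.0 - mean))
    return opp_avg + elo_diff(mean), slope * se_score


def coverage(k):
    """Probability that a normal variable lies within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


def prob_at_least(threshold, perf, sigma):
    """Normal-approximation probability that the true rating >= threshold."""
    z = (perf - threshold) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    perf, sigma = performance_with_error(40, 40, 20, opp_avg=2500)
    print("performance: %.0f +/- %.0f" % (perf, sigma))
    print("within 1 sigma: %.3f, within 2 sigma: %.3f"
          % (coverage(1), coverage(2)))
    print("P(rating >= 2500): %.3f" % prob_at_least(2500, perf, sigma))
```

The independence assumption is the very point in dispute: if games from several different programs are pooled as "one player", the results no longer come from one fixed strength, and the error bar computed this way understates the real uncertainty.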
Last modified: Thu, 15 Apr 21 08:11:13 -0700