Author: Peter Kappler
Date: 09:56:41 07/29/00
On July 29, 2000 at 02:31:59, Ricardo Gibert wrote:

>On July 28, 2000 at 12:34:06, Peter Kappler wrote:
>
>>On July 28, 2000 at 06:33:57, Ricardo Gibert wrote:
>>
>>>On July 28, 2000 at 02:12:50, Peter Kappler wrote:
>>>
>>>>On July 28, 2000 at 01:23:55, Dann Corbit wrote:
>>>>
>>>>>On July 28, 2000 at 01:15:46, Peter Kappler wrote:
>>>>>
>>>>>>On July 28, 2000 at 00:50:09, Ratko V Tomic wrote:
>>>>>>
>>>>>>>Well, you're unjust to Thorsten. The rating calculations
>>>>>>>extract very little data from each game, about 1.58 bits
>>>>>>>per game (i.e. log2(3)). On the other hand, each ply contains
>>>>>>>about 5-6 bits of data, or for a 100 ply game you have 500
>>>>>>>bits of data produced. Hence the conventional rating tests
>>>>>>>based on the 3-way game result are very highly inefficient;
>>>>>>>they keep about 0.3 percent of info produced in game.
>>>>>>
>>>>>>Why 5-6 bits per ply? Just enough to represent an approximate evaluation of the
>>>>>>position?
>>>>>>
>>>>>>>
>>>>>>>The advantage of ratings over the more efficient information
>>>>>>>extractors (such as the human brain) is that one can compute
>>>>>>>such a rating without even knowing how to play chess. Another
>>>>>>>advantage is that they're not biased by human subjective judgment
>>>>>>>(the ratings may manifest other biases which reduce their
>>>>>>>predictive power, especially when extrapolating to a new opponent
>>>>>>>from a small number of earlier opponents). A human chess player
>>>>>>>likely extracts 100 times more info per game than the mechanical
>>>>>>>rating calculator, and the stronger the player the more info he
>>>>>>>can extract.
>>>>>>>
>>>>>><snip>
>>>>>>
>>>>>>
>>>>>>Well said. I have always felt this way, and seeing the idea explained so
>>>>>>eloquently is comforting in a strange way. :)
>>>>>
>>>>>I don't believe it for a minute.
>>>>>
>>>>>I have seen too many times when someone is completely wrong in their assessments
>>>>>to fall for it.
>>>>
>>>>
>>>>What he says makes more sense if you assume a strong player is making the
>>>>assessments.
>>>>
>>>>I'd venture that a GM can estimate a player's rating to within +/- 200 points by
>>>>just analyzing one game. I think the success rate would be at least 80%.
>>>>
>>>No. I'm over 2200 USCF and I don't think this is a good way to estimate a
>>>player's ability. There are several reasons why I think this. Some based on
>>>practical experience and some based on my understanding of statistics.
>>>
>>
>>OK, but I'm around 2100 USCF, so I think my opinion counts, too. :)
>>
>>
>>>I remember playing an A-player in a tournament and he was able to create an
>>>incredible amount of pressure in the middlegame. He kept finding incredible
>>>moves I thought no A-player could find. I was barely able to survive and had
>>>come to the conclusion he was way under-rated and that a draw would be a good
>>>result for me, despite the rating differential.
>>>
>>>I made it to an endgame with meager chances to draw. That was when the "strong
>>>player" vanished and he started to play like a C-player. He didn't blow the game
>>>in one move. He made a series of weak moves to blow the game and I wound up
>>>winning!
>>>
>>
>>Remember that I said I thought the GM's success rate would only be 80% given a
>>one game sample. You'll always be able to pick an "outlier" game where a player
>>performed well above or below their true strength. (Though, even in your
>>example above, the guy finally showed his true colors at the end.)
>>
>>>He was 2400 strength in the _particular_ middlegame we played, but only 1500
>>>strength in the endgame. This was a player with _big_ holes in his make-up as a
>>>player. A lot of strong players would have folded up in the middlegame and come
>>>away with the impression this guy was super strong.
>>>
>>>Another possibility is a different kind of middlegame (a closed position) would
>>>have revealed his weaknesses as a player. It all depends on the player.
>>>
>>>>And if you gave him 4 or 5 games to analyze, I'd probably have more faith in the
>>>>GM's estimate than the player's actual rating. :)
>>>>
>>>No. It is even possible in 4 or 5 games that a player is able to get positions
>>>that complement his playing style and he looks like he can do no wrong. There is
>>>no substitute for an objective assessment using a large number of games against a
>>>_variety_ of players.
>>>
>>
>>Sure, more games is better. 5 games definitely isn't enough if you happen to
>>pick a set of exceptionally good/bad games. I guess my main point is that for a
>>given sample size, a GM will do a MUCH better job of estimating playing
>>strength than the ELO formula.
>>
>>
>>>A friend of mine, about 2100 strength, had a record of 5-0 (slow OTB tournament
>>>play) against IM Kamran Shirazi (2550-2600 strength). Their respective styles
>>>were such that he would beat the crap out of him in every game. Luck had nothing
>>>to do with it. In a sixth game, he was crushing him also, but his habitual time
>>>trouble allowed Shirazi to limp away with a draw. My friend was not a very good
>>>blitz player and spoiled a lot of games in the move 30-40 range.
>>>
>>>I think you must conclude that 4-5 games are not enough or my friend is as
>>>strong as Kasparov. Which is it?
>>>
>>
>>I conclude that 4-5 games isn't always enough, especially when they are not
>>selected randomly. :)
>>
>>By the way, the ELO system *would* say that your friend is World Champion
>>strength based on those 4-5 games. A good GM would realize this is not the
>>case.
>>
>Nope! The Elo system says he is 2100. Those games are from his personal record
>against this one particular player. They are from distinct tournaments spread
>out over a couple of years or so. The Elo system says he is 2100 strength and he
>is 2100 strength.

No, the ELO system says he is 2950-3000 for *just* those 4-5 games. If you want
to compare the accuracy of ELO vs GM, you must restrict the rating calculation
to the same set of games.

> It does not matter if the GM agrees or not. It is an objective
>measure. You can't argue against an objective measure. A GM assessment is not an
>objective measure, so it can be and should be argued against. It is not
>scientific. It never will be. Remember, GMs disagree all the time. Which GM is
>right?
>

I guess you missed my point. I was never arguing that the GM can do more with
4-5 games than ELO can do with several hundred. I'm just saying that for a
given small sample, the GM extracts much more information per game, and can
therefore produce a better estimate.

--Peter

>>>Chess playing programs can be similar. The respective opening books can slant
>>>the outcome greatly in one direction or the other. Using the same opening book
>>>does not help, since the types of positions resulting may be limited and slant
>>>things greatly in favor of one program.
>>>
>>>To determine strength accurately, a player, computer or human, needs to be
>>>tested against a random sample from a _population_ of players. This is what a
>>>book on statistics will tell you.
>>>
>>
>>I really do understand the statistics. Perhaps my examples were a bit extreme,
>>but I also think you missed my main point...
>>
>>--Peter
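
For anyone who wants to check the two back-of-the-envelope figures in this thread, here is a rough Python sketch. It is only an illustration under stated assumptions: the function names are made up for this example, the 5 bits per ply is the low end of the 5-6 bit estimate quoted above, and the performance figure uses the common linear approximation (average opponent rating plus 400 times the win/loss margin per game), which is probably, but not certainly, the calculation behind the 2950-3000 number.

```python
import math


def bits_per_result(outcomes: int = 3) -> float:
    """Information in one game result with `outcomes` equally likely results."""
    return math.log2(outcomes)


def fraction_kept(plies: int = 100, bits_per_ply: float = 5.0) -> float:
    """Share of the game's information that survives in the bare result.

    Assumes the thread's estimate of about 5 bits per ply (hypothetical default).
    """
    return bits_per_result() / (plies * bits_per_ply)


def linear_performance(avg_opponent: float, wins: int, draws: int, losses: int) -> float:
    """Linear performance-rating approximation (an assumption, see lead-in).

    performance = average opponent rating + 400 * (wins - losses) / games
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("need at least one game")
    return avg_opponent + 400.0 * (wins - losses) / games


if __name__ == "__main__":
    print(f"bits per game result: {bits_per_result():.2f}")
    print(f"fraction of info kept: {fraction_kept():.2%}")
    # 5-0 against an opponent rated 2550 or 2600
    print(linear_performance(2550, 5, 0, 0))  # 2950.0
    print(linear_performance(2600, 5, 0, 0))  # 3000.0
```

Under these assumptions the 5-0 score against a 2550-2600 opponent comes out at 2950-3000, and the bare result keeps roughly 0.3 percent of the information in a 100 ply game. Adding the drawn sixth game to the same formula would pull the performance figure down somewhat, which is the sense in which the answer depends on exactly which games are counted.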
Last modified: Thu, 15 Apr 21 08:11:13 -0700