Computer Chess Club Archives



Subject: Re: rebel 10~!! super strong on amd k62 500

Author: Peter Kappler

Date: 09:56:41 07/29/00


On July 29, 2000 at 02:31:59, Ricardo Gibert wrote:

>On July 28, 2000 at 12:34:06, Peter Kappler wrote:
>
>>On July 28, 2000 at 06:33:57, Ricardo Gibert wrote:
>>
>>>On July 28, 2000 at 02:12:50, Peter Kappler wrote:
>>>
>>>>On July 28, 2000 at 01:23:55, Dann Corbit wrote:
>>>>
>>>>>On July 28, 2000 at 01:15:46, Peter Kappler wrote:
>>>>>
>>>>>>On July 28, 2000 at 00:50:09, Ratko V Tomic wrote:
>>>>>>
>>>>>>>Well, you're unjust to Thorsten. The rating calculations
>>>>>>>extract very little data from each game, about 1.58 bits
>>>>>>>per game (i.e. log2(3)). On the other hand, each ply contains
>>>>>>>about 5-6 bits of data, or for a 100 ply game you have 500
>>>>>>>bits of data produced. Hence the conventional rating tests
>>>>>>>based on the 3-way game result are very highly inefficient,
>>>>>>>they keep about 0.3 percent of info produced in game.
>>>>>>
>>>>>>Why 5-6 bits per ply?  Just enough to represent an approximate evaluation of the
>>>>>>position?
>>>>>>
>>>>>>>
>>>>>>>The advantage of ratings to the more efficient information
>>>>>>>extractors (such as human brain) is that one can compute
>>>>>>>such rating without even knowing how to play chess. Another
>>>>>>>advantage is that they're not biased by human subjective judgment
>>>>>>>(the ratings may manifest other biases which reduce their
>>>>>>>predictive power, especially when extrapolating to a new opponent
>>>>>>>from a small number of earlier opponents). A human chess player
>>>>>>>likely extracts 100 times more info per game than the mechanical
>>>>>>>rating calculator, and the stronger the player the more info he
>>>>>>>can extract.
>>>>>>>
>>>>>><snip>
>>>>>>
>>>>>>
>>>>>>Well said.  I have always felt this way, and seeing the idea explained so
>>>>>>eloquently is comforting in a strange way. :)
>>>>>
>>>>>I don't believe it for a minute.
>>>>>
>>>>>I have seen too many times when someone is completely wrong in their assessments
>>>>>to fall for it.
>>>>
>>>>
>>>>What he says makes more sense if you assume a strong player is making the
>>>>assessments.
>>>>
>>>>I'd venture that a GM can estimate a player's rating to within +/- 200 points by
>>>>just analyzing one game.  I think the success rate would be at least 80%.
>>>>
>>>No. I'm over 2200 USCF and I don't think this is a good way to estimate a
>>>player's ability. There are several reasons why I think this. Some based on
>>>practical experience and some based on my understanding of statistics.
>>>
>>
>>OK, but I'm around 2100 USCF, so I think my opinion counts, too.  :)
>>
>>
>>>I remember playing an A-player in a tournament and he was able to create an
>>>incredible amount of pressure in the middlegame. He kept finding incredible
>>>moves I thought no A-player could find. I was barely able to survive and had
>>>come to the conclusion he was way under-rated and that a draw would be a good
>>>result for me, despite the rating differential.
>>>
>>>I made it to an endgame with meager chances to draw. That was when the "strong
>>>player" vanished and he started to play like a C-player. He didn't blow the game
>>>in one move. He made a series of weak moves to blow the game and I wound up
>>>winning!
>>>
>>
>>Remember that I said I thought the GM's success rate would only be 80% given a
>>one game sample.  You'll always be able to pick an "outlier" game where a player
>>performed well above or below their true strength.  (Though, even in your
>>example above, the guy finally showed his true colors at the end.)
>>
>>>He was 2400 strength in the _particular_ middlegame we played, but only 1500
>>>strength in the endgame. This was a player with _big_ holes in his make-up as a
>>>player. A lot of strong players would have folded up in the middlegame and come
>>>away with the impression this guy was super strong.
>>>
>>>Another possibility is a different kind of middlegame (a closed position) would
>>>have revealed his weaknesses as a player. It all depends on the player.
>>>
>>>>And if you gave him 4 or 5 games to analyze, I'd probably have more faith in the
>>>>GM's estimate than the player's actual rating.  :)
>>>>
>>>No. It is even possible in 4 or 5 games that a player is able to get positions
>>>that complement his playing style and he looks like he can do no wrong. There is
>>>no substitute for an objective assessment using a large number of games against a
>>>_variety_ of players.
>>>
>>
>>Sure, more games is better.  5 games definitely isn't enough if you happen to
>>pick a set of exceptionally good/bad games.  I guess my main point is that for a
>>given sample size, a GM will do a MUCH better job of estimating playing
>>strength than the ELO formula.
>>
>>
>>>A friend of mine, about 2100 strength had a record of 5-0 (slow OTB tournament
>>>play) against IM Kamran Shirazi (2550-2600 strength). Their respective styles
>>>were such that he would beat the crap out of him in every game. Luck had nothing
>>>to do with it. In a sixth game, he was crushing him also, but his habitual time
>>>trouble allowed Shirazi to limp away with a draw. My friend was not a very good
>>>blitz player and spoiled a lot of games in the move 30-40 range.
>>>
>>>I think you must conclude that 4-5 games are not enough or my friend is as
>>>strong as Kasparov. Which is it?
>>>
>>
>>I conclude that 4-5 games isn't always enough, especially when they are not
>>selected randomly.  :)
>>
>>By the way, the ELO system *would* say that your friend is World Champion
>>strength based on those 4-5 games.  A good GM would realize this is not the
>>case.
>>
>Nope! The Elo system says he is 2100. Those games are from his personal record
>against this one particular player. They are from distinct tournaments spread
>out over a couple of years or so. The Elo system says he is 2100 strength and he
>is 2100 strength.


No, the ELO system says he is 2950-3000 for *just* those 4-5 games.  If you
want to compare the accuracy of ELO vs GM, you must restrict the rating
calculation to the same set of games.
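
A rough sketch of the arithmetic behind that figure, assuming the common convention that a perfect score is credited as the opponents' average rating plus 400 points (the logistic Elo formula has no finite answer for a 100% score). The function name and the `cap` parameter are illustrative and not taken from any official rating code:

```python
import math


def performance_rating(opponent_ratings, score, cap=400.0):
    """Estimate a performance rating from a small set of games.

    opponent_ratings: ratings of the opponents faced, one per game
    score: total points scored (win = 1, draw = 0.5, loss = 0)
    cap: assumed maximum rating difference credited for lopsided scores
    """
    n = len(opponent_ratings)
    avg = sum(opponent_ratings) / n
    p = score / n  # fraction of available points scored

    if p >= 1.0:
        diff = cap  # perfect score: logistic inverse is infinite, use the cap
    elif p <= 0.0:
        diff = -cap  # zero score: same problem in the other direction
    else:
        # Invert the Elo expected-score curve p = 1 / (1 + 10 ** (-diff / 400)),
        # then clamp so near-perfect scores never exceed the perfect-score cap.
        diff = 400.0 * math.log10(p / (1.0 - p))
        diff = max(-cap, min(cap, diff))

    return avg + diff


# Five wins out of five against an opponent rated 2550 or 2600
low = performance_rating([2550] * 5, 5.0)   # 2950.0
high = performance_rating([2600] * 5, 5.0)  # 3000.0
```

Under that assumption, five straight wins against a 2550-2600 opponent come out at 2950-3000, which is the range quoted above.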


> It does not matter if the GM agrees or not. It is an objective
>measure. You can't argue against an objective measure. A GM assessment is not an
>objective measure, so it can be and should be argued against. It is not
>scientific. It never will be. Remember, GMs disagree all the time. Which GM is
>right?
>

I guess you missed my point.  I was never arguing that the GM can do more with
4-5 games than ELO can do with several hundred.  I'm just saying that for a
given small sample, the GM extracts much more information per game, and can
therefore produce a better estimate.
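
For reference, the bit counts quoted at the top of the thread can be checked with a few lines of Python. This is only a sketch of that arithmetic: the 5-6 bits per ply figure is Ratko's estimate, and the function name is made up for illustration.

```python
import math


def retained_fraction(plies=100, bits_per_ply=5.0):
    """Fraction of a game's information kept by a win/draw/loss result.

    Assumes the result carries log2(3) bits (three equally likely outcomes)
    and that every ply contributes bits_per_ply bits, as estimated in the
    quoted post.
    """
    bits_in_result = math.log2(3)         # about 1.58 bits
    bits_in_game = plies * bits_per_ply   # 500 bits for 100 plies at 5 bits
    return bits_in_result / bits_in_game


at_5_bits = retained_fraction(100, 5.0)  # roughly 0.32 percent
at_6_bits = retained_fraction(100, 6.0)  # roughly 0.26 percent
```

Both ends of the 5-6 bit range land near the "about 0.3 percent" figure in the quoted post.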

--Peter



>>>Chess playing programs can be similar. The respective opening books can slant
>>>the outcome greatly in one direction or the other. Using the same opening book
>>>does not help, since the types of positions resulting may be limited and slant
>>>things greatly in favor of one program.
>>>
>>>To determine strength accurately, a player, computer or human, needs to be
>>>tested against a random sample from a _population_ of players. This is what a
>>>book on statistics will tell you.
>>>
>>
>>I really do understand the statistics.  Perhaps my examples were a bit extreme,
>>but I also think you missed my main point...
>>
>>--Peter



Last modified: Thu, 15 Apr 21 08:11:13 -0700
