Computer Chess Club Archives


Subject: Re: rebel 10~!! super strong on amd k62 500

Author: Ricardo Gibert

Date: 23:31:59 07/28/00


On July 28, 2000 at 12:34:06, Peter Kappler wrote:

>On July 28, 2000 at 06:33:57, Ricardo Gibert wrote:
>
>>On July 28, 2000 at 02:12:50, Peter Kappler wrote:
>>
>>>On July 28, 2000 at 01:23:55, Dann Corbit wrote:
>>>
>>>>On July 28, 2000 at 01:15:46, Peter Kappler wrote:
>>>>
>>>>>On July 28, 2000 at 00:50:09, Ratko V Tomic wrote:
>>>>>
>>>>>>Well, you're unjust to Thorsten. The rating calculations
>>>>>>extract very little data from each game, about 1.58 bits
>>>>>>per game (i.e. log2(3)). On the other hand, each ply contains
>>>>>>about 5-6 bits of data, or for a 100 ply game you have 500
>>>>>>bits of data produced. Hence the conventional rating tests
>>>>>>based on the 3-way game result are highly inefficient:
>>>>>>they keep about 0.3 percent of the info produced in a game.
>>>>>
>>>>>Why 5-6 bits per ply?  Just enough to represent an approximate evaluation of the
>>>>>position?
>>>>>
>>>>>>
>>>>>>The advantage of ratings over the more efficient information
>>>>>>extractors (such as the human brain) is that one can compute
>>>>>>such a rating without even knowing how to play chess. Another
>>>>>>advantage is that they're not biased by human subjective judgment
>>>>>>(the ratings may manifest other biases which reduce their
>>>>>>predictive power, especially when extrapolating to a new opponent
>>>>>>from a small number of earlier opponents). A human chess player
>>>>>>likely extracts 100 times more info per game than the mechanical
>>>>>>rating calculator, and the stronger the player the more info he
>>>>>>can extract.
>>>>>>
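The arithmetic in the quoted estimate can be checked with a short sketch. It assumes, as the quoted text does, that a game result is one of three equally likely outcomes and that a game runs 100 plies; the figure of 5 bits per ply is the lower end of the quoted 5-6 range, and the variable names are only illustrative.

```python
import math

# Information in a 3-way result (win/draw/loss), assuming equally likely outcomes.
bits_per_result = math.log2(3)

# Figures taken from the quoted post: about 5 bits per ply, 100 plies per game.
bits_per_ply = 5
plies_per_game = 100
bits_per_game = bits_per_ply * plies_per_game

# Share of the game's data that survives in the bare result.
fraction_kept = bits_per_result / bits_per_game

print(f"bits per result: {bits_per_result:.2f}")
print(f"bits per game:   {bits_per_game}")
print(f"fraction kept:   {fraction_kept:.2%}")
```

Using 6 bits per ply instead would lower the retained fraction a little further, so the "about 0.3 percent" figure sits at the generous end of the quoted range.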
>>>>><snip>
>>>>>
>>>>>
>>>>>Well said.  I have always felt this way, and seeing the idea explained so
>>>>>eloquently is comforting in a strange way. :)
>>>>
>>>>I don't believe it for a minute.
>>>>
>>>>I have seen too many times when someone is completely wrong in their assessments
>>>>to fall for it.
>>>
>>>
>>>What he says makes more sense if you assume a strong player is making the
>>>assessments.
>>>
>>>I'd venture that a GM can estimate a player's rating to within +/- 200 points by
>>>just analyzing one game.  I think the success rate would be at least 80%.
>>>
>>No. I'm over 2200 USCF and I don't think this is a good way to estimate a
>>player's ability. There are several reasons why I think this, some based on
>>practical experience and some based on my understanding of statistics.
>>
>
>OK, but I'm around 2100 USCF, so I think my opinion counts, too.  :)
>
>
>>I remember playing an A-player in a tournament and he was able to create an
>>incredible amount of pressure in the middlegame. He kept finding incredible
>>moves I thought no A-player could find. I was barely able to survive and had
>>come to the conclusion he was way under-rated and that a draw would be a good
>>result for me, despite the rating differential.
>>
>>I made it to an endgame with meager chances to draw. That was when the "strong
>>player" vanished and he started to play like a C-player. He didn't blow the game
>>in one move. He made a series of weak moves to blow the game and I wound up
>>winning!
>>
>
>Remember that I said I thought the GM's success rate would only be 80% given a
>one game sample.  You'll always be able to pick an "outlier" game where a player
>performed well above or below their true strength.  (Though, even in your
>example above, the guy finally showed his true colors at the end.)
>
>>He was 2400 strength in the _particular_ middlegame we played, but only 1500
>>strength in the endgame. This was a player with _big_ holes in his make-up as a
>>player. A lot of strong players would have folded up in the middlegame and come
>>away with the impression this guy was super strong.
>>
>>Another possibility is a different kind of middlegame (a closed position) would
>>have revealed his weaknesses as a player. It all depends on the player.
>>
>>>And if you gave him 4 or 5 games to analyze, I'd probably have more faith in the
>>>GM's estimate than the player's actual rating.  :)
>>>
>>No. It is even possible in 4 or 5 games that a player is able to get positions
>>that complement his playing style and he looks like he can do no wrong. There is
>>no substitute for an objective assessment using a large number of games against a
>>_variety_ of players.
>>
>
>Sure, more games is better.  5 games definitely isn't enough if you happen to
>pick a set of exceptionally good/bad games.  I guess my main point is that for a
>given sample size, a GM will do a MUCH better job of estimating playing
>strength than the ELO formula.
>
>
>>A friend of mine, about 2100 strength had a record of 5-0 (slow OTB tournament
>>play) against IM Kamran Shirazi (2550-2600 strength). Their respective styles
>>were such that he would beat the crap out of him in every game. Luck had nothing
>>to do with it. In a sixth game, he was crushing him also, but his habitual time
>>trouble allowed Shirazi to limp away with a draw. My friend was not a very good
>>blitz player and spoiled a lot of games in the move 30-40 range.
>>
>>I think you must conclude that 4-5 games are not enough or my friend is as
>>strong as Kasparov. Which is it?
>>
>
>I conclude that 4-5 games isn't always enough, especially when they are not
>selected randomly.  :)
>
>By the way, the ELO system *would* say that your friend is World Champion
>strength based on those 4-5 games.  A good GM would realize this is not the
>case.
>
Nope! The Elo system says he is 2100. Those games are from his personal record
against this one particular player. They are from distinct tournaments spread
out over a couple of years or so. The Elo system says he is 2100 strength and he
is 2100 strength. It does not matter if the GM agrees or not. It is an objective
measure. You can't argue against an objective measure. A GM assessment is not an
objective measure, so it can be and should be argued against. It is not
scientific. It never will be. Remember, GMs disagree all the time. Which GM is
right?
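
To put rough numbers on this disagreement, here is a minimal sketch using the standard logistic Elo expected-score formula. The ratings 2100 and 2550 are the ones quoted in this thread (taking the lower end for Shirazi). The K-factor of 32 and the function names are assumptions made for illustration only; actual federation formulas use their own K values and adjustments.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """One-game Elo update. k=32 is an assumed K-factor, not a value from the thread."""
    return rating + k * (actual - expected)


friend, shirazi = 2100.0, 2550.0  # ratings quoted in the thread

# Expected score per game for the 2100 player against the 2550 player.
per_game = expected_score(friend, shirazi)
print(f"expected score per game: {per_game:.3f}")

# Apply five wins one at a time, ignoring every other game the player had in that period.
rating = friend
for _ in range(5):
    rating = updated_rating(rating, expected_score(rating, shirazi), actual=1.0)
print(f"rating after five isolated wins: {rating:.0f}")
```

Under these assumptions each win can move the rating by at most K points, so five wins spread over a couple of years shift it only modestly, and the player's other results in the same period count just as much. A performance figure computed from those five games alone is a different quantity from the rating itself.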

>>Chess playing programs can be similar. The respective opening books can slant
>>the outcome greatly in one direction or the other. Using the same opening book
>>does not help, since the types of positions resulting may be limited and slant
>>things greatly in favor of one program.
>>
>>To determine strength accurately, a player, computer or human, needs to be
>>tested against a random sample from a _population_ of players. This is what a
>>book on statistics will tell you.
>>
>
>I really do understand the statistics.  Perhaps my examples were a bit extreme,
>but I also think you missed my main point...
>
>--Peter



Last modified: Thu, 15 Apr 21 08:11:13 -0700
