Computer Chess Club Archives


Subject: Re: rebel 10~!! super strong on amd k62 500

Author: Peter Kappler

Date: 09:34:06 07/28/00


On July 28, 2000 at 06:33:57, Ricardo Gibert wrote:

>On July 28, 2000 at 02:12:50, Peter Kappler wrote:
>
>>On July 28, 2000 at 01:23:55, Dann Corbit wrote:
>>
>>>On July 28, 2000 at 01:15:46, Peter Kappler wrote:
>>>
>>>>On July 28, 2000 at 00:50:09, Ratko V Tomic wrote:
>>>>
>>>>>Well, you're unjust to Thorsten. The rating calculations
>>>>>extract very little data from each game, about 1.58 bits
>>>>>per game (i.e. log2(3)). On the other hand, each ply contains
>>>>>about 5-6 bits of data, or for a 100 ply game you have 500
>>>>>bits of data produced. Hence the conventional rating tests
>>>>>based on the 3-way game result are very highly inefficient,
>>>>>they keep about 0.3 percent of info produced in game.
>>>>
>>>>Why 5-6 bits per ply?  Just enough to represent an approximate evaluation of the
>>>>position?
>>>>
>>>>>
>>>>>The advantage of ratings to the more efficient information
>>>>>extractors (such as human brain) is that one can compute
>>>>>such rating without even knowing how to play chess. Another
>>>>>advantage is that they're not biased by human subjective judgment
>>>>>(the ratings may manifest other biases which reduce their
>>>>>predictive power, especially when extrapolating to a new opponent
>>>>>from a small number of earlier opponents). A human chess player
>>>>>likely extracts 100 times more info per game than the mechanical
>>>>>rating calculator, and the stronger the player the more info he
>>>>>can extract.
>>>>>
>>>><snip>
>>>>
>>>>
>>>>Well said.  I have always felt this way, and seeing the idea explained so
>>>>eloquently is comforting in a strange way. :)
>>>
>>>I don't believe it for a minute.
>>>
>>>I have seen too many times when someone is completely wrong in their assessments
>>>to fall for it.
>>
>>
>>What he says makes more sense if you assume a strong player is making the
>>assessments.
>>
>>I'd venture that a GM can estimate a player's rating to within +/- 200 points by
>>just analyzing one game.  I think the success rate would be at least 80%.
>>
>No. I'm over 2200 USCF and I don't think this is a good way to estimate a
>player's ability. There are several reasons why I think this. Some based on
>practical experience and some based on my understanding of statistics.
>

OK, but I'm around 2100 USCF, so I think my opinion counts, too.  :)


>I remember playing an A-player in a tournament and he was able to create an
>incredible amount of pressure in the middlegame. He kept finding incredible
>moves I thought no A-player could find. I was barely able to survive and had
>come to the conclusion he was way under-rated and that a draw would be a good
>result for me, despite the rating differential.
>
>I made it to an endgame with meager chances to draw. That was when the "strong
>player" vanished and he started to play like a C-player. He didn't blow the game
>in one move. He made a series of weak moves to blow the game and I wound up
>winning!
>

Remember that I said I thought the GM's success rate would only be 80% given a
one game sample.  You'll always be able to pick an "outlier" game where a player
performed well above or below their true strength.  (Though, even in your
example above, the guy finally showed his true colors at the end.)

>He was 2400 strength in the _particular_ middlegame we played, but only 1500
>strength in the endgame. This was a player with _big_ holes in his make-up as a
>player. A lot of strong players would have folded up in the middlegame and come
>away with the impression this guy was super strong.
>
>Another possibility is a different kind of middlegame (a closed position) would
>have revealed his weaknesses as a player. It all depends on the player.
>
>>And if you gave him 4 or 5 games to analyze, I'd probably have more faith in the
>>GM's estimate than the player's actual rating.  :)
>>
>No. It is even possible in 4 or 5 games that a player is able to get positions
>that complement his playing style and he looks like he can do no wrong. There is
>no substitute for an objective assessment using a large number of games against a
>_variety_ of players.
>

Sure, more games is better.  5 games definitely isn't enough if you happen to
pick a set of exceptionally good/bad games.  I guess my main point is that for a
given sample size, a GM will do a MUCH better job of estimating playing
strength than the ELO formula.
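
To put a rough number on how little a handful of results pins down a rating, here is a minimal sketch. It assumes each game is an independent win/loss trial (draws are ignored), uses the usual logistic Elo curve, and propagates the sampling error of the score through that curve. The function name rating_std_error is made up for illustration, and the output is an order-of-magnitude guide, not any federation's actual procedure.

```python
import math

def rating_std_error(p, games):
    """Approximate standard error, in rating points, of a rating
    difference estimated from `games` results with true score rate p.
    Assumption: independent win/loss games, no draws."""
    # Sampling error of the observed score fraction.
    se_score = math.sqrt(p * (1.0 - p) / games)
    # Slope of d = 400 * log10(p / (1 - p)) with respect to p.
    slope = 400.0 / (math.log(10) * p * (1.0 - p))
    return slope * se_score

for n in (5, 50, 500):
    print(n, round(rating_std_error(0.5, n)))
```

Under these assumptions the uncertainty for evenly matched players is on the order of 150 points after 5 games and shrinks only with the square root of the number of games.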


>A friend of mine, about 2100 strength had a record of 5-0 (slow OTB tournament
>play) against IM Kamran Shirazi (2550-2600 strength). Their respective styles
>were such that he would beat the crap out of him in every game. Luck had nothing
>to do with it. In a sixth game, he was crushing him also, but his habitual time
>trouble allowed Shirazi to limp away with a draw. My friend was not a very good
>blitz player and spoiled a lot of games in the move 30-40 range.
>
>I think you must conclude that 4-5 games are not enough or my friend is as
>strong as Kasparov. Which is it?
>

I conclude that 4-5 games isn't always enough, especially when they are not
selected randomly.  :)

By the way, the ELO system *would* say that your friend is World Champion
strength based on those 4-5 games.  A good GM would realize this is not the
case.
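
The arithmetic behind that claim can be sketched as follows. The opponent rating of 2575 is an assumed midpoint of the 2550-2600 range quoted above, the "plus 400 times (wins minus losses) over games" rule is a common approximation rather than an official rating procedure, and both function names are hypothetical.

```python
import math

def linear_performance(opp_avg, wins, losses, games):
    """Rule-of-thumb performance rating:
    average opponent rating + 400 * (wins - losses) / games."""
    return opp_avg + 400.0 * (wins - losses) / games

def logistic_performance(opp_avg, score, games):
    """Performance rating from inverting the Elo expected-score curve
    E = 1 / (1 + 10 ** (-d / 400)). Returns None for a 0% or 100%
    score, where no finite rating difference fits."""
    p = score / games
    if p <= 0.0 or p >= 1.0:
        return None
    return opp_avg + 400.0 * math.log10(p / (1.0 - p))

OPP = 2575  # assumed midpoint of the 2550-2600 estimate

print(linear_performance(OPP, 5, 0, 5))    # 2975.0
print(logistic_performance(OPP, 5.0, 5))   # None: a perfect score is unbounded
print(logistic_performance(OPP, 5.5, 6))   # roughly 2992 with the drawn sixth game
```

Either way the formula lands near 3000 from results alone, which is the point being made here: the number is only as good as the sample behind it.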

>Chess playing programs can be similar. The respective opening books can slant
>the outcome greatly in one direction or the other. Using the same opening book
>does not help, since the types of positions resulting may be limited and slant
>things greatly in favor of one program.
>
>To determine strength accurately, a player, computer or human, needs to be
>tested against a random sample from a _population_ of players. This is what a
>book on statistics will tell you.
>

I really do understand the statistics.  Perhaps my examples were a bit extreme,
but I also think you missed my main point...

--Peter




Last modified: Thu, 15 Apr 21 08:11:13 -0700
