Computer Chess Club Archives



Subject: Re: about statistics, and junior in bilbao, for graham laight

Author: Vasik Rajlich

Date: 06:44:13 10/16/04



On October 16, 2004 at 04:18:06, martin fierz wrote:

>hi graham!
>
>in the last days you suggested that junior seriously underperformed in bilbao
>and even wrote a small program to prove your point. you were quite undeterred by
>all the people saying "too little games" because you were looking at the results
>your simulator gave you. i'd like to explain why your argument is flawed, and i
>will use your little program to do it :-)
>
>let's see, i will take the probabilities to win, draw and lose for the average
>computer player to be 50%, 40% and 10% (that is my 'best estimate' based on the
>actual results).
>
>what do i get:
>DJ won 0 points in 0.02% of the tournaments
>DJ won 0.5 points in 0.18% of the tournaments
>DJ won 1 points in 1.16% of the tournaments
>DJ won 1.5 points in 5.23% of the tournaments
>DJ won 2 points in 14.09% of the tournaments
>DJ won 2.5 points in 24.73% of the tournaments
>DJ won 3 points in 29.12% of the tournaments
>DJ won 3.5 points in 19.21% of the tournaments
>DJ won 4 points in 6.26% of the tournaments
>
>now, the disagreement begins as to what these numbers mean. you are implying
>that the above numbers indicate that DJ has a very, very low probability of
>scoring only 1.5 points. that in itself is quite true, but *every* single result
>is rather unlikely. what you really need to do is compare the most likely
>outcome (scoring 3 points) against the actual outcome (if you believe that the
>underlying winning probabilities are the truth). and NOT compare every single
>result vs 100%!
>
>so: most probable outcome would be all computers score 3 points, with a joint
>probability of this happening being (0.2912)^3 = 0.0247 = 2.5%
>the actual outcome had a probability of (0.1921)^2*(0.0523) = 0.0019 = 0.2%.
>
>these numbers show: the probability of any SINGLE result is very low - even the
>most probable result only happens in 2.5% of all cases. the probability of the
>actual result happening is 13 times smaller. in this sense, if you want to stick
>to your hypothesis that all computers were of similar strength, then this was a
>slightly unusual result. but it was most definitely NOT an improbable result.
>your mistake seems to be that you take the probability of a result occurring,
>and compare it to 1 ("0.2% is very unlikely - 1 in 500"). instead, you have to
>compare it with the probability of the most likely result occurring, and then
>things don't look improbable at all (0.2% vs 2.5% - 1 in 13). did i make this
>point clear enough?
>
>
>now, with all this said and done, the result gets even more likely if you factor
>in the playing strength of the humans. the match was very weird in the sense
>that they had 4 rounds for 3 players each, so one program had to play one human
>twice. bad luck for junior, it had to play topalov twice. he was the
>highest-rated human of the lot, and he just came back from a stunning
>performance at the fide world chess championship. david levy writes
>(http://www.chessbase.com/newsdetail.asp?newsid=1956)
>
>"But whatever the level of preparation of team GM it did not show itself to good
>effect in most of the games, although Topalov appeared to have a much better
>understanding of how computers play chess than did either of his team-mates."
>
>so topalov was the highest-rated + best prepared for this competition according
>to levy (and he knows a thing or two about both chess and computer chess). if i
>take the 1-in-13 chance of the actual result happening, and add that topalov was
>the strongest player on the human side, that will make the actual result more
>probable of course, at least 1-in-10 i would guess compared to a "most likely"
>result. now i don't call that unlikely. do you?
>
>cheers
>  martin

You're right that the argument that "the chance of this result is only 5.23%" is
bogus. The right form of that argument is: the chance of this result (1.5/4) or
less is 0.02% + 0.18% + 1.16% + 5.23% = 6.59%. That's still a pretty low number.
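
For anyone who wants to check the tail figure without a simulator, here is a minimal sketch that computes the score distribution exactly. It assumes four independent games with per-game probabilities of 50% win, 40% draw and 10% loss (the reading that reproduces the simulated table in the quoted post); the function names are made up for illustration.

```python
# Assumed per-game model: win 50%, draw 40%, loss 10% (inferred from the quoted table).
P_WIN, P_DRAW, P_LOSS = 0.5, 0.4, 0.1


def score_distribution(games, p_win=P_WIN, p_draw=P_DRAW, p_loss=P_LOSS):
    """Exact distribution of total points over `games` independent games.

    Keys are doubled scores (2 = one point, 1 = half a point) so that
    the dictionary keys stay integers.
    """
    dist = {0: 1.0}
    for _ in range(games):
        nxt = {}
        for half_points, p in dist.items():
            for gain, q in ((2, p_win), (1, p_draw), (0, p_loss)):
                nxt[half_points + gain] = nxt.get(half_points + gain, 0.0) + p * q
        dist = nxt
    return dist


def prob_at_most(dist, score):
    """Probability of scoring `score` points or fewer."""
    return sum(p for half_points, p in dist.items() if half_points <= 2 * score)


junior = score_distribution(4)
for half_points in sorted(junior):
    print(f"{half_points / 2:>3} points: {junior[half_points]:.2%}")
print(f"P(score <= 1.5) = {prob_at_most(junior, 1.5):.2%}")
```

Under these assumptions the exact tail comes out at about 6.3%, close to the sum of the simulated percentages; the small gap is just sampling noise in the simulation.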

In addition, the chance of Fritz and Hydra scoring 7/8 (or more) given your
estimate of 50%, 40% and 10% would be an even lower number. (In fact I think
your numbers are too optimistic for the humans - unfortunately.)
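
The same kind of check can be sketched for the Fritz and Hydra figure. This assumes the same 50% / 40% / 10% win / draw / loss model and independent games, and it tries two readings of "7/8 (or more)": the two programs pooled over eight games, and each program scoring at least 3.5/4. The function name is hypothetical.

```python
from math import comb

# Assumed per-game model: win 50%, draw 40%, loss 10%.
P_WIN, P_DRAW, P_LOSS = 0.5, 0.4, 0.1


def prob_score_at_least(games, min_points, p_win=P_WIN, p_draw=P_DRAW, p_loss=P_LOSS):
    """Probability of scoring at least `min_points` in `games` independent games."""
    total = 0.0
    for wins in range(games + 1):
        for draws in range(games - wins + 1):
            losses = games - wins - draws
            if wins + 0.5 * draws >= min_points:
                # Number of orderings with this many wins, draws and losses.
                ways = comb(games, wins) * comb(games - wins, draws)
                total += ways * p_win**wins * p_draw**draws * p_loss**losses
    return total


pooled = prob_score_at_least(8, 7)        # Fritz + Hydra combined: >= 7 out of 8
each = prob_score_at_least(4, 3.5) ** 2   # both programs: >= 3.5 out of 4
print(f"pooled >= 7/8:     {pooled:.2%}")
print(f"each   >= 3.5/4:   {each:.2%}")
```

With these inputs the sketch gives roughly 10.5% for the pooled reading and 6.9% for the per-program reading, so how this tail compares with the Junior one depends on which reading and which per-game probabilities are plugged in.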

If you consider Junior to be able to stand up to humans as well as Fritz and
Hydra can, what happened was an extremely anomalous result.

Vas




Last modified: Thu, 15 Apr 21 08:11:13 -0700
