Computer Chess Club Archives


Subject: Re: about statistics, and junior in bilbao, for graham laight

Author: martin fierz

Date: 07:13:53 10/16/04


On October 16, 2004 at 09:44:13, Vasik Rajlich wrote:

>On October 16, 2004 at 04:18:06, martin fierz wrote:
>
>>hi graham!
>>
>>in the last days you suggested that junior seriously underperformed in bilbao
>>and even wrote a small program to prove your point. you were quite undeterred by
>>all the people saying "too little games" because you were looking at the results
>>your simulator gave you. i'd like to explain why your argument is flawed, and i
>>will use your little program to do it :-)
>>
>>let's see, i will take the probabilities to win, draw and lose for the average
>>computer player to be 50%, 40% and 10% (that is my 'best estimate' based on the
>>actual results).
>>
>>what do i get:
>>DJ won 0 points in 0.02% of the tournaments
>>DJ won 0.5 points in 0.18% of the tournaments
>>DJ won 1 points in 1.16% of the tournaments
>>DJ won 1.5 points in 5.23% of the tournaments
>>DJ won 2 points in 14.09% of the tournaments
>>DJ won 2.5 points in 24.73% of the tournaments
>>DJ won 3 points in 29.12% of the tournaments
>>DJ won 3.5 points in 19.21% of the tournaments
>>DJ won 4 points in 6.26% of the tournaments
>>
>>now, the disagreement begins as to what these numbers mean. you are implying
>>that the above numbers indicate that DJ has a very, very low probability of
>>scoring only 1.5 points. that in itself is quite true, but *every* single result
>>is rather unlikely. what you really need to do is compare the most likely
>>outcome (scoring 3 points) against the actual outcome (if you believe that the
>>underlying winning probabilities are the truth). and NOT compare every single
>>result vs 100%!
>>
>>so: most probable outcome would be all computers score 3 points, with a joint
>>probability of this happening being (0.2912)^3 = 0.0247 = 2.5%
>>the actual outcome had a probability of (0.1921)^2*(0.0523) = 0.0019 = 0.2%.
>>
>>these numbers show: the probability of any SINGLE result is very low - even the
>>most probable result only happens in 2.5% of all cases. the probability of the
>>actual result happening is 13 times smaller. in this sense, if you want to stick
>>to your hypothesis that all computers were of similar strength, then this was a
>>slightly unusual result. but it was most definitely NOT an improbable result.
>>your mistake seems to be that you take the probability of a result occurring,
>>and compare it to 1 ("0.2% is very unlikely - 1 in 500"). instead, you have to
>>compare it with the probability of the most likely result occurring, and then
>>things don't look improbable at all (0.2% vs 2.5% - 1 in 13). did i make this
>>point clear enough?
>>
>>
>>now, with all this said and done, the result gets even more likely if you factor
>>in the playing strength of the humans. the match was very weird in the sense
>>that they had 4 rounds for 3 players each, so one program had to play one human
>>twice. bad luck for junior, it had to play topalov twice. he was the
>>highest-rated human of the lot, and he just came back from a stunning
>>performance at the fide world chess championship. david levy writes
>>(http://www.chessbase.com/newsdetail.asp?newsid=1956)
>>
>>"But whatever the level of preparation of team GM it did not show itself to good
>>effect in most of the games, although Topalov appeared to have a much better
>>understanding of how computers play chess than did either of his team-mates."
>>
>>so topalov was the highest-rated + best prepared for this competition according
>>to levy (and he knows a bit about both chess and computer chess). if i
>>take the 1-in-13 chance of the actual result happening, and add that topalov was
>>the strongest player on the human side, that will make the actual result more
>>probable of course, at least 1-in-10 i would guess compared to a "most likely"
>>result. now i don't call that unlikely. do you?
>>
>>cheers
>>  martin
>
>You're right that the argument that "the chance of this result is only 5.23%" is
>bogus. The right form of that argument is: the chance of this result (1.5/4) or
>less is: 0.02% + 0.18% + 1.16% + 5.23%. That's still a pretty low number.

no, no, that is not what i mean. the argument is not about this at all.

[snip]

>If you consider Junior to be able to stand humans as well as Fritz and Hydra,
>what happened was an extremely anomalous result.

no, it's not, that was what that post was all about. it's a 1-in-10 chance or
so. which i would definitely not call "extremely anomalous". would you?

the whole argument is about the following: take a die. roll it 3 times. you
get, for argument's sake, the sequence 1,1,1. you stop and wonder - "wow, this
was really unlikely" - you know some maths, and you go and calculate that the
chance of this thing happening was 1 in 216. so you say "wow, this was really
unlikely, 1 in 216, i proved it with numbers". that is graham's argument
translated to a die, and it is totally wrong, since you cannot do postmortem
analysis of probabilities. ANY 3-number-sequence would have been exactly as
unlikely as 1,1,1. and that is the point.
even if the likelihood of 1,1,1 occurring was only 1 in 216, it was EQUALLY
likely to appear as any other sequence.
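
a minimal python sketch of the counting behind this, assuming a fair six-sided
die and independent rolls (the variable names are only illustrative):

```python
from fractions import Fraction
from itertools import product

# every ordered sequence of three rolls of a fair six-sided die
sequences = list(product(range(1, 7), repeat=3))

# each specific sequence has the same probability: (1/6)^3
p_each = Fraction(1, 6) ** 3

print(len(sequences))  # 216 distinct sequences
print(p_each)          # 1/216 for any single one, including (1, 1, 1)
```

the point is that 1/216 applies to (1, 1, 1) and to every other sequence alike,
so the small number by itself says nothing special about 1,1,1.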

so, even if you assume that the machines all had equal winning chances in bilbao
(wrong, since junior faced the strongest opposition), then what happened in
bilbao had a decent chance to happen, because COMPARED TO THE MOST LIKELY
OUTCOME it was still quite a likely outcome (10 times less likely than the most
likely outcome).
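
the comparison can be sketched in python without a simulator. this assumes
per-game probabilities of win 50%, draw 40%, loss 10% (the assignment that
reproduces the quoted table), 4 independent games per program, and three
programs whose results are independent of each other. the function name
score_distribution is hypothetical, not taken from graham's program:

```python
from itertools import product

# assumed per-game probabilities for a computer, keyed by points scored:
# win = 1 point (50%), draw = 0.5 points (40%), loss = 0 points (10%)
P = {1.0: 0.5, 0.5: 0.4, 0.0: 0.1}

def score_distribution(games=4):
    """Exact probability of each total score over `games` independent games."""
    dist = {}
    for results in product(P, repeat=games):
        score = sum(results)
        prob = 1.0
        for r in results:
            prob *= P[r]
        dist[score] = dist.get(score, 0.0) + prob
    return dist

dist = score_distribution()

# joint probability that all three programs score 3 points (most likely case)
most_likely = dist[3.0] ** 3
# joint probability of the actual result: two programs on 3.5, one on 1.5
actual = dist[3.5] ** 2 * dist[1.5]

ratio = most_likely / actual
print(f"most likely: {most_likely:.4f}, actual: {actual:.4f}, ratio: {ratio:.1f}")
```

with exact probabilities the ratio comes out at roughly 12, in line with the
1-in-13 figure obtained from the simulated percentages. it does not account
for junior facing topalov twice.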

i can't make it any clearer than this, i'm afraid. but at least i tried, and
didn't just say "not enough games", which is of course quite correct :-)

cheers
  martin




