Computer Chess Club Archives


Subject: Re: EVIDENCE That Junior REALLY DID Perform Very Badly In Bilbao

Author: Sandro Necchi

Date: 13:13:48 10/14/04


On October 13, 2004 at 15:51:02, Robert Hyatt wrote:

>On October 13, 2004 at 11:11:47, Graham Laight wrote:
>
>>On October 13, 2004 at 10:55:20, Michael Yee wrote:
>>
>>>On October 13, 2004 at 10:42:08, Graham Laight wrote:
>>>>On October 13, 2004 at 10:33:30, Michael Yee wrote:
>>>
>>>>>have 1 "bad" (or underperforming) tournament out of 20, i.e., with low
>>>>>probability. But the rare event *will* (or could) happen at some point.
>>>>
>>>>Please see the answer I gave in
>>>>http://www.talkchess.com/forums/1/message.html?391399
>>>>
>>>>-g
>>>>
>>>>>Michael
>>>
>>>No offense, but I don't think I understand what your point is. Your simulation
>>
>>My points (made throughout the thread - not just in the previous post in this
>>branch of the thread) are:
>>
>>1. Given the Hydra and Fritz results, the Junior result is unexpectedly low
>
>
>What would you do if you took four humans, and four copies of fritz or hydra and
>played the _same_ event again?  And what would you say if one of the copies of
>Fritz produced 3 draws and a loss?  "It did poorly?"  Or "unexpected random
>chance?"
>
>It is almost a certainty that all 4 copies would _not_ produce the same
>result...
>
Bob,

you are correct, but we are "old-fashioned". I hope you do not get upset; I mean that we
analyze things deeply and try to explain them.

The "modern" way is to give very quick estimations/evaluations based on scores
from a limited number of games and/or events.

This is why one program can drop from the top to the lowest level, and the other way
around, so easily. This happens in sports too.

Of course not everybody thinks this way, but more and more people seem to,
probably because understanding things takes a lot of specific knowledge and
experience, which require a lot of time that people do not have or are not willing
to invest. So it is easier to make fast comments; not so easy to
make deep analysis...

Sandro
>
>>
>>2. The Hydra and Fritz results taken together are an indication of great
>>strength
>>
>>>(or even just a basic probability calculation) shows that a "low" score for an
>>>engine that is assumed to have a certain strength is a rare event. I don't
>>>disagree with that. I'm just confused about what conclusions you're trying to
>>>draw from witnessing a rare event.
>>>
>>>Here's how I might put bilbao in perspective: Suppose we are looking at this
>>>tournament as simply one in a stream of tournaments, and we consider updating
>>>junior's rating (i.e., strength estimate) in a bayesian way. Then junior's past
>>>results would weigh much more heavily than this one new result and the rating
>>>wouldn't change by much.
>>>
>>>What would I conclude? Probably that junior had a (slightly) rare result.
>>
>>The Junior result is probably not too far away from what you'd expect. Perhaps I
>>have been looking in astonishment at the wrong place. Perhaps the astonishment
>>should be focused upon the 7/8 score which Hydra and Fritz achieved - which is
>>highly improbable (I calculated 1/160 in another post in this thread) unless
>>these two computers are substantially better than the opponents that they faced
>>at Bilbao.
>>
>>-g
>>
>>>Michael
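
To put a rough number on the point quoted above, that four identical copies of an engine would almost certainly not all finish on the same score, here is a minimal sketch. It assumes the games are independent and uses made-up per-game win/draw probabilities (`p_win=0.4`, `p_draw=0.5`); these are hypothetical values, not measurements of Fritz, Hydra or Junior, and the function names are only illustrative.

```python
from itertools import product

def score_distribution(p_win, p_draw, games):
    """Exact distribution of the total score over independent games."""
    p_loss = 1.0 - p_win - p_draw
    outcomes = [(1.0, p_win), (0.5, p_draw), (0.0, p_loss)]
    dist = {}
    # Enumerate every win/draw/loss sequence and add up its probability.
    for combo in product(outcomes, repeat=games):
        score = sum(points for points, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        dist[score] = dist.get(score, 0.0) + prob
    return dist

def prob_all_copies_equal(dist, copies):
    """Probability that `copies` independent identical engines tie on score."""
    return sum(p ** copies for p in dist.values())

# Hypothetical per-game probabilities (assumptions, not measured values).
dist = score_distribution(p_win=0.4, p_draw=0.5, games=4)

p_same = prob_all_copies_equal(dist, copies=4)
# Chance that a single copy scores 1.5/4 or less (e.g. 3 draws and a loss).
p_low = sum(p for score, p in dist.items() if score <= 1.5)
# Chance that at least one of the four copies has such a "bad" event.
p_any_low = 1.0 - (1.0 - p_low) ** 4

print(f"all four copies equal: {p_same:.4f}")
print(f"one copy <= 1.5/4:     {p_low:.4f}")
print(f"any copy <= 1.5/4:     {p_any_low:.4f}")
```

Under these assumed numbers, a low score for one given copy is fairly rare, while a low score for at least one of the four copies is much less so. Different assumed probabilities would shift the figures but not that pattern.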

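
The Bayesian view in the quoted text (past results outweigh one new four-game event) can be sketched the same way. This is a simplified Beta-style update in which earlier games act as prior pseudo-counts and each point counts as one "success", so a draw is half a success. The history (120 points from 200 games) and the new result (1.5 from 4) are hypothetical numbers chosen only for illustration; the Elo conversion is the standard logistic formula.

```python
import math

def update_score_rate(prior_points, prior_games, new_points, new_games):
    """Posterior mean score per game, with past games as Beta pseudo-counts."""
    alpha = prior_points + new_points
    beta = (prior_games - prior_points) + (new_games - new_points)
    return alpha / (alpha + beta)

def elo_diff(rate):
    """Rating difference implied by an expected score (standard Elo logistic)."""
    return -400.0 * math.log10(1.0 / rate - 1.0)

# Hypothetical history: 120 points from 200 earlier games (60%).
prior_rate = 120 / 200
# Hypothetical new event: 1.5 points from 4 games.
posterior_rate = update_score_rate(120, 200, 1.5, 4)

rate_shift = posterior_rate - prior_rate
elo_shift = elo_diff(posterior_rate) - elo_diff(prior_rate)

print(f"expected score: {prior_rate:.4f} -> {posterior_rate:.4f}")
print(f"implied Elo shift: {elo_shift:+.1f}")
```

With a long assumed history, the four new games move the estimate by only a few Elo points. A shorter history would make the same result count for more.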

