Computer Chess Club Archives


Subject: Re: I just don't get this ...

Author: Uri Blass

Date: 16:52:14 01/05/04


On January 05, 2004 at 19:23:32, Christophe Theron wrote:

>On January 05, 2004 at 14:41:11, Uri Blass wrote:
>
>>On January 05, 2004 at 13:57:05, Christophe Theron wrote:
>>
>>>On January 05, 2004 at 03:14:48, Uri Blass wrote:
>>>
>>>>On January 04, 2004 at 23:04:55, Christophe Theron wrote:
>>>>
>>>>>On January 04, 2004 at 11:10:41, Bob Durrett wrote:
>>>>>
>>>>>>On January 04, 2004 at 11:00:31, Dan Andersson wrote:
>>>>>>
>>>>>>> I admire your persistence. I guess most of us who have a mathematical
>>>>>>>statistics education got tired of explaining things after the first thread or so.
>>>>>>>
>>>>>>>MvH Dan Andersson
>>>>>>
>>>>>>I, too, have a "mathematical statistics education."
>>>>>>
>>>>>>What bugs me is that all of the CCC bulletins seem to suggest that those who run
>>>>>>and evaluate tournaments look only at the win/loss statistics.  There is
>>>>>>considerably more information in a game score than just the final game result.
>>>>>>
>>>>>>Throwing away useful information is what I call "blind adherence to statistics."
>>>>>> One needs to rise above one's formal education and supplement it with good
>>>>>>thinking.
>>>>>>
>>>>>>: )
>>>>>>
>>>>>>Bob D.
>>>>>
>>>>>
>>>>>
>>>>>The games themselves do not contain more information about the relative strength
>>>>>of the opponents than the bare winning percentage of the winner.
>>>>>
>>>>>That should not be forgotten.
>>>>>
>>>>>
>>>>>
>>>>>    Christophe
>>>>
>>>>No
>>>>
>>>>The games have more information, but the problem is how to interpret them.
>>>>Let me give an extreme example where it is easy to learn from only 2 games
>>>>which program is better.
>>>>
>>>>Suppose that the losing program in both games makes mistakes that a 2-ply
>>>>search can avoid, and not just one mistake.
>>>>
>>>>First it loses a pawn, later it loses a knight, then the queen,
>>>>and finally it is checkmated.
>>>>
>>>>Suppose that it happens in 2 games.
>>>>
>>>>You can say that the winner is better based only on the games but you cannot say
>>>>it based only on the results.
>>>>
>>>>Usually learning from the games is harder, but that does not mean that they
>>>>have no more information than the results.
>>>>
>>>>Uri
>>>
>>>
>>>
>>>No Uri. I could write a program that once in a while, randomly, would throw a
>>>game away by doing only a 2-ply search. Then in the other games it would play
>>>at full strength.
>>>
>>>You would be wrong assuming anything about the program by looking at those 2
>>>lost games.
>>>
>>>You would still have to play a large number of games to assess its real
>>>strength.
>>>
>>>This example is not unreal anyway: it's what happens when a program has a bug
>>>that makes it throw away certain games.
>>
>>It can happen in one game but when it happens in 2 consecutive games then I
>>assume that it is probably not because of a bug.
>
>
>
>You cannot be sure. As I said, it is possible to write a program that will be
>unpredictably weak sometimes. Bugs can simulate this behaviour.
>
>
>
>
>>>The content of the games themselves does not tell you anything more about the
>>>relative playing strength of the opponents than the winning percentage.
>>
>>You can never be 100% sure, but the content of the games can change your
>>confidence in what you believe.
>>
>>>
>>>From a mathematical point of view, it's obvious anyway: the formula to compute
>>>the relative playing strength of two opponents does not take into account the
>>>contents of the games themselves. It just takes into account the number of wins,
>>>draws and losses. Or just the winning percentage.
>>
>>The problem is that it is not simple to have a formula that uses the content of
>>the games, but that does not mean that the content of the games has no meaning.
>>
>>If you add knowledge about pawn endgames without a speed reduction, and you see
>>that the program plays the same in most positions and better in pawn endgames,
>>you can learn that the program is probably better, and testing with thousands
>>of games is simply a waste of time.
>>
>>Uri
>
>
>
>You will never be sure unless you play those thousands of games.
>
>Looking at the contents of the games can change your beliefs or impressions, but
> only the final winning percentage tells you the real elo difference (or a more
>or less accurate estimation of it depending on the confidence interval you
>wish).
>
>I'm not telling you that you should not look at what happens in the games. It is
>very important to look at what happens in order to find what should be improved
>for example. I'm telling you that what happens does not give you more
>information about the elo difference than the winning percentage.
>
>
>
>    Christophe

What happens also gives me more information about the elo difference.

The fact that I do not know how to express it in a mathematical formula does not
mean that there is no more information.

If a new version declares mate when there is no mate, or crashes, or plays stupid
blunders in the first game, then I know that I have a bug that I need to fix and
that it is probably significantly weaker than a stable version.

I do not need to let it play many games and watch it crash a lot of times in
order to understand that.

You can say that I cannot be sure, and that it is possible that the new version
is better and the crash is something rare that almost never happens. I have no
probability model that tells me how likely it is that the new version is better,
but that does not prevent me from saying that I am almost sure that it is weaker.

Uri
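
For reference, the results-only formula that Christophe mentions can be sketched
in a few lines of Python. This is only an illustration, assuming the usual
logistic elo model and a normal approximation for the confidence interval. The
names elo_from_score and elo_interval are made up for this example and are not
taken from any program discussed in this thread.

```python
import math


def elo_from_score(p):
    """Convert an expected score p (between 0 and 1) to an elo difference.

    Assumes the standard logistic elo model. A score of 0 or 1 maps to an
    infinite difference, because the model cannot bound it.
    """
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return -400.0 * math.log10(1.0 / p - 1.0)


def elo_interval(wins, draws, losses, z=1.96):
    """Estimate the elo difference and an approximate confidence interval.

    Uses only the counts of wins, draws and losses, never the moves.
    z=1.96 corresponds to roughly 95% under a normal approximation,
    which is unreliable for very small samples.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("at least one game is required")
    p = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - p) ** 2
           + draws * (0.5 - p) ** 2
           + losses * (0.0 - p) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(p),
            elo_from_score(p - z * se),
            elo_from_score(p + z * se))


if __name__ == "__main__":
    # Same winning percentage, ten times more games: the interval narrows.
    for w, d, l in [(60, 20, 20), (600, 200, 200)]:
        elo, lo, hi = elo_interval(w, d, l)
        print(f"+{w} ={d} -{l}: elo {elo:.0f}, interval [{lo:.0f}, {hi:.0f}]")
```

With only a handful of games the approximation breaks down (a 2-0 result gives a
score of 100% and an unbounded estimate), so a results-only estimate needs many
games before the interval becomes narrow.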



Last modified: Thu, 15 Apr 21 08:11:13 -0700
