Computer Chess Club Archives


Subject: Re: I just don't get this ...

Author: Daniel Clausen

Date: 05:48:26 01/06/04


On January 05, 2004 at 19:52:14, Uri Blass wrote:

>On January 05, 2004 at 19:23:32, Christophe Theron wrote:
>
>>On January 05, 2004 at 14:41:11, Uri Blass wrote:
>>
>>>On January 05, 2004 at 13:57:05, Christophe Theron wrote:
>>>
>>>>On January 05, 2004 at 03:14:48, Uri Blass wrote:
>>>>
>>>>>On January 04, 2004 at 23:04:55, Christophe Theron wrote:
>>>>>
>>>>>>On January 04, 2004 at 11:10:41, Bob Durrett wrote:
>>>>>>
>>>>>>>On January 04, 2004 at 11:00:31, Dan Andersson wrote:
>>>>>>>
>>>>>>>> I admire your persistence. I guess most of us who have a mathematical
>>>>>>>>statistics education got tired of explaining things after the first thread or so.
>>>>>>>>
>>>>>>>>MvH Dan Andersson
>>>>>>>
>>>>>>>I, too, have a "mathematical statistics education."
>>>>>>>
>>>>>>>What bugs me is that all of the CCC bulletins seem to suggest that those who run
>>>>>>>and evaluate tournaments look only at the win/loss statistics.  There is
>>>>>>>considerably more information in a game score than just the final game result.
>>>>>>>
>>>>>>>Throwing away useful information is what I call "blind adherence to statistics."
>>>>>>> One needs to rise above one's formal education and supplement it with good
>>>>>>>thinking.
>>>>>>>
>>>>>>>: )
>>>>>>>
>>>>>>>Bob D.
>>>>>>
>>>>>>
>>>>>>
>>>>>>The games themselves do not contain more information about the relative strength
>>>>>>of the opponents than the bare winning percentage of the winner.
>>>>>>
>>>>>>That should not be forgotten.
>>>>>>
>>>>>>
>>>>>>
>>>>>>    Christophe
>>>>>
>>>>>No
>>>>>
>>>>>The games have more information but the problem is how to interpret them.
>>>>>Let me give an extreme example where it is easy to learn from only 2 games
>>>>>which program is better.
>>>>>
>>>>>Suppose that the losing program makes, in both games, mistakes that a 2-ply
>>>>>search could avoid, and not just one such mistake.
>>>>>
>>>>>First it loses a pawn, later it loses a knight, then the queen,
>>>>>and finally it is checkmated.
>>>>>
>>>>>Suppose that it happens in 2 games.
>>>>>
>>>>>You can say that the winner is better based only on the games but you cannot say
>>>>>it based only on the results.
>>>>>
>>>>>Usually learning from the games is harder, but that does not mean that they
>>>>>do not contain more information than the results.
>>>>>
>>>>>Uri
>>>>
>>>>
>>>>
>>>>No Uri. I could write a program that once in a while, randomly, would throw a
>>>>game away by doing only a 2-ply search. Then in the other games it would play
>>>>at full strength.
>>>>
>>>>You would be wrong assuming anything about the program by looking at those 2
>>>>lost games.
>>>>
>>>>You would still have to play a large number of games to assess its real
>>>>strength.
>>>>
>>>>This example is not unrealistic anyway: it's what happens when a program has a
>>>>bug that makes it throw away certain games.
>>>
>>>It can happen in one game but when it happens in 2 consecutive games then I
>>>assume that it is probably not because of a bug.
>>
>>
>>
>>You cannot be sure. As I said, it is possible to write a program that will be
>>unpredictably weak sometimes. Bugs can simulate this behaviour.
>>
>>
>>
>>
>>>>The content of the games themselves does not tell you anything more about the
>>>>relative playing strength of the opponents than the winning percentage.
>>>
>>>You can never be 100% sure but the content of the games can change the
>>>confidence that you believe in something.
>>>
>>>>
>>>>From a mathematical point of view, it's obvious anyway: the formula to compute
>>>>the relative playing strength of two opponents does not take into account the
>>>>contents of the games themselves. It just takes into account the number of wins,
>>>>draws and losses. Or just the winning percentage.
>>>
>>>The problem is that it is not simple to have a formula that uses the content of
>>>the games, but that does not mean that the content of the games has no meaning.
>>>
>>>If you add knowledge about pawn endgames without any speed reduction, and you
>>>see that the program plays the same in most positions and better in pawn
>>>endgames, you can learn that the program is probably better, and testing with
>>>thousands of games is simply a waste of time.
>>>
>>>Uri
>>
>>
>>
>>You will never be sure unless you play those thousands of games.
>>
>>Looking at the contents of the games can change your beliefs or impressions, but
>> only the final winning percentage tells you the real elo difference (or a more
>>or less accurate estimation of it depending on the confidence interval you
>>wish).
>>
>>I'm not telling you that you should not look at what happens in the games. It is
>>very important to look at what happens in order to find what should be improved
>>for example. I'm telling you that what happens does not give you more
>>information about the elo difference than the winning percentage.
>>
>>
>>
>>    Christophe
>
>What happens also gives me more information about the elo difference.
>
>The fact that I do not know how to express it in a mathematical formula does not
>mean that there is no more information.
>
>If a new version declares mate when there is no mate, or crashes, or plays stupid
>blunders in the first game, then I know that I have a bug that I need to fix and
>that it is probably significantly weaker than a stable version.
>
>I do not need to let it play many games and watch it crash many times in
>order to understand that.
>
>You can say that I cannot be sure, that it is possible that the new version is
>better and the crash is something rare that almost never happens, and that I do
>not have a probability model that tells me how likely it is to be better,
>but that does not prevent me from saying that I am almost sure it is weaker.
>
>Uri

After reading many posts in this thread I'm pretty certain that the discussion
is going nowhere. ;)

Sargon
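
For reference, the kind of formula Christophe mentions, one that takes into account only the number of wins, draws and losses, can be sketched as follows. This is a minimal illustration, assuming the standard logistic Elo model and a normal approximation for the confidence interval. The function names elo_diff and elo_estimate are hypothetical and do not come from any poster's program.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if score <= 0.0:
        return -math.inf
    if score >= 1.0:
        return math.inf
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_estimate(wins, draws, losses, z=1.96):
    """Return (estimate, lower, upper) in Elo for a match result.

    Assumption: games are independent, and the interval comes from a normal
    approximation of the mean score (z=1.96 corresponds to roughly 95%).
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("at least one game is required")
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which is 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return elo_diff(score), elo_diff(score - z * se), elo_diff(score + z * se)


if __name__ == "__main__":
    for result in [(60, 0, 40), (600, 0, 400)]:
        est, lo, hi = elo_estimate(*result)
        print(result, round(est, 1), round(lo, 1), round(hi, 1))
```

With the same 60% score, 100 games give an estimate of roughly +70 Elo with an interval well over a hundred points wide, while 1000 games narrow it to a few dozen points. The normal approximation is unreliable for very small samples or lopsided scores (for example a 2-0 result), so in those cases the output should not be trusted. Note that nothing in this calculation looks at the moves played, which is exactly the point under dispute in the thread.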



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.