Author: Daniel Clausen
Date: 05:48:26 01/06/04
On January 05, 2004 at 19:52:14, Uri Blass wrote:

>On January 05, 2004 at 19:23:32, Christophe Theron wrote:
>
>>On January 05, 2004 at 14:41:11, Uri Blass wrote:
>>
>>>On January 05, 2004 at 13:57:05, Christophe Theron wrote:
>>>
>>>>On January 05, 2004 at 03:14:48, Uri Blass wrote:
>>>>
>>>>>On January 04, 2004 at 23:04:55, Christophe Theron wrote:
>>>>>
>>>>>>On January 04, 2004 at 11:10:41, Bob Durrett wrote:
>>>>>>
>>>>>>>On January 04, 2004 at 11:00:31, Dan Andersson wrote:
>>>>>>>
>>>>>>>> I admire your persistance. I guess most of us that have a mathematical
>>>>>>>>statistics education got tired explaining things after the first thread or so.
>>>>>>>>
>>>>>>>>MvH Dan Andersson
>>>>>>>
>>>>>>>I, too, have a "mathematical statistics education."
>>>>>>>
>>>>>>>What bugs me is that all of the CCC bulletins seem to suggest that those who run
>>>>>>>and evaluate tournaments look only at the win/loss statistics. There is
>>>>>>>considerably more information in a game score than just the final game result.
>>>>>>>
>>>>>>>Throwing away useful information is what I call "blind adherence to statistics."
>>>>>>> One needs to rise above one's formal education and supplement it with good
>>>>>>>thinking.
>>>>>>>
>>>>>>>: )
>>>>>>>
>>>>>>>Bob D.
>>>>>>
>>>>>>The games themselves do not contain more information about the relative strength
>>>>>>of the opponents than the bare winning percentage of the winner.
>>>>>>
>>>>>>That should not be forgotten.
>>>>>>
>>>>>> Christophe
>>>>>
>>>>>No
>>>>>
>>>>>The games have more information but the problem is how to interpret them.
>>>>>Let give an extreme example when it is easy to learn from only 2 games who is
>>>>>better.
>>>>>
>>>>>Suppose that the loser program in both games does mistakes that 2 ply search can
>>>>>avoid and not one mistake.
>>>>>
>>>>>First it is losing a pawn and later it is losing a knight and later the queen
>>>>>and finally it is checkmated.
>>>>>
>>>>>Suppose that it happens in 2 games.
>>>>>
>>>>>You can say that the winner is better based only on the games but you cannot say
>>>>>it based only on the results.
>>>>>
>>>>>Usually learning from the games is harder but it does not mean that they have
>>>>>not more information then the results.
>>>>>
>>>>>Uri
>>>>
>>>>No Uri. I could write a program that once in a while, randomly, would throw a
>>>>game away by doing only a 2-plies search. Then in the other games it would play
>>>>at full strength.
>>>>
>>>>You would be wrong assuming anything about the program by looking at those 2
>>>>lost games.
>>>>
>>>>You would still have to play a large number of games to assess its real
>>>>strength.
>>>>
>>>>This example is not unreal anyway: it's what happens when a program has a bug
>>>>that make it throw away certain games.
>>>
>>>It can happen in one game but when it happens in 2 consecutive games then I
>>>assume that it is probably not because of a bug.
>>
>>You cannot be sure. As I said, it is possible to write a program that will be
>>unpredictably weak sometimes. Bugs can simulate this behaviour.
>>
>>>>The content of the games themselves does not tell you anything more about the
>>>>relative playing strength of the opponents than the winning percentage.
>>>
>>>You can never be 100% sure but the content of the games can change the
>>>confidence that you believe in something.
>>>
>>>>
>>>>From a mathematical point of view, it's obvious anyway: the formula to compute
>>>>the relative playing strength of two opponents does not take into account the
>>>>contents of the games themselves. It just takes into account the number of wins,
>>>>draws and losses. Or just the winning percentage.
>>>
>>>The problem is that it is not simple to have a formula that use the content of
>>>the games but it does not mean that the content of the games has no meaning.
>>>
>>>If you add knowledge about pawn endgames without speed reduction and you see
>>>that the program play the same in most positions and better in pawn endgame you
>>>can learn that the program is probably better and testing by thousands of games
>>>is simply a waste of time.
>>>
>>>Uri
>>
>>You will never be sure unless you play those thousands of games.
>>
>>Looking at the contents of the games can change your beliefs or impressions, but
>> only the final winning percentage tells you the real elo difference (or a more
>>or less accurate estimation of it depending on the confidence interval you
>>wish).
>>
>>I'm not telling you that you should not look at what happens in the games. It is
>>very important to look at what happens in order to find what should be improved
>>for example. I'm telling you that what happens does not give you more
>>information about the elo difference than the winning percentage.
>>
>> Christophe
>
>What happens also give me more information about the elo difference.
>
>The fact that I do not know to express it in a mathematical formula does not
>mean that there is no more information.
>
>If a new version declare mate when there is no mate or crash or play stupid
>blunders in the first game then I know that I have a bug that I need to fix and
>it is probably significantly weaker than a stable version.
>
>I do not need to give it to play many games and watch it crash a lot of times in
>order to understand it.
>
>You can say that I cannot be sure and it is possible that the new version is
>better and the crash is something rare that almost never happens and I have not
>a model of probability that tells me what is the probability that it is better
>but it does not prevent me to say that I am almost sure that is weaker.
>
>Uri

After reading many posts in this thread I'm pretty certain that the discussion
goes no-where. ;)

Sargon
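For readers who want to see what "the formula" in the quoted exchange looks like in practice, here is a rough sketch. It assumes the usual logistic Elo model (a 400-point scale, score fraction mapped to a rating difference) and a simple normal approximation for the confidence interval; nobody in the thread specifies either, so treat both as assumptions. The function names `elo_diff` and `elo_estimate` are made up for illustration.

```python
import math


def elo_diff(score):
    """Rating difference implied by a score fraction in (0, 1).

    Assumes the standard logistic Elo curve: score = 1 / (1 + 10**(-d / 400)).
    """
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_estimate(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) Elo difference from a match result.

    Only the counts of wins, draws and losses are used; the content of
    the games plays no role. The interval uses a normal approximation
    of the mean score (z=1.96 is roughly 95%), which is crude for very
    small samples or scores near 0% or 100%.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp so the logarithm stays defined at 0% and 100% scores.
    eps = 1e-9
    clamp = lambda s: min(max(s, eps), 1.0 - eps)
    return (elo_diff(clamp(score)),
            elo_diff(clamp(score - z * se)),
            elo_diff(clamp(score + z * se)))


if __name__ == "__main__":
    # Same winning percentage, very different sample sizes.
    for result in [(10, 0, 10), (1000, 0, 1000)]:
        est, low, high = elo_estimate(*result)
        print(result, round(est, 1), round(low, 1), round(high, 1))
```

With the same 50% score, the interval from 20 games is far wider than the one from 2000 games, which is the point Christophe makes about needing many games. A 2-0 result gives a 100% score, where this approximation breaks down entirely; the code only clamps it to avoid a math error.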
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.