Author: Uri Blass
Date: 16:52:14 01/05/04
On January 05, 2004 at 19:23:32, Christophe Theron wrote:

>On January 05, 2004 at 14:41:11, Uri Blass wrote:
>
>>On January 05, 2004 at 13:57:05, Christophe Theron wrote:
>>
>>>On January 05, 2004 at 03:14:48, Uri Blass wrote:
>>>
>>>>On January 04, 2004 at 23:04:55, Christophe Theron wrote:
>>>>
>>>>>On January 04, 2004 at 11:10:41, Bob Durrett wrote:
>>>>>
>>>>>>On January 04, 2004 at 11:00:31, Dan Andersson wrote:
>>>>>>
>>>>>>> I admire your persistance. I guess most of us that have a mathematical
>>>>>>>statistics education got tired explaining things after the first thread or so.
>>>>>>>
>>>>>>>MvH Dan Andersson
>>>>>>
>>>>>>I, too, have a "mathematical statistics education."
>>>>>>
>>>>>>What bugs me is that all of the CCC bulletins seem to suggest that those who run
>>>>>>and evaluate tournaments look only at the win/loss statistics. There is
>>>>>>considerably more information in a game score than just the final game result.
>>>>>>
>>>>>>Throwing away useful information is what I call "blind adherence to statistics."
>>>>>> One needs to rise above one's formal education and supplement it with good
>>>>>>thinking.
>>>>>>
>>>>>>: )
>>>>>>
>>>>>>Bob D.
>>>>>
>>>>>The games themselves do not contain more information about the relative strength
>>>>>of the opponents than the bare winning percentage of the winner.
>>>>>
>>>>>That should not be forgotten.
>>>>>
>>>>>    Christophe
>>>>
>>>>No
>>>>
>>>>The games have more information but the problem is how to interpret them.
>>>>Let give an extreme example when it is easy to learn from only 2 games who is
>>>>better.
>>>>
>>>>Suppose that the loser program in both games does mistakes that 2 ply search can
>>>>avoid and not one mistake.
>>>>
>>>>First it is losing a pawn and later it is losing a knight and later the queen
>>>>and finally it is checkmated.
>>>>
>>>>Suppose that it happens in 2 games.
>>>>
>>>>You can say that the winner is better based only on the games but you cannot say
>>>>it based only on the results.
>>>>
>>>>Usually learning from the games is harder but it does not mean that they have
>>>>not more information then the results.
>>>>
>>>>Uri
>>>
>>>No Uri. I could write a program that once in a while, randomly, would throw a
>>>game away by doing only a 2-plies search. Then in the other games it would play
>>>at full strength.
>>>
>>>You would be wrong assuming anything about the program by looking at those 2
>>>lost games.
>>>
>>>You would still have to play a large number of games to assess its real
>>>strength.
>>>
>>>This example is not unreal anyway: it's what happens when a program has a bug
>>>that make it throw away certain games.
>>
>>It can happen in one game but when it happens in 2 consecutive games then I
>>assume that it is probably not because of a bug.
>
>You cannot be sure. As I said, it is possible to write a program that will be
>unpredictably weak sometimes. Bugs can simulate this behaviour.
>
>>>The content of the games themselves does not tell you anything more about the
>>>relative playing strength of the opponents than the winning percentage.
>>
>>You can never be 100% sure but the content of the games can change the
>>confidence that you believe in something.
>>
>>>
>>>From a mathematical point of view, it's obvious anyway: the formula to compute
>>>the relative playing strength of two opponents does not take into account the
>>>contents of the games themselves. It just takes into account the number of wins,
>>>draws and losses. Or just the winning percentage.
>>
>>The problem is that it is not simple to have a formula that use the content of
>>the games but it does not mean that the content of the games has no meaning.
>>
>>If you add knowledge about pawn endgames without speed reduction and you see
>>that the program play the same in most positions and better in pawn endgame you
>>can learn that the program is probably better and testing by thousands of games
>>is simply a waste of time.
>>
>>Uri
>
>You will never be sure unless you play those thousands of games.
>
>Looking at the contents of the games can change your beliefs or impressions, but
> only the final winning percentage tells you the real elo difference (or a more
>or less accurate estimation of it depending on the confidence interval you
>wish).
>
>I'm not telling you that you should not look at what happens in the games. It is
>very important to look at what happens in order to find what should be improved
>for example. I'm telling you that what happens does not give you more
>information about the elo difference than the winning percentage.
>
>    Christophe

What happens in the games does give me more information about the Elo difference. The fact that I do not know how to express it in a mathematical formula does not mean that the information is not there.

If a new version declares mate when there is no mate, crashes, or plays stupid blunders in the first game, then I know that I have a bug that I need to fix and that the new version is probably significantly weaker than a stable version. I do not need to let it play many games and watch it crash many times in order to understand that.

You can say that I cannot be sure, and that it is possible that the new version is better and the crash is something rare that almost never happens. I have no probability model that tells me how likely it is to be better, but that does not prevent me from saying that I am almost sure that it is weaker.

Uri
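
For reference, the result-only calculation that Christophe refers to (wins, draws and losses in, Elo difference and confidence interval out) might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the standard logistic Elo model and a normal approximation for the interval, and the names `elo_from_score` and `elo_estimate` are made up for this example. Nothing in it looks at the moves of the games, which is exactly the point under dispute.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    if score <= 0.0:
        return float("-inf")
    if score >= 1.0:
        return float("inf")
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_estimate(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) Elo difference from match results only.

    Assumption: the interval uses a normal approximation of the mean
    score, which is unreliable for very small samples.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    low = elo_from_score(max(0.0, score - z * se))
    high = elo_from_score(min(1.0, score + z * se))
    return elo_from_score(score), low, high


if __name__ == "__main__":
    for result in [(2, 0, 0), (12, 4, 4), (120, 40, 40)]:
        est, low, high = elo_estimate(*result)
        print(result, est, low, high)
```

With only two wins and nothing else, `elo_estimate(2, 0, 0)` returns infinity for all three values: the result-only formula gives no finite estimate in the two-game example discussed above, and the interval only becomes narrow as the number of games grows.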
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.