Author: Robert Hyatt
Date: 20:18:35 11/24/03
On November 24, 2003 at 18:55:18, Sune Fischer wrote:

>On November 24, 2003 at 14:07:27, Robert Hyatt wrote:
>
>>>The paper is a bit mathematical, but the fact that the likelihood does not
>>>depend on the number of draws can be explained intuitively rather easily:
>>>imagine a game called "chess+" where no draw is possible: each time a game is
>>>drawn, the two players start over from the initial position until one player
>>>wins. Draws are not counted. For the exact same sequence of games, depending on
>>>whether you consider they play chess or chess+, the score will be 1006-1000 or
>>>6-0. Obviously, the likelihood that one is better than the other is the same.
>>
>>I _totally_ disagree with that. Say we play tennis matches, with no tie-breaks.
>>We play 1000 sets and they all end 6-6. Then I win the 1001st set. You really
>>conclude that provides no more information about our skills than a single game
>>that ends 5-7? The 1000 ties suggest a _lot_ about how close we are while
>>the 1 set says very little.
>>
>>Draws count. That's why the Elo formula specifically includes draws in the
>>calculation...
>
>You are looking at it the wrong way.
>The question we want to answer is "who is better", not "how much better" or any
>other related question.
>
>Given the answer we seek you must admit that the draws give us no information.
>In fact, it doesn't matter how high the probability of a draw is, because we
>care only about the probability of winning or losing.
>
>Whether we get 2% draws or 98% draws says nothing about what happens in the
>remaining 98% respectively 2% of the games, and that, *only that*, is what we
>are interested in.

That's a problem, IMHO. I.e., I get sick and lose one set. Am I _really_ worse,
when we have played 1000 sets all to draws?

>
>>>Of course, this is true only if the hypotheses are true: games are independent
>>>random events, and the prior is uniform (which is reasonable in comp-comp
>>>matches without learning).
>>
>>
>>This seems to fail for sampling theory. You have 1006 games to choose from.
>>Choose N random samples of 6 games. You will be convinced that the two players
>>are very close, even though one won 6 more games than the other. But most of
>>your random samples just get draws. Now take your 6 won games by themselves.
>>The only 6-game sample you can take is 6-0, which suggests that the 6-side is
>>way better.
>
>If you consider the Elo rating you must have knowledge of the entire
>distribution, which would include knowledge of draws; however, that is not the
>object.
>
>>If all you care about is "who is better" then omitting the draws makes
>>some kind of sense, but it doesn't give any idea _how_ much better one
>>is than the other. 500 rating points or .001 rating points. I believe
>>that is important information.
>
>Actually, this isn't that important for incremental improvements.
>You make a new version of your engine, the primary question is "is it better or
>worse?".
>Secondary is "how much better is it?", but actually we can live without
>answering that at all, your new version is better so scrap the old and continue
>development on this one.
>
>> Particularly since we are dealing with
>>humans and computers that can "get sick". Suppose on a normal day we
>>can only draw, but I get sick and lose 6 in a row. You conclude you
>>are better. You are wrong. The 1000 draws are much more representative
>>of how we compare than the 6 wins/losses, in this case.
>
>You are mixing up the two questions because you feel that being 0.001 better is
>being equal, and it isn't in a mathematical sense.

If we played at the same level _every_ set, game or match, I'd agree. But
humans don't do that. With 1000 draws and 1 win I would _not_ say the person
with the 1 win is better, in any way...

>
>-S.
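
The two questions in this exchange ("who is better" versus "how much better") can be put side by side numerically. What follows is only a minimal sketch, assuming the hypotheses quoted above (independent games, uniform prior), taken here to mean a uniform Dirichlet prior over the win/draw/loss probabilities. That reading of "uniform", and the function names prob_better and elo_diff, are illustrative assumptions and not from the thread or the paper. Under that prior, the posterior probability that the win probability exceeds the loss probability depends only on the wins and losses, while the usual logistic Elo estimate from the score fraction does change with the number of draws.

```python
from math import comb, log10


def prob_better(wins: int, losses: int) -> float:
    """Posterior P(p_win > p_loss) for one player.

    Assumption (not from the thread): uniform Dirichlet prior over
    (p_win, p_draw, p_loss) and independent games. Under that prior,
    p_win / (p_win + p_loss) follows Beta(wins + 1, losses + 1), so the
    draw count does not appear at all. For integer parameters the Beta
    tail equals a binomial sum with success probability 1/2.
    """
    n = wins + losses + 1
    return sum(comb(n, k) for k in range(wins + 1)) / 2 ** n


def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by the score fraction (a draw counts 0.5).

    Uses the standard logistic expectation
    E = 1 / (1 + 10 ** (-D / 400)), solved for D.
    A 100% or 0% score gives an unbounded estimate.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score == 1.0:
        return float("inf")
    if score == 0.0:
        return float("-inf")
    return -400 * log10(1 / score - 1)


if __name__ == "__main__":
    # Same wins and losses, with and without the 1000 draws.
    print(prob_better(6, 0))       # draws never enter this function
    print(elo_diff(6, 1000, 0))    # draws counted: a very small gap
    print(elo_diff(6, 0, 0))       # draws dropped: unbounded estimate
```

With 6 wins, 1000 draws and 0 losses, prob_better gives 127/128 (about 0.992) whether or not the draws are counted, while elo_diff gives roughly 2 Elo with the draws and an unbounded value without them. Note that the sketch keeps the independence assumption, which is exactly what the "I get sick and lose 6 in a row" objection disputes; it says nothing about that case.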
Last modified: Thu, 15 Apr 21 08:11:13 -0700