Computer Chess Club Archives




Subject: Re: Proving something is better

Author: Peter Fendrich

Date: 15:18:43 12/21/02


On December 21, 2002 at 16:51:59, Uri Blass wrote:

>On December 21, 2002 at 16:39:01, Peter Fendrich wrote:
>>On December 20, 2002 at 20:01:17, Uri Blass wrote:
>>>On December 20, 2002 at 19:56:58, Peter Fendrich wrote:
>>>>On December 20, 2002 at 19:30:38, Uri Blass wrote:
>>>>>On December 20, 2002 at 19:07:04, Peter Fendrich wrote:
>>>>>>On December 20, 2002 at 12:16:25, Uri Blass wrote:
>>>>>>>On December 20, 2002 at 11:03:14, Peter Fendrich wrote:
>>>>>>>>On December 20, 2002 at 04:10:35, Rémi Coulom wrote:
>>>>>>>>>On December 19, 2002 at 19:28:01, Peter Fendrich wrote:
>>>>>>>>>>Some 15-20 years ago I wrote a couple of articles in the Swedish "PLY" that
>>>>>>>>>>later became the basis for the SSDF testing.
>>>>>>>>>>A year or so ago you posted a question about how to interpret results with very
>>>>>>>>>>few games. In another thread I posted a new theory for this as an answer
>>>>>>>>>>"Match results - a complete(!) theory (long)".
>>>>>>>>>>I also made a program to use for this that can be found at Dann's ftp site.
>>>>>>>>>Hi Peter,
>>>>>>>>>If you had not noticed it, you can take a look at a similar program I have
>>>>>>>>>Basically, I started with the same theory as you did, but I went a bit farther
>>>>>>>>>in the calculations. In particular, I proved that the result does not depend on
>>>>>>>>>the number of draws, which is intuitively obvious once you really think about
>>>>>>>>>it. I also found a more efficient way to estimate the result. I checked the
>>>>>>>>>results of my program against yours and found that they agree.
>>>>>>>>For me it's not so obvious that you can throw the draws out.
>>>>>>>>I just took a short look at your paper and maybe I misunderstood some of it.
>>>>>>>>Take this example: A beats B 10-0.
>>>>>>>>Compare that with: A beats B 10-0 with an additional 90 draws.
>>>>>>>>Not counting the draws will give erroneous results.
>>>>>>>>The results of our programs shouldn't agree, I think, because I rely heavily
>>>>>>>>on the trinomial distribution (win/draw/lose). One can use the binomial
>>>>>>>>function (win/lose) and add 0.5 to both n1 and n0 for draws. That will probably
>>>>>>>>give a fairly good approximate value but the only correct distribution is the
>>>>>>>If the target is only to find which program is better, we can throw out the draws.
>>>>>>>You can imagine the following game chessa:
>>>>>>>One game of chessa includes at least one game of chess.
>>>>>>>chessa is finished only when a chess game is finished in a win.
>>>>>>>if a chess game that is played as part of chessa is finished in a draw then
>>>>>>>chessa continues and the sides play chess with opposite colors.
>>>>>>>By these rules in both cases the winner won 10 games of chessa with no draws
>>>>>>>(draw in chessa cannot happen).
>>>>>>In that case you don't need anything more than the result.
>>>>>>What I'm doing is producing a statement like:
>>>>>>A is better than B with a probability of x%.
>>>>>>The 10-0 result will raise x very high but the 55-45 result will lower the
>>>>>>probability even if A is still regarded as the better one.
>>>>>If the 55-45 is the result of 90 draws, then 55-45 gives the same probability
>>>>>that the winner is better as 10-0.
>>>>>The draws are only relevant for estimating the difference in rating, not for
>>>>>deciding which player is better.
>>>>That is essentially the same thing. Different estimates of rating give
>>>>different probabilities of A beating B. The two are closely related.
>>>>If the ratings change, the probabilities should change too.
>>Do you really mean that increased rating diff doesn't mean increased probability
>>that A beats B and vice versa?
>difference in rating does not give probabilities for a draw.

Of course it does: big difference, low probability of a draw. Small difference,
higher probability of a draw.

>>>It is not the same. Suppose player A beats B 1000-0 with 999999000 draws:
>>>you are going to have no doubt that A is better, but if the result is
>>>500000500-499999500 with no draws then it is clearly possible that the results
>>>are random.
>>First: Then you are saying that draws have something to do with it.
>>Second: That is not the full answer (A is better than B). A is probably better
>>than B, we know that. The question is how confident we are in saying that.
>>That confidence changes if the ratings do. I would say that it's more
>>likely that a 2800 rated player will beat you than a 1800 rated player.
>>Wouldn't you?
>>Third: "no doubt" doesn't mean anything measurable. There is always a doubt,
>>even if it's small.
>I meant to say that practically the doubt is very small when you see 1000-0
>with 999999000 draws, so practically you can be sure.

I can't see how that proves your point. 1000-0 is a very big number and so are
all those draws.
You're right that 1000-0 with 999999000 draws is more in favour of A than the
result 500000500-499999500 without draws, but that doesn't tell me anything about
removing the draws and looking at only the result 1000-0, and that's what the
question is about (removing all the draws).

If you agree with me that a draw result will affect the rating then you are also
saying that it affects the probability that A is better than B. The rating is
all about what probability we have for A being better than B.
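
To put numbers on the two positions, here is a minimal Python sketch. It is not
the code of either program discussed above; the function name prob_a_better and
the uniform prior are assumptions made for illustration. Under a uniform prior
on the (win, draw, loss) probabilities, A's share of the decisive games has a
Beta(wins+1, losses+1) posterior regardless of the draw count, so dropping the
draws leaves 10-0 as 10-0. Counting each draw as half a win and half a loss (the
binomial approximation mentioned earlier in the thread) turns 10-0 with 90 draws
into 55-45 instead.

```python
from math import comb


def prob_a_better(wins: int, losses: int) -> float:
    """P(p_win > p_loss | data), assuming a uniform prior (an assumption here).

    With that prior, p_win / (p_win + p_loss) is Beta(wins+1, losses+1).
    For integer a, b: P(Beta(a, b) > 1/2) = P(Binomial(a+b-1, 1/2) <= a-1).
    """
    n = wins + losses + 1  # a + b - 1 with a = wins + 1, b = losses + 1
    return sum(comb(n, j) for j in range(wins + 1)) / 2 ** n


draws_dropped = prob_a_better(10, 0)   # 10-0, the 90 draws ignored
half_points = prob_a_better(55, 45)    # 90 draws counted as 45 points each
print(draws_dropped, half_points)
```

The first value is 2047/2048, about 0.9995; the second is noticeably lower. That
gap is the disagreement in this thread: whether the draws belong in the
calculation at all.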


>>>If you throw a fair coin 1000000000 times then in most cases the
>>>difference between the number of heads and the number of tails is going to be
>>>more than 1000.
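
The coin claim quoted above can be checked with a normal approximation. This is
a minimal sketch, with prob_gap_exceeds as a made-up name: for n fair flips,
heads minus tails has mean 0 and standard deviation sqrt(n), about 31623 for
n = 1000000000, so a gap of 1000 is only about 0.03 standard deviations.

```python
from math import erf, sqrt


def prob_gap_exceeds(n_flips: int, gap: float) -> float:
    """Normal approximation to P(|heads - tails| > gap) for a fair coin."""
    sd = sqrt(n_flips)  # heads - tails = 2*heads - n has variance n
    z = gap / sd
    # For a standard normal Z, P(|Z| > z) = 1 - erf(z / sqrt(2)).
    return 1.0 - erf(z / sqrt(2))


p = prob_gap_exceeds(1_000_000_000, 1000)
print(p)
```

Under this approximation the probability comes out at roughly 0.97, so a gap of
1000, as in 500000500-499999500, is well within what chance alone produces.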


Last modified: Thu, 07 Jul 11 08:48:38 -0700
