Computer Chess Club Archives

Subject: Re: Shredder crushing Chess Tiger.

Author: Dieter Buerssner
Date: 12:55:37 12/15/03
On December 15, 2003 at 15:26:11, Andrew Dados wrote:

> If one program is better by 100 elo, what is the chance of a draw in a single
>game (and consequently what is the w/d/l distribution)? A simple model assumes
>this should not depend on their average strength, yet in practice it makes a big
>difference (of course, more draws as the players' strength increases). Also,
>your note about the score being biased towards white adds some complexity.
>
> Since we have no idea what the expected distribution of w/d/l is (you assumed
>1/3 each), we can't correctly predict win/lose chances. Could you some day
>possibly rerun your simulation with a different w/d/l distribution (but yielding
>the same rating difference)? I am curious how stable the numbers in that table
>are...
>
>My very simple simulation:
>For program A better than B by 100 elo the expected score is 0.69. Let's play a
>10 game match (100 000 times):
>
>a) assuming a win chance of 0.59 and a draw chance of 0.2:
>A wins 89.5% of matches, draws 5.0% and loses 5.4%
>
>b) assuming a win chance of 0.49 and a draw chance of 0.4 (so the same expected
>score):
>A wins 93.4% of matches, draws 3.9% and loses 2.5%
>
>While I still have no idea what the real chance of a draw between those
>programs would be, I can say it greatly influences our expected score table
>(even the error column)...
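
For reference, the quoted 10-game experiment can be sketched in a few lines of
Python. This is only a minimal sketch, not the program used for the numbers
above: the function name simulate_matches, the seed, and the use of Python's
built-in generator are my own assumptions. It prints the fractions of matches
that A wins, draws and loses for both w/d/l assumptions.

```python
import random

def simulate_matches(p_win, p_draw, games=10, matches=100_000, seed=1):
    """Return the fractions of matches that A wins, draws and loses.

    p_win and p_draw are A's per-game win and draw probabilities; the loss
    probability is whatever remains. Hypothetical helper, for illustration.
    """
    rng = random.Random(seed)
    won = drawn = lost = 0
    for _ in range(matches):
        diff = 0  # A's game wins minus B's game wins in this match
        for _ in range(games):
            r = rng.random()
            if r < p_win:
                diff += 1
            elif r >= p_win + p_draw:
                diff -= 1
            # otherwise the game is a draw and diff is unchanged
        if diff > 0:
            won += 1
        elif diff == 0:
            drawn += 1
        else:
            lost += 1
    return won / matches, drawn / matches, lost / matches

# Both cases give the same expected score per game (p_win + p_draw / 2 = 0.69).
for p_win, p_draw in ((0.59, 0.2), (0.49, 0.4)):
    w, d, l = simulate_matches(p_win, p_draw)
    print(f"win={p_win} draw={p_draw}: A wins {w:.3f}, draws {d:.3f}, loses {l:.3f}")
```

Only the draw rate differs between the two runs, so any change in the match
outcome fractions comes from the w/d/l split alone.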

I also use Monte Carlo simulation to judge the outcome of a match. I use
separate probabilities for Player A winning, drawing and losing as White and as
Black, and I take these probabilities from an actual former match (typically,
Player A is my engine, which has been changed, and Player B stays the same).
With these I calculate the expected distribution of the results (by Monte Carlo
simulation - I guess it would also be possible analytically, but that is more
effort for my brain :-). Being optimistic, assume the new version of Player A
got a better result, say 115-85 after the old version's 105-95. Now I look at
the probability of a result >= 115-85 in the distribution simulated from the old
105-95 result. If this is rather low, I assume the new Player A is better. I am
aware that this is not totally sound, but I think it gives a good impression.
If that probability of 115-85 or better were 10% (I did not simulate these
numbers, I just made them up for the argument), I would not conclude that I have
90% confidence, in the typical statistical terms, that it really was better.
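
The procedure above can be sketched roughly like this in Python. It is a
minimal sketch under stated assumptions, not the program mentioned below: the
split of the old 105-95 result into wins, draws and losses by colour is
invented for illustration (only the totals appear above), the names
OLD_AS_WHITE, OLD_AS_BLACK, simulate_score and tail_probability are
hypothetical, and Python's built-in generator stands in for a hand-written
PRNG.

```python
import random

# Hypothetical W/D/L counts for Player A in the old 200-game match (105-95).
# Only the totals are known; this split by colour is made up for illustration.
OLD_AS_WHITE = (45, 30, 25)  # 45 + 30/2 = 60 points from 100 games
OLD_AS_BLACK = (30, 30, 40)  # 30 + 30/2 = 45 points from 100 games

def simulate_score(counts, rng):
    """Replay sum(counts) games using the W/D/L frequencies in counts."""
    wins, draws, losses = counts
    n = wins + draws + losses
    p_win, p_draw = wins / n, draws / n
    score = 0.0
    for _ in range(n):
        r = rng.random()
        if r < p_win:
            score += 1.0
        elif r < p_win + p_draw:
            score += 0.5
    return score

def tail_probability(threshold, trials=10_000, seed=1):
    """Estimate P(simulated match score >= threshold) under the old rates."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = simulate_score(OLD_AS_WHITE, rng) + simulate_score(OLD_AS_BLACK, rng)
        if total >= threshold:
            hits += 1
    return hits / trials

# How often would the old version score 115 or more out of 200 by luck alone?
print(tail_probability(115.0))
```

A small printed value says that the old rates rarely produce 115-85 or better.
As noted above, that is an impression and not a confidence level for the new
version being stronger.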

I have posted the source code of a similar program here. If you are interested,
I can post it, or send it by mail. Because of various options and a PRNG that
takes up quite a few lines, the source is not so short, so perhaps email would
be better.

Regards,
Dieter
Last modified: Thu, 15 Apr 21 08:11:13 -0700