Computer Chess Club Archives


Subject: Re: SSDF(Shredder4-Nimzo7.32) AMD K6-2 450 23-17

Author: blass uri

Date: 17:08:02 03/20/00


On March 20, 2000 at 19:04:47, Christophe Theron wrote:

>On March 20, 2000 at 17:28:37, blass uri wrote:
>
>>On March 20, 2000 at 13:56:59, Christophe Theron wrote:
>>
>>>On March 20, 2000 at 06:46:37, Bertil Eklund wrote:
>>>
>>>>On March 19, 2000 at 22:26:49, Christophe Theron wrote:
>>>>
>>>>>On March 19, 2000 at 15:41:30, Bertil Eklund wrote:
>>>>>
>>>>>>
>>>>>>Hi!
>>>>>>
>>>>>>A very impressive result from Shredder4.
>>>>>>
>>>>>>IMO Shredder plays very well positionally and is excellent in endgames.
>>>>>>Nimzo is a bit stronger tactically.
>>>>>>
>>>>>>Shredder4 used all 4 Turbo-CDs.
>>>>>>
>>>>>>Bertil
>>>>>
>>>>>
>>>>>Bertil,
>>>>>
>>>>>I am not sure this message is going to be well accepted. So let me first state
>>>>>that I have the greatest respect for your work and the SSDF.
>>>>>
>>>>>Let me also state that I have a lot of respect for Nimzo, Shredder, and their
>>>>>respective authors.
>>>>Yes, they are all great programs.
>>>>
>>>>
>>>>>However, I can only strongly disagree with your sentence "a very impressive
>>>>>result from Shredder4".
>>>>
>>>>57.5% against a program known as one of the best at tournament time controls
>>>>impressed me, at least. I am only talking about this 40-game match. Maybe it loses to
>>>>Tiger in the next match, but that is another match.
>>>
>>>
>>>Maybe Tiger loses, actually I do not know.
>>>
>>>But 57.5% must be taken with a statistical grain of salt. From the statistical
>>>data I have, and I'm open to discussion about this, in a 40-game match you can
>>>expect the error margin to be +/- 8.0% if you want 80% confidence.
>>
>>1) If you assume a probability of 50% for a win and 50% for a loss between equal
>>players, and assume that the colours of the players are not relevant, the standard
>>error is
>>sqrt(0.5*0.5*40)=sqrt(10)>3.1 points, so about 3.2 points is close to the standard
>>error.
>>
>>3.2/40=8%, so in this case the error margin is really +/- 8.0%
>
>
>I was assuming 1/3 wins, 1/3 draws, 1/3 losses.

If this is the case, then the standard error is sqrt(2/3*10)<2.6 points, where
2/3*10 is the variance in 40 games (1/6 per game).

2.6/40=6.5%, but this assumption is not realistic because it does not consider the
fact that White has better chances.
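
For anyone who wants to check these figures without doing them in their head, here is a minimal sketch. The function name `match_sd` is mine, and it assumes independent games with the same win/draw/loss probabilities in every game and no colour effect, which is exactly the simplification discussed here.

```python
from math import sqrt

def match_sd(p_win, p_draw, p_loss, games=40):
    """Standard deviation of one side's total score, in points.

    Assumes independent games with identical probabilities in each game
    (an assumption of this sketch; colours are ignored).
    """
    assert abs(p_win + p_draw + p_loss - 1.0) < 1e-9
    mean = p_win * 1.0 + p_draw * 0.5          # expected score in one game
    var = (p_win * (1.0 - mean) ** 2
           + p_draw * (0.5 - mean) ** 2
           + p_loss * (0.0 - mean) ** 2)       # variance in one game
    return sqrt(var * games)

for name, probs in [("no draws", (0.5, 0.0, 0.5)),
                    ("1/3 each", (1 / 3, 1 / 3, 1 / 3)),
                    ("60% draws", (0.2, 0.6, 0.2))]:
    sd = match_sd(*probs)
    print(f"{name}: {sd:.2f} points = {100 * sd / 40:.1f}% of 40 games")
```

Under these assumptions the three cases come out at roughly 3.16, 2.58 and 2.00 points, that is about 7.9%, 6.5% and 5.0% of a 40-game match.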
>
>
>
>
>>2) If you assume a probability of 20% for a win, 20% for a loss and 60% for a draw
>>between equal players (colours are not relevant), the standard error is:
>>sqrt(0.4*0.5*0.5*40)=sqrt(4)=2
>>
>>where 0.4*0.5*0.5 is the variance in one game,
>>0.4*0.5*0.5*40 is the variance in 40 games,
>>and I take the square root of it to calculate the standard error.
>>
>>In this case the standard error is only 5%.
>>I think this assumption assumes more draws than there are between computers.
>>
>>3) If you assume 40% for White, 30% for a draw and 30% for Black between equal players,
>>then the variance in one game is
>>0.4*0.45*0.45+0.3*0.05*0.05+0.3*0.55*0.55=0.4*0.2025+0.3*0.0025+0.3*0.3025=
>>0.1725
>>
>>In this case the variance in 40 games is 0.1725*40=7.1 and the standard
>>deviation is sqrt(7.1)<2.7
>>
>>2.7/40=6.75% and the standard deviation is +-6.7%
>
>
>Isn't it closer to 6.8?

No, because 2.7/40 is slightly bigger than the standard deviation.

I was wrong in my calculation (I used my head and not a computer, and I see now
that 0.1725*40=6.9 and not 7.1).

I even have sqrt(6.9)<2.65 and 2.65/40=6.625%, so the standard deviation is
about +-6.6%.
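
The corrected case 3 figures can be reproduced with the sketch below. It assumes each program plays 20 games with each colour and that games are independent. The 80% margin uses a normal approximation, which is my assumption about how such a margin would be derived, not something stated in this thread.

```python
from math import sqrt
from statistics import NormalDist

# Case 3: White wins 40%, draw 30%, Black wins 30% (figures from the post).
P_WHITE, P_DRAW, P_BLACK = 0.40, 0.30, 0.30

def game_variance(p_win, p_draw, p_loss):
    """Variance of one player's score in a single game."""
    mean = p_win + 0.5 * p_draw
    return (p_win * (1 - mean) ** 2 + p_draw * (0.5 - mean) ** 2
            + p_loss * (0 - mean) ** 2)

var_as_white = game_variance(P_WHITE, P_DRAW, P_BLACK)
var_as_black = game_variance(P_BLACK, P_DRAW, P_WHITE)

games = 40  # assumed split: 20 games with each colour
match_var = (games // 2) * var_as_white + (games // 2) * var_as_black
sd_points = sqrt(match_var)
sd_percent = 100 * sd_points / games

# Two-sided 80% interval under a normal approximation: z is about 1.28.
z80 = NormalDist().inv_cdf(0.90)
print(f"variance per game: {var_as_white:.4f} (white), {var_as_black:.4f} (black)")
print(f"match variance: {match_var:.2f}, sd: {sd_points:.3f} points = {sd_percent:.2f}%")
print(f"80% margin: +/- {z80 * sd_percent:.1f}%")
```

If I read the numbers correctly, one standard deviation is about 6.57% and the 80% margin under this approximation is about 8.4%. So a figure near 8% at 80% confidence and a standard deviation near 6.6% describe different things and need not contradict each other.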


>
>
>
>>The last case seems to be close to the realistic case in games between
>>equal programs (I believe that there are more draws between equal programs, and
>>this reduces the standard deviation, but I am not sure).
>
>
>I have no evidence that the rate of draws is higher between equal programs.
>Maybe it's possible to make a study from the database of SSDF games?


I have not tried to do it, but I believe that it is the case.
If we take the extreme case, the better program scores 100% and there are no draws.


>>The probability for a draw is also dependent on the style of the programs.
>
>
>Style is not part of my maths. I'm just a bean counter. :)
>
>
>
>    Christophe

I understand.

The problem is complicated enough even without considering style, so we can
ignore the small errors that come from leaving it out. We would need a lot of
games to see whether there is a significant difference in the number of draws
between programs, and we probably do not have enough games for that.
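
To give a feeling for "a lot of games", here is a rough sketch using the textbook normal-approximation sample size for comparing two proportions. The draw rates of 30% and 35% are hypothetical and not taken from any SSDF data. The function name `games_needed` and the 5% significance and 80% power settings are also my assumptions.

```python
from math import ceil
from statistics import NormalDist

def games_needed(p1, p2, alpha=0.05, power=0.80):
    """Games per pairing needed to tell draw rates p1 and p2 apart.

    Normal-approximation sample size for comparing two proportions;
    treats games as independent (an assumption of this sketch).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    spread = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * spread / (p1 - p2) ** 2)

# Hypothetical draw rates, chosen only for illustration.
print(games_needed(0.30, 0.35))
```

With these made-up rates the answer comes out well above a thousand games for each pairing, which is far more than a 40-game match provides.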

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
