Computer Chess Club Archives



Subject: Re: SSDF(Shredder4-Nimzo7.32) AMD K6-2 450 23-17

Author: Christophe Theron

Date: 01:29:15 03/21/00



On March 20, 2000 at 20:08:02, blass uri wrote:

>On March 20, 2000 at 19:04:47, Christophe Theron wrote:
>
>>On March 20, 2000 at 17:28:37, blass uri wrote:
>>
>>>On March 20, 2000 at 13:56:59, Christophe Theron wrote:
>>>
>>>>On March 20, 2000 at 06:46:37, Bertil Eklund wrote:
>>>>
>>>>>On March 19, 2000 at 22:26:49, Christophe Theron wrote:
>>>>>
>>>>>>On March 19, 2000 at 15:41:30, Bertil Eklund wrote:
>>>>>>
>>>>>>>
>>>>>>>Hi!
>>>>>>>
>>>>>>>A very impressive result from Shredder4.
>>>>>>>
>>>>>>>IMO Shredder plays very well positionally and is excellent in endgames.
>>>>>>>Nimzo is a bit stronger tactically.
>>>>>>>
>>>>>>>Shredder4 used all 4 Turbo-CDs.
>>>>>>>
>>>>>>>Bertil
>>>>>>
>>>>>>
>>>>>>Bertil,
>>>>>>
>>>>>>I am not sure this message is going to be well accepted. So let me first state
>>>>>>that I have the greatest respect for your work and the SSDF.
>>>>>>
>>>>>>Let me also state that I have a lot of respect for Nimzo, Shredder, and their
>>>>>>respective authors.
>>>>>Yes, they are all great programs.
>>>>>
>>>>>
>>>>>>However, I can only strongly disagree with your sentence "a very impressive
>>>>>>result from Shredder4".
>>>>>
>>>>>57.5% against a program known as one of the best at tournament time controls
>>>>>impressed me, at least. I am only talking about this 40-game match. Maybe it
>>>>>will lose to Tiger in the next match, but that is another match.
>>>>
>>>>
>>>>Maybe Tiger loses, actually I do not know.
>>>>
>>>>But 57.5% must be taken with a statistical grain of salt. From the statistical
>>>>data I have, and I'm open to discussion about this, in a 40-game match you can
>>>>expect the error margin to be +/- 8.0% if you want 80% confidence.
>>>
>>>1) If you assume a probability of 50% for a win and 50% for a loss between equal
>>>players, and assume that the colours of the players are not relevant, the
>>>standard error is
>>>sqrt(0.5*0.5*40)=sqrt(10)>3.1 points, so in this case the standard error is a
>>>little over 3 points.
>>>
>>>3.2/40=8%, so in this case the error margin really is +/- 8.0%.
>>
>>
>>I was assuming 1/3 wins, 1/3 draws, 1/3 losses.
>
>If this is the case then the error margin is sqrt(2/3*10)<2.6 points
>
>2.6/40=6.5% but this assumption is not logical because it does not consider the
>fact that white has better chances.


I am also wondering whether that is the case in comp-comp games.

A study of the SSDF databases on the winning percentages of white and black, and
on the draw rate, would be very interesting.



>>>2) If you assume a probability of 20% for a win, 20% for a loss and 60% for a draw
>>>between equal players (colours are not relevant), the standard error is:
>>>sqrt(0.4*0.5*0.5*40)=sqrt(4)=2
>>>
>>>where 0.4*0.5*0.5 is the variance in one game,
>>>0.4*0.5*0.5*40 is the variance in 40 games,
>>>and I take the square root of it to calculate the standard error.
>>>
>>>In this case the standard error is only 5%.
>>>I think this assumption assumes more draws than there are between computers.
>>>
>>>3) If you assume 40% for white, 30% for a draw and 30% for black between equal
>>>players, then the variance in one game is
>>>0.4*0.45*0.45+0.3*0.05*0.05+0.3*0.55*0.55=0.4*0.2025+0.3*0.0025+0.3*0.3025=
>>>0.1725
>>>
>>>In this case the variance in 40 games is 0.1725*40=7.1 and the standard
>>>deviation is sqrt(7.1)<2.7
>>>
>>>2.7/40=6.75% and the standard deviation is +-6.7%
>>
>>
>>Isn't it closer to 6.8?
>
>No, because 2.7/40 is slightly bigger than the standard deviation.
>
>I was wrong in my calculation (I used my head and not a computer, and I see now
>that 0.1725*40=6.9 and not 7.1).
>
>In fact sqrt(6.9)<2.65 and 2.65/40=6.625%, so the standard deviation is
>+-6.6%


OK.
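
The figures traded above can be checked mechanically. The following is a minimal Python sketch, assuming the usual scoring (win = 1, draw = 0.5, loss = 0) and independent games; the function name `match_std` and the scenario labels are illustrative and do not come from the thread.

```python
import math

GAMES = 40  # length of the match discussed in the thread


def match_std(p_win, p_draw, p_loss, games=GAMES):
    """Standard deviation (in points) of the total score over `games` games.

    Assumes independent games scored win=1, draw=0.5, loss=0.
    """
    if abs(p_win + p_draw + p_loss - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mean = p_win + 0.5 * p_draw
    # Per-game variance: probability-weighted squared distance from the mean.
    var = (p_win * (1.0 - mean) ** 2
           + p_draw * (0.5 - mean) ** 2
           + p_loss * (0.0 - mean) ** 2)
    return math.sqrt(var * games)


# The four assumptions discussed above, as (win, draw, loss) probabilities.
scenarios = {
    "no draws, 50/50": (0.5, 0.0, 0.5),
    "1/3 win, 1/3 draw, 1/3 loss": (1 / 3, 1 / 3, 1 / 3),
    "20% win, 60% draw, 20% loss": (0.2, 0.6, 0.2),
    "white 40%, draw 30%, black 30%": (0.4, 0.3, 0.3),
}

for name, probs in scenarios.items():
    sd = match_std(*probs)
    print(f"{name}: {sd:.2f} points, {100 * sd / GAMES:.1f}% of {GAMES} games")
```

Under these assumptions the four cases come out at roughly 3.16, 2.58, 2.00 and 2.63 points, which agrees with the corrected value sqrt(6.9) for the last case. The per-game variance in that case is the same whichever side is counted, so alternating colours would not change the result.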



>>>The last case seems close to the realistic case in games between
>>>equal programs (I believe that there are more draws between equal programs, and
>>>this reduces the standard deviation, but I am not sure).
>>
>>
>>I have no evidence that the rate of draws is higher between equal programs.
>>Maybe it's possible to make a study from the database of SSDF games?
>
>
>I did not try to do it, but I believe that it is the case.
>If we take the extreme case, the better program scores 100% and there are no draws.
>
>
>>>The probability for a draw is also dependent on the style of the programs.
>>
>>
>>Style is not part of my maths. I'm just a bean counter. :)
>>
>>
>>
>>    Christophe
>
>I understand.
>
>The problem is complicated enough even without considering style, so we can
>ignore the small errors that come from leaving it out: we would need a lot of
>games to see whether there is a significant difference in the number of draws
>between programs, and we probably do not have enough games for that.


That's right.



    Christophe
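
One point the thread leaves implicit is that a standard error and an 80% confidence margin are different quantities. The sketch below assumes a normal approximation to the 40-game score and the 1/3 win, 1/3 draw, 1/3 loss model mentioned above; the helper names `score_std` and `margin_percent` are hypothetical, and this is not claimed to be how the +/- 8.0% figure was originally obtained.

```python
import math
from statistics import NormalDist

GAMES = 40  # length of the match discussed in the thread


def score_std(p_win, p_draw, p_loss, games=GAMES):
    """Standard deviation (points) of the total score; win=1, draw=0.5, loss=0."""
    mean = p_win + 0.5 * p_draw
    var = (p_win * (1.0 - mean) ** 2
           + p_draw * (0.5 - mean) ** 2
           + p_loss * (0.0 - mean) ** 2)
    return math.sqrt(var * games)


def margin_percent(std_points, confidence=0.80, games=GAMES):
    """Two-sided error margin, as a percentage of the games played.

    Assumes the total score is approximately normally distributed.
    """
    # z-value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return 100.0 * z * std_points / games


std = score_std(1 / 3, 1 / 3, 1 / 3)
print(f"one standard deviation: {100 * std / GAMES:.1f}%")
print(f"80% confidence margin:  {margin_percent(std):.1f}%")
```

With these assumptions one standard deviation is about 6.5% while the 80% margin is a little over 8%, so the two sets of numbers in the discussion are not necessarily in conflict.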




Last modified: Thu, 15 Apr 21 08:11:13 -0700
