Computer Chess Club Archives



Subject: Re: Gambit Tiger@Athlon 500 only? (Junior=Athlon 1000)

Author: Christophe Theron

Date: 11:17:12 11/20/00



On November 20, 2000 at 05:45:48, stuart taylor wrote:

>On November 20, 2000 at 02:53:21, Christophe Theron wrote:
>

(snipped)

>>Everything is not all white or all black in computer chess, but one thing for
>>sure is that speed is a very important advantage, and this advantage can even be
>>mathematically measured (with appropriate margins of errors).
>>
>>There are several rules of thumb that are useful to know. One of them is that,
>>so far, doubling the speed of a computer accounts approximately for a 70 elo
>>points gain.
>>
>>Another rule that is very interesting: if you want to get an approximation of
>>the elo difference between 2 players, you take the winning percentage of the
>>stronger one, subtract 50, and multiply by 7. Use the rule only if the winning
>>percentage is below 80%.
>>
>>For example, if you win 65% of the time against me, then your elo is 105 elo
>>points above mine (15*7=105).
>>
>>So a 70 elo points difference (the one you get by doubling the speed of a
>>computer) means that a program running at twice the speed will win on average
>>60% of the time against the program running at "normal" speed.
>>
>>And one last thing: 70 elo points difference is the difference (approximately)
>>between the number one on the SSDF list and the number 5.
>>
>>Quite a difference, isn't it, for just a speed doubling.
>>
>>
>>
>>
>>>  No one can doubt for one moment the work you have done, and the great results
>>>thereof. I just thought even still, that advancement is quite gradual overall.
>>
>>
>>
>>Yes, it is. But I believe that the elo difference between Chess Tiger 12 and
>>Chess Tiger 13 is rather significant. Probably in the range 70 to 90 elo points.
>>
>>I don't know how to call this. A "jump" or a "gradual improvement"?
>>
>>
>>
>>
>>>But probably it is quite quick for such a delicate art.
>>>  It is very good that GT can play very risky, and  still be not less than
>>>perhaps anything previous, result wise. And, of course CT better still.
>>> When I used to play many computer/computer games e.g with same engine at
>>>different speeds, with programs that could be set in this way, I didn't always
>>>see a big difference in half or double the time.
>>
>>
>>
>>It is always interesting to notice the difference between the theoretical result
>>and the actual result.
>>
>>It gives you an idea about the accuracy of experimental results, and the number
>>of experiments to do in order to be "close enough" to the theoretical result.
>>
>>
>>
>>
>>>  Thank you for correcting me!
>>
>>
>>You are welcome.
>>
>>
>>
>>    Christophe
>
>I'm slowly beginning to understand, and accept. Yes, between Tiger and Tiger (12
>and 13), it WAS a little hop. If it was around last year, it would have stayed
>undisputedly in first place. Let's see if it stays there this year!
>  Thanks for the elo calculation, now I know how Uri Blass and others come up
>with these figures. I've read a few things about ELO ratings, but I had not
>found this.



I found the "70 for doubling" figure by studying the SSDF list myself with a
spreadsheet. Other people have got the same result. For some programs it will be
closer to 55, and for some others closer to 85, but on average 70 is a rather
solid approximation.

The rule "subtract 50 and multiply by 7" has been pointed out by people here on
CCC. I don't remember exactly who, but I think they will recognize themselves.

Once you know these rules, a lot of things begin to look less mysterious.
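
For anyone who wants to play with the two rules, here is a minimal Python sketch. The function names are made up for illustration. The logistic formula used for comparison is the usual Elo expected-score formula; it is not part of the rules quoted in this thread and is only there to show how close the shortcut comes.

```python
import math

def elo_diff_rule_of_thumb(win_pct):
    """Approximate Elo gap from the stronger side's score in percent.

    Implements "subtract 50 and multiply by 7"; only meant for scores below 80%.
    """
    if not 50 <= win_pct < 80:
        raise ValueError("rule of thumb is only meant for scores from 50% up to 80%")
    return (win_pct - 50) * 7

def expected_pct_from_elo(elo_diff):
    """Invert the rule of thumb: expected score in percent for a given Elo gap."""
    return 50 + elo_diff / 7

def elo_diff_logistic(win_pct):
    """Comparison only (assumption, not from the post): logistic Elo formula."""
    p = win_pct / 100
    return 400 * math.log10(p / (1 - p))

print(elo_diff_rule_of_thumb(65))    # 65% score -> 105 points
print(expected_pct_from_elo(70))     # one speed doubling (70 points) -> 60.0%
print(round(elo_diff_logistic(65)))  # logistic formula gives about 108
```

Between 50% and 80% the shortcut stays reasonably close to the logistic curve, which is presumably why the rule comes with the 80% limit.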

There is another important statistical rule to know. It is about the number of
games played in a match and the related error margin of these matches.

Many people ignore this rule and consequently make gross mistakes and post
meaningless results as if they were the absolute truth.

Here is a table which gives you the error margin on the winning percentage
depending on the number of games played:

        80% reliability      70% reliability
games   margin    elo        margin    elo
   10   14.0%    105pts
   20   11.0%     77pts
   30    9.0%     63pts
   40    8.0%     56pts       7.0%     49pts
   50    7.0%     49pts
  100    5.0%     35pts
  200    3.5%     25pts       3.0%     21pts
  400    2.5%     18pts       2.1%     15pts
  600    2.2%     15pts       1.7%     13pts


How to read this table?

This table tells you that if you play a 10-game match between two players A and
B of equal strength, you must take into account a +/-14.0% error margin (which
means +/-105 elo points) on the result if you want to be 80% certain about the
winner.

So if you play 10 games and the result is 60% for player A, then you cannot say
with 80% confidence which player is better: a 6.0-4.0 result is not
significant. A 6.5-3.5 result begins to be significant with 80% confidence.

If you play 100 games and get a 53% result for player A, you still cannot say
with 80% confidence which player is better, because for 100 games the table
gives +/-5%, so inside [45%;55%] you cannot say who is the winner.
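
The way the table is read in these examples can be sketched in a few lines of Python. The margins are copied from the 80% column of the table above; the names `MARGIN_80` and `is_significant` are made up for this illustration, and the lookup only works for the match lengths listed in the table.

```python
# 80%-reliability margins in percentage points, copied from the table above.
MARGIN_80 = {10: 14.0, 20: 11.0, 30: 9.0, 40: 8.0, 50: 7.0,
             100: 5.0, 200: 3.5, 400: 2.5, 600: 2.2}

def is_significant(games, score_pct, margins=MARGIN_80):
    """True if the score lies outside 50% +/- the margin for this match length."""
    if games not in margins:
        raise KeyError(f"no margin listed for {games} games")
    # A result exactly on the edge of the interval is treated as not significant.
    return abs(score_pct - 50.0) > margins[games]

print(is_significant(10, 60))   # 6.0-4.0 -> False
print(is_significant(10, 65))   # 6.5-3.5 -> True
print(is_significant(100, 53))  # 53% over 100 games -> False
```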

I have also given the numbers for 70% reliability, but I did not compute them
all. I think it would also be interesting to have the numbers for 95%
reliability (the SSDF uses this level of confidence).
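
For the missing columns, one possible approach is a plain normal approximation, sketched below in Python. This is not necessarily how the table was computed. It rests on two assumptions that are not in this post: the margin is two-sided, and the per-game standard deviation of the score is 0.4 (about 36% draws between equal players). With those assumptions the output roughly tracks the table for the longer matches, and less well for 10 games, so treat the 95% figures as indicative only.

```python
from math import sqrt
from statistics import NormalDist

# ASSUMPTION: per-game standard deviation of the score (win=1, draw=0.5, loss=0).
# 0.5 would mean no draws at all; 0.4 is a guess, not a value from the post.
ASSUMED_SIGMA = 0.4

def margin_pct(games, reliability, sigma=ASSUMED_SIGMA):
    """Two-sided error margin on the score, in percentage points."""
    z = NormalDist().inv_cdf(0.5 + reliability / 2)
    return 100 * z * sigma / sqrt(games)

for n in (10, 20, 30, 40, 50, 100, 200, 400, 600):
    m80, m70, m95 = (margin_pct(n, r) for r in (0.80, 0.70, 0.95))
    # Elo margin via the "multiply by 7" rule of thumb.
    print(f"{n:4d}  80%: {m80:4.1f}%  70%: {m70:4.1f}%  "
          f"95%: {m95:4.1f}% (~{7 * m95:.0f} pts)")
```

Changing `ASSUMED_SIGMA` shows how much the margins depend on the draw rate: more draws mean a smaller spread per game and therefore tighter margins.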

This "error margins" table is one of the hardest realities in chess. It looks like
90% or more of the people talking about chess totally ignore it. Unfortunately.



    Christophe




Last modified: Thu, 15 Apr 21 08:11:13 -0700
