Computer Chess Club Archives


Subject: Re: Rules of thumb for strength estimation?

Author: Tim Foden

Date: 11:47:17 04/25/03


On April 25, 2003 at 14:24:23, Peter Fendrich wrote:

>On April 25, 2003 at 06:13:51, Tim Foden wrote:
>
>>On April 25, 2003 at 04:20:33, Albert Bertilsson wrote:
>>
>>>Hi!
>>>
>>>When testing my new engine versions against older versions I use the nice
>>>WhoIsBetter tool to determine whether or not the new version is likely to be
>>>stronger.
>>>
>>>But I would also like to know how much stronger. Just an estimate would be
>>>nice, as a reward for the work. Putting the engine on FICS takes too long,
>>>so I wonder whether there are any rules of thumb that I can apply.
>>>
>>>Like:
>>>New engine scores 3 to 2?
>>
>> 60% = +70.44 ELO.
>>
>>>New engine scores 2 to 1?
>>
>> 66.66% = +120.36 ELO.
>>
>>>New engine scores 3 to 1?
>>
>> 75% = +190.85 ELO.
>>>
>>>Regards Albert
>>
>>Dann Corbit has a tool called USCF which can calculate such numbers.  I have a
>>modified version here.
>>
>>Here is a table of outputs which may be useful:
>>
>>A win percentage of 50% gives a rating difference of +0.00 ELO
>>A win percentage of 55% gives a rating difference of +34.86 ELO
>>A win percentage of 60% gives a rating difference of +70.44 ELO
>>A win percentage of 65% gives a rating difference of +107.54 ELO
>>A win percentage of 70% gives a rating difference of +147.19 ELO
>>A win percentage of 75% gives a rating difference of +190.85 ELO
>>A win percentage of 80% gives a rating difference of +240.82 ELO
>>A win percentage of 85% gives a rating difference of +301.33 ELO
>>A win percentage of 90% gives a rating difference of +381.70 ELO
>>
>>Cheers, Tim.
>
>One can't use this table or ELOSTAT or any other ELO rating formula.
>It will produce a figure but it doesn't mean anything.

:)  I know what you mean... but to be pedantic, I think you really mean
shouldn't rather than can't.  I.e. I can perfectly easily use this table to make
predictions about changes in strength when there are few games.  Thus I
demonstrably _can_ do it.  But I agree that I shouldn't really.  And in fact I
don't.  :)
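
As a rough check, the figures in the table quoted above are consistent with the
logistic form of the Elo expectation, D = 400 * log10(p / (1 - p)). That the USCF
tool computes them exactly this way is an assumption, and elo_diff is a
hypothetical name used only for illustration:

```python
import math


def elo_diff(score_fraction):
    """Rating difference implied by a score fraction under the logistic Elo model.

    Assumption: the logistic curve with a 400-point scale; the original tool
    may use a different formula internally.
    """
    if not 0.0 < score_fraction < 1.0:
        # 0% and 100% have no finite rating difference.
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Print one line per 5% step, in the same layout as the quoted table.
for pct in range(50, 95, 5):
    diff = elo_diff(pct / 100)
    print(f"A win percentage of {pct}% gives a rating difference of {diff:+.2f} ELO")
```

The function says nothing about how reliable the figure is; it only converts a
score into a rating difference.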

>1) The ELO formula is based on the "Normal distribution" which is just an
>estimate of the real distribution.

OK.

>In order to be used as an estimate you need
>something like 30-50 games or more.

Again, I disagree.  :)  It _can_ be used as an "estimate" however many games you
have... it just won't be a very accurate one. :)

>2) Even if it were a perfect estimate, so few games give a very unstable figure.
>For instance, the difference between 2-2 and 2.5-1.5 gives a big difference in
>ELO but represents a very small difference in the results.

Very true.
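
To put a number on that, here is a minimal sketch of how far a single extra half
point moves the estimate at different match lengths. It assumes the logistic Elo
formula with a 400-point scale; elo_from_result and half_point_swing are
hypothetical names, not part of any tool mentioned in this thread:

```python
import math


def elo_from_result(points, games):
    """Elo difference implied by scoring `points` out of `games` (logistic model)."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score must be strictly between 0% and 100%")
    return 400.0 * math.log10(p / (1.0 - p))


def half_point_swing(games):
    """Change in the estimate when an even score gains one extra half point."""
    even = games / 2
    return elo_from_result(even + 0.5, games) - elo_from_result(even, games)


# Compare a 4-game match (2-2 versus 2.5-1.5) with longer matches.
for n in (4, 20, 100):
    print(f"{n} games: half a point moves the estimate by {half_point_swing(n):.1f} Elo")
```

With 4 games the swing is close to 89 Elo, while with 100 games it is only a few
Elo, which is the instability described in point 2.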

Cheers, Tim.



Last modified: Thu, 15 Apr 21 08:11:13 -0700
