Computer Chess Club Archives

Subject: Re: Rules of thumb for strength estimation?

Author: Peter Fendrich

Date: 11:24:23 04/25/03

On April 25, 2003 at 06:13:51, Tim Foden wrote:

>On April 25, 2003 at 04:20:33, Albert Bertilsson wrote:
>
>>Hi!
>>
>>When testing my new engine versions against older versions I use the nice
>>WhoIsBetter tool to determine whether or not the new version is likely to be
>>stronger.
>>
>>But I would also like to know how much stronger. Just an estimate would be
>>nice, as a reward for the work. Putting the engine on FICS takes too long,
>>so I wonder: are there any rules of thumb that I can apply?
>>
>>Like:
>>New engine scores 3 to 2?
>
> 60% = +70.44 ELO.
>
>>New engine scores 2 to 1?
>
> 66.66% = +120.36 ELO.
>
>>New engine scores 3 to 1?
>
> 75% = +190.85 ELO.
>>
>>Regards Albert
>
>Dann Corbit has a tool called USCF which can calculate such numbers.  I have a
>modified version here.
>
>Here is a table of outputs which may be useful:
>
>A win percentage of 50% gives a rating difference of +0.00 ELO
>A win percentage of 55% gives a rating difference of +34.86 ELO
>A win percentage of 60% gives a rating difference of +70.44 ELO
>A win percentage of 65% gives a rating difference of +107.54 ELO
>A win percentage of 70% gives a rating difference of +147.19 ELO
>A win percentage of 75% gives a rating difference of +190.85 ELO
>A win percentage of 80% gives a rating difference of +240.82 ELO
>A win percentage of 85% gives a rating difference of +301.33 ELO
>A win percentage of 90% gives a rating difference of +381.70 ELO
>
>Cheers, Tim.
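
The figures in the table above are consistent with the standard logistic Elo expectation formula, D = 400 * log10(p / (1 - p)), where p is the score fraction. The following is a minimal Python sketch under that assumption; whether the USCF tool computes its output exactly this way is not stated in the thread, and the function name `elo_diff` is made up for illustration.

```python
import math

def elo_diff(score):
    """Rating difference implied by a score fraction in (0, 1).

    Assumes the logistic Elo model: D = 400 * log10(p / (1 - p)).
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# Print the same rows as the table, 50% through 90% in steps of 5%.
for pct in range(50, 95, 5):
    print(f"A win percentage of {pct}% gives a rating difference of "
          f"{elo_diff(pct / 100):+.2f} ELO")
```

Scores of exactly 0% or 100% are rejected because the formula diverges there.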

With so few games, one can't use this table, ELOSTAT, or any other ELO rating
formula. It will produce a figure, but the figure doesn't mean anything.
1) The ELO formula is based on the "Normal distribution", which is only an
approximation of the real distribution. For the approximation to be usable you
need roughly 30-50 games or more.
2) Even if it were a perfect estimate, so few games give a very unstable figure.
For instance, 2-2 and 2.5-1.5 differ by a large amount in ELO but by only half
a point in the results.

/Peter
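
To put a number on point 2, here is a small Python sketch that measures how far the ELO figure moves when a single draw becomes a win, starting from an even score. It assumes the logistic formula D = 400 * log10(p / (1 - p)), which matches the table quoted above; the function names are hypothetical and chosen only for this illustration.

```python
import math

def elo_from_result(points, games):
    """ELO difference implied by scoring `points` out of `games` (logistic model)."""
    if games <= 0 or not 0 < points < games:
        raise ValueError("need 0 < points < games")
    score = points / games
    return 400.0 * math.log10(score / (1.0 - score))

def half_point_swing(games):
    """ELO change when one draw turns into a win, starting from an even score."""
    even = games / 2
    return elo_from_result(even + 0.5, games) - elo_from_result(even, games)

# The same half point matters far less as the number of games grows.
for n in (4, 10, 50, 200):
    print(f"{n:4d} games: half a point shifts the estimate by "
          f"{half_point_swing(n):.1f} ELO")
```

Under this model, going from 2-2 to 2.5-1.5 shifts the estimate by roughly 89 ELO, while the same half point in a 200-game match shifts it by less than 2.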
