Author: Peter Fendrich
Date: 14:19:10 04/25/03
On April 25, 2003 at 14:47:17, Tim Foden wrote:

>On April 25, 2003 at 14:24:23, Peter Fendrich wrote:
>
>>On April 25, 2003 at 06:13:51, Tim Foden wrote:
>>
>>>On April 25, 2003 at 04:20:33, Albert Bertilsson wrote:
>>>
>>>>Hi!
>>>>
>>>>When testing my new engine versions against older versions I use the nice
>>>>WhoIsBetter tool to determine whether or not the new version is likely to be
>>>>stronger.
>>>>
>>>>But I would also like to know how much stronger. Just an estimation would be
>>>>nice, just as a reward for the work. Putting the engine on FICS takes too long,
>>>>so I wonder are there any rules of thumb that I can apply?
>>>>
>>>>Like:
>>>>New engine scores 3 to 2?
>>>
>>> 60% = +70.44 ELO.
>>>
>>>>New engine scores 2 to 1?
>>>
>>> 66.66% = +120.36 ELO.
>>>
>>>>New engine scores 3 to 1?
>>>
>>> 75% = +190.85 ELO.
>>>>
>>>>Regards Albert
>>>
>>>Dann Corbit has a tool called USCF which can calculate such numbers. I have a
>>>modified version here.
>>>
>>>Here is a table of outputs which may be useful:
>>>
>>>A win percentage of 50% gives a rating difference of +0.00 ELO
>>>A win percentage of 55% gives a rating difference of +34.86 ELO
>>>A win percentage of 60% gives a rating difference of +70.44 ELO
>>>A win percentage of 65% gives a rating difference of +107.54 ELO
>>>A win percentage of 70% gives a rating difference of +147.19 ELO
>>>A win percentage of 75% gives a rating difference of +190.85 ELO
>>>A win percentage of 80% gives a rating difference of +240.82 ELO
>>>A win percentage of 85% gives a rating difference of +301.33 ELO
>>>A win percentage of 90% gives a rating difference of +381.70 ELO
>>>
>>>Cheers, Tim.
>>
>>One can't use this table or ELOSTAT or any other ELO rating formula.
>>It will produce a figure but it doesn't mean anything.
>
>:) I know what you mean... but to be pedantic, I think you really mean
>shouldn't rather than can't. I.e. I can perfectly easily use this table to make
>predictions about changes in strength when there are few games. Thus I
>demonstrably _can_ do it. But I agree that I shouldn't really. And in fact I
>don't. :)
>
>>1) The ELO formula is based on the "Normal distribution" which is just an
>>estimate of the real distribution.
>
>OK.
>
>>In order to be used as an estimate you need
>>something about 30-50 games or more.
>
>Again, I disagree. :) It _can_ be used as an "estimate" however many games you
>have... it just won't be a very accurate one. :)
>
>>2) Even if it was a perfect estimate the few games give a very unstable figure.
>>For instance the difference between 2-2 and 2.5-1.5 gives a big difference in
>>ELO but represents a very small difference in the results.
>
>Very true.
>
>Cheers, Tim.

Well Tim,
I won't argue against that, but _can't_ here was a 'sloppy' way to say
_can't be done properly_ or something like that. I don't know if this makes any
sense in English though...

/Peter
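
For anyone who wants to reproduce the numbers quoted above: the figures in the table match the common logistic form of the rating formula, D = 400 * log10(p / (1 - p)), where p is the score fraction. The sketch below is only an illustration under that assumption. The name `elo_diff` is made up here, and nothing in the thread confirms that the USCF tool computes its output this way. It also shows point 2 in numbers: in a 4-game match, a single half point moves the estimate from 0 to roughly 89 ELO.

```python
import math


def elo_diff(score_fraction):
    """Rating difference implied by a score fraction (logistic model, assumed)."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Reproduce the table from the quoted post.
for pct in (50, 55, 60, 65, 70, 75, 80, 85, 90):
    print(f"A win percentage of {pct}% gives {elo_diff(pct / 100):+.2f} ELO")

# Point 2: in a 4-game match, half a point changes the figure a lot.
for points in (2.0, 2.5):
    print(f"{points}-{4 - points}: {elo_diff(points / 4):+.2f} ELO")
```

A score of 0% or 100% has no finite rating difference under this formula, which is why the function rejects those inputs instead of returning a number.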
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.