Author: Uri Blass
Date: 13:54:32 06/21/04
On June 21, 2004 at 16:14:26, Sandro Necchi wrote:

>On June 21, 2004 at 15:58:07, Uri Blass wrote:
>
>>On June 21, 2004 at 15:05:32, Sandro Necchi wrote:
>>
>>>On June 21, 2004 at 10:30:33, martin fierz wrote:
>>>
>>>>On June 20, 2004 at 02:56:08, Sandro Necchi wrote:
>>>>
>>>>>There is a simple way to verify whether the "authors" are correct or not.
>>>>>
>>>>>They should state clearly how they evaluate all the solutions of the tests, comparing their hardware to the SSDF's, in order to produce the Elo figure.
>>>>>
>>>>>Then, taking the next release of 5 commercial programs that will be tested by the SSDF, they have to predict the Elo for ALL 5 chess programs to within +-10 points.
>>>>>
>>>>>Then an independent tester should run the tests.
>>>>>
>>>>>If they fail, then they lose.
>>>>>
>>>>>Sandro
>>>>
>>>>+-10 Elo, you must be kidding!
>>>>The SSDF results themselves have larger error margins than that...
>>>>
>>>>cheers
>>>>  martin
>>>
>>>Not at all!!
>>>
>>>I guessed the SSDF rating of Shredder 7 CB and Shredder 7.04 UCI, and got very close on Shredder 8 CB (+-10 points).
>>
>>If the rating of Shredder against set A of opponents is 2800, and against set B it is 2830, and you do not know which set the SSDF is going to choose, then you have at least a 50% chance of being wrong by more than 10 Elo.
>>
>>Uri
>
>Uri,
>
>this would be correct, but I have a way to measure this somehow as well.
>
>This is how I did it...
>
>Sandro

In that case you measure not only the rating but also some noise that may be relevant to the SSDF results but not to the real level of the program.

I see no way to measure it unless you have people who play against the opponents that you expect the SSDF to use.

Uri
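[Editor's note: a rough sketch of the statistics behind Martin's objection. Treating each game as an independent win/loss trial (draws would shrink the variance somewhat) and converting the score fraction to Elo via the standard logistic formula, the 95% margin on a rating estimated from a few hundred games comes out well above +-10. The helper `elo_margin` is purely illustrative, not anything the SSDF publishes.]

```python
import math

def elo_margin(n_games, score=0.5, z=1.96):
    """Approximate 95% error margin (in Elo points) of a rating
    estimated from n_games independent win/loss games, where score
    is the expected score fraction against the opposition."""
    # standard deviation of the observed score fraction (binomial model)
    sd_score = math.sqrt(score * (1.0 - score) / n_games)
    # slope of the Elo curve E = 400*log10(s/(1-s)) at s = score
    slope = 400.0 / math.log(10) / (score * (1.0 - score))
    return z * slope * sd_score

for n in (100, 300, 1000):
    print(n, round(elo_margin(n), 1))
```

Under these assumptions, even 1000 games against evenly matched opposition leave a margin of roughly +-20 Elo, so +-10 is indeed an ambitious target.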
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.