Computer Chess Club Archives


Subject: Re: SSDF ratings vs Human performance rating (the data).

Author: Robert Hyatt

Date: 16:10:22 04/14/00


On April 14, 2000 at 05:14:33, Graham Laight wrote:

>On April 13, 2000 at 20:52:19, Robert Hyatt wrote:
>
>>On April 13, 2000 at 05:46:58, Graham Laight wrote:
>>
>>>On April 12, 2000 at 21:51:01, James Robertson wrote:
>>>
>>>{snip}
>>>
>>>>I think that most of the older lower rated results _are_ accurate because the
>>>>list was calibrated using those results. But the results of the newer programs
>>>>whose ratings were _not_ calibrated against human lists show a remarkable
>>>>difference between their actual performances. Fritz is just the grossest example
>>>>of this, exhibiting a 232 point difference between its performance against
>>>>computers and its performance against humans.
>>>
>>>Unless Fritz's TPR rating is based on a larger scale test than the others, you
>>>can't really say this.
>>>
>>>I wouldn't expect most TPRs to closely match a player's real rating. The real
>>>surprise is that, after 8+ years since the "official" calibrations, so many of
>>>the TPRs are quite close to the SSDF rating - some of them even higher, indeed.
>>>
>>>If I were a lawyer out to prove that the SSDF ratings were inflated, I wouldn't
>>>draw attention to the statistics which Chris has put forward. I would
>>>strenuously avoid posting on this thread (like Bob has done!  :-)   ).
>>>
>>>-g
>>
>>
>>I haven't "avoided" anything. Chris's data has a _huge_ error margin.  One
>>program had one game for its "tpr" calculation, with an error of +/- 400 or
>>more.
>>
>>We are getting real data from Rebel and the Israel games... I have time to
>
>Sorry - I can't agree in the case of Rebel Century. People say it has a very
>nice positional style of play - but the fact is that it isn't rated by SSDF, so
>it contributes nothing to our knowledge of SSDF rating accuracy.
>
>>wait, rather than jumping the gun...  For every program close to its SSDF
>>rating we will see one _way_ below.  What is the conclusion?  That the average
>>is way high on the SSDF, but just wait for time to give good numbers...
>
>The information we have now isn't perfect by any means - it's only an indicator.
>
>But you've been telling us for months now that we just have to wait some more
>for sufficient data. I just get the feeling that by the time you tell us you're
>happy with the evidence, the human brain will be obsolete!  :-)
>
>-g


The problem is that the data isn't coming in quickly.  So this is not 'time-based';
it is 'games-played' based.  Rebel was playing one game per month.  Ten games hardly
reduce the error margin to something worth discussing...  But you may be right,
as I suspect the games are going to become less and less frequent...
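
To put rough numbers on the 'games-played' point, here is a minimal sketch of how the error margin on a performance rating shrinks with the number of games. It is not taken from the thread: it assumes the standard logistic Elo curve, a worst-case per-game standard deviation of 0.5 (no draws), a 95% interval (z = 1.96), and a linear approximation around the observed score. The names elo_diff and elo_margin are hypothetical, and the approximation is crude for very small samples such as a single game.

```python
import math


def elo_diff(score):
    """Rating difference implied by a fractional score (0 < score < 1),
    assuming the logistic Elo expectation curve."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(n_games, score=0.5, per_game_sd=0.5, z=1.96):
    """Approximate +/- Elo margin on a performance rating after n_games.

    Assumptions (not from the thread): per_game_sd=0.5 is the worst case
    with no draws; the margin is a linearization of the Elo curve at
    `score`, so it is only a rough guide when n_games is tiny.
    """
    # Standard error of the average score over n_games.
    score_se = per_game_sd / math.sqrt(n_games)
    # Slope of elo_diff with respect to score, in Elo points per unit score.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * score_se * slope


if __name__ == "__main__":
    for n in (10, 40, 100, 400):
        print(f"{n:4d} games: roughly +/- {elo_margin(n):.0f} Elo")
```

Under these assumptions the margin falls only with the square root of the game count, so quadrupling the number of games halves it. At one game per month that takes a long time.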



Last modified: Thu, 15 Apr 21 08:11:13 -0700