Computer Chess Club Archives



Subject: Re: Does the New SSDF List Reflect the Real Strength of Programs?

Author: José Carlos

Date: 10:30:37 10/24/01



On October 24, 2001 at 01:21:52, Dann Corbit wrote:

>On October 24, 2001 at 01:18:29, Kevin Stafford wrote:
>
>>If you are commenting on how the ssdf's ratings compare to FIDE ratings, there
>>is no real sense of 'accurate'. The pools are entirely separate, and therefore
>>attempts at comparison between the two are meaningless. It is for this reason
>>that it is impossible for one list to be 'underrated', because the two lists
>>have nothing to do with one another.
>
>I think that this statement is a bit too strong.  Surely, there is some
>correlation between the strength ratings on the two lists.  We just have no idea
>what it is!

  I disagree. I believe there's no correlation at all. Human chess and computer
chess are different games IMO. The only way to compare human and computer
players is to let them play each other.
  As an example of what I mean, imagine this: I spend some months studying the
opening books of old programs (I say old programs because they didn't learn),
and I find thousands of lines that win against those books. Then I build a book
from those lines and let my (weak) program play in the SSDF (or any other rating
list based on automated games). The games won against the old programs would be
enough to give my program a much higher rating than it deserves. Against humans,
it would continue to lose the same way as before.
  It's an extreme case, I know, but it shows one of the differences between
computer chess and human chess.
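
To put rough numbers on the book example, here is a minimal sketch. It assumes the standard Elo expected-score formula (the SSDF's exact calculation may differ), and every rating and score in it is hypothetical, chosen only for illustration.

```python
import math


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rating_gap_from_score(score):
    """Invert the Elo curve: rating gap implied by a score fraction (0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical numbers, not taken from any real list.
OLD_PROGRAM_RATING = 2300   # assumed list rating of the old, non-learning programs
TRUE_STRENGTH = 2100        # assumed strength of the weak program without the book

# Score the weak program would be expected to make without the prepared book.
honest_score = expected_score(TRUE_STRENGTH, OLD_PROGRAM_RATING)

# Assumed score once the prepared lines win most of the games.
booked_score = 0.90

honest_performance = OLD_PROGRAM_RATING + rating_gap_from_score(honest_score)
booked_performance = OLD_PROGRAM_RATING + rating_gap_from_score(booked_score)

print(f"score without book: {honest_score:.2f}, performance {honest_performance:.0f}")
print(f"score with book:    {booked_score:.2f}, performance {booked_performance:.0f}")
```

Under these assumptions the list performance rises by several hundred points, while nothing about the program's play against opponents outside that pool has changed.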

>I also would not go so far as to say that comparisons are meaningless -- just
>that the numerical value connections are unknown.

>An entity that is at the top of either list will be quite strong, and one at the
>bottom not so strong -- that much is obvious.

  Obvious yes, but too vague to be considered a correlation.

>>>  I hate to open up a can of worms here, but it would seem that recent results
>>>suggest that the SSDF list is Pretty Accurate. Tiger performed at the 2700 level
>>>on hardware much inferior to that used by the SSDF. That fact may suggest that
>>>the List is Underrated. Deep Fritz result against the Veteran Grandmaster Robert
>>>Huebner adds further validity. I am not sure what Rebel's performance rating
>>>with Vanderwiel is, but I am sure it is over 2600, this achieved on hardware
>>>slower then that used by the SSDF. I commend the SSDF for doing an excellent
>>>Job, Perhaps more games against Humans will continue to collaborate their fine
>>>work. Maybe in the future SSDF will have to add points to the current list,
>>>instead of subtracting!

  IMO, different pools can't be compared, even if they have a common element.
Typical example: A regularly beats B; B regularly beats C; we still have no idea
whether A will beat C. If we apply this to sets of players, you'll see my point.
Example (not necessarily from 'real life'): Rebel does well against humans; some
programs do well against Rebel; but we have no idea how well those programs will
do against humans until they play.
  So, I can't see any correlation.
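
  The A/B/C point can be sketched in a few lines. This is only an illustration: it assumes the standard Elo expected-score formula, and the head-to-head scores are hypothetical. A single rating number per player forces a transitive ordering, so if results happen to be cyclic, no assignment of ratings can pick the right favourite in every pairing.

```python
from itertools import permutations


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical head-to-head results: score fraction of the first player named.
# A beats B, B beats C, but C beats A.
results = {("A", "B"): 0.7, ("B", "C"): 0.7, ("C", "A"): 0.7}


def favourite_agrees(ratings):
    """True if the ratings predict the actual favourite in every pairing."""
    return all(
        (expected_score(ratings[x], ratings[y]) > 0.5) == (score > 0.5)
        for (x, y), score in results.items()
    )


# Try every possible ordering of the three players on one rating scale.
# The rating values themselves are arbitrary; only the order matters here.
orderings = [dict(zip(order, (2600, 2500, 2400))) for order in permutations("ABC")]
consistent = [r for r in orderings if favourite_agrees(r)]

print(f"orderings tried: {len(orderings)}, consistent with results: {len(consistent)}")
```

  None of the six orderings fits, which is the sense in which results inside one pool say nothing certain about a pairing that was never played.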

  Just my opinion.

  José C.




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.