Computer Chess Club Archives



Subject: Re: SSDF validation proposal

Author: Robert Hyatt

Date: 20:27:31 01/06/00



On January 06, 2000 at 20:24:38, Peter Fendrich wrote:

>On January 05, 2000 at 22:36:20, Robert Hyatt wrote:
>
>>On January 05, 2000 at 18:47:14, Peter Fendrich wrote:
>>
>>>On January 05, 2000 at 16:32:12, Robert Hyatt wrote:
>>>
>>>>On January 05, 2000 at 14:47:11, Chris Carson wrote:
>>>>
>>>>>In my opinion SSDF does not need more external
>>>>>validation (some human games are included in the ratings).
>>>>
>>>>This is not correct.  Human ratings were used (IIRC) maybe up to 1993 at
>>>>the latest.  Seven years washes _all_ the 'humanness' out of the SSDF rating
>>>>pool, since the programs have played thousands of games since the last human
>>>>game was included.
>>>
>>>The human results used are not washed out more today than 1993. The results are
>>>still used in the same way with the same impact as then. The ordinary K-constant
>>>doesn't apply here. The human results are only used to adjust the level of the
>>>list.
>>>
>>>That doesn't say it is a human list in any way. And there are problems:
>>>a) There are far too few games between humans and chess programs.
>>>b) Are games played 10 years ago still giving the same information?
>>>Probably not, because of the increased knowledge of how to play against computers.
>>>I would think that humans are much more prepared for the computer style today
>>>than 10 years ago.
>>>
>>>As a pure program vs program rating list, giving the differences between chess
>>>programs, the ratings are very accurate IMO. The adjustment to human levels,
>>>however, only helps us get a rough estimate of how to compare these
>>>ratings to human ratings.
>>>//Peter
>>
>>
>>Sorry, but I disagree.  The human ratings were against programs that are over 7
>>years old.  Since then it has been _only_ computer vs computer... with no humans
>>in the pot to influence the pool.  You have nearly 7 generations of programs
>>that have played each other since the last human vs program game was included
>>for rating purposes...
>
>I'm not sure where we disagree, but I think there is some misunderstanding
>about how the human games are used. Despite the age of these games, they still
>have "the same" influence on the rating list as back in '93.


Not exactly.  The ratings were adjusted to correlate with some human vs computer
results that were known.  But that set the ratings for only a couple of programs
back in 1992-93.  Since that time, thousands of new computer vs computer games
have been played.  And as I mentioned, a small change to a program can produce a
wide gap in Elo performance, with no checks and balances to keep the list in line
with the old Swedish federation ratings.  The ratings were never calibrated to
FIDE ratings, and after 7 years, there is little doubt that they have drifted
at least 200 points above FIDE.
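To see why a small change in playing strength can open a wide Elo gap, it helps to look at the standard logistic Elo relation between expected score and rating difference. The sketch below uses that general formula (it is the textbook Elo model, not SSDF's specific rating procedure, and the function name is mine):

```python
import math

def elo_diff(score: float) -> float:
    """Rating difference implied by an expected score under the
    standard logistic Elo model: D = -400 * log10(1/s - 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# A few percentage points of score against the same opposition
# translate into a sizable rating gap:
for s in (0.55, 0.60, 0.76):
    print(f"{s:.0%} score -> {elo_diff(s):+.0f} Elo")
```

Scoring 55% instead of 50% against the same pool is already worth about +35 Elo, and 76% corresponds to roughly +200, so repeated small improvements, measured only against other programs, can inflate the list quickly.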

At least I don't believe any program is a 2700 player on today's hardware, which
would put it in the top 10 or so of the world's best players.  I don't believe
they are in the top 100 yet...

>
>>
>>IE the computer rating pool was 'seeded' with human ratings, but then the
>>two pools became 100% disjoint.  Ratings today have _nothing_ to do with FIDE,
>>or any other rating pool...
>
>The two pools were disjoint from the very start and still are. The ordering of
>the members in the list and the differences between their ratings are accurate.
>The absolute rating figures, however, are hard to compare with other rating
>lists, and that goes for all other rating pools as well. That's why I call it a
>rough adjustment to the human levels.
>
>//Peter


Right...  but we probably don't agree on the size of the adjustment.  I say
SSDF-200 is an _upper_ bound on the program ratings when comparing them to FIDE
ratings.  200 might be too small a correction, but probably not by a lot.




