Computer Chess Club Archives



Subject: Re: SSDF, Fritz5 games

Author: Robert Hyatt

Date: 15:37:17 02/28/98



On February 28, 1998 at 17:16:49, Thorsten Czub wrote:

>
>>I wasn't implying anything wrong at all.  Just that the huge one-sided
>>wins by Fritz looked amazing until I noticed the 3X speed handicap.
>>That makes those particular win/lose numbers mean something different
>>than if they were posted for equal hardware matches...
>
>I think Bob is talking about the same thing I am.
>
>The Elo ratings generated from these results do not have the same
>QUALITY they would have had if you had used the SAME machines.
>You cannot turn this around and argue:
>"But with Hiarcs it worked too."
>If it works with Hiarcs, the same method does not have to work with
>Fritz5 too.
>The two programs get their playing strength from two different
>things.
>If you change ONE MAIN parameter in your experiment, and that parameter
>favors Fritz, then you don't get EQUIVALENT or comparable results.


I think this exposes a common misconception about Elo's rating system.
The system produces a rating "spread" (not absolute values) that can be
used to directly compute the probability of either of two players
beating the other, based only on their Elo-computed ratings.  It has
*nothing* to do with the corresponding FIDE rating a program might earn.
You might play two programs against each other and after 1000 games end
up with ratings exactly 200 apart.  You might then enter them in human
tournaments to play 1000 games each, and when you finish you might find
they are only 50 rating points apart, because you are using *two
different player pools* to compute those ratings.  Elo's statistical
analysis depends on significant numbers of the "pool" playing each
other, and it doesn't take into account the bizarre way computers do
things.
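
To make the "spread predicts the result" point concrete, here is a rough sketch of how an expected score follows from a rating difference.  It assumes the commonly used logistic form of the Elo curve with a 400-point scale (Elo's original tables were based on a normal distribution, so exact values differ slightly), and the function name expected_score is mine, not anything from SSDF or FIDE.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the higher-rated
    side, given only the rating difference.

    Assumption: logistic approximation of the Elo curve, 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# The two spreads from the example: the same pair of programs measured
# in two different player pools.
for diff in (200, 50):
    print(f"spread {diff:3d}: expected score {expected_score(diff):.3f}")
```

Under that assumption a 200-point spread corresponds to an expected score of roughly 0.76, while a 50-point spread corresponds to roughly 0.57, so the two pools would be making quite different predictions about the same pair of programs.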

But even worse, the only important thing in the Elo system is the
"spread" between two players, not the absolute values of their ratings.
That's where we get off into no-man's-land statistically.  For example,
Fritz 5 is 2585 (or so) on the SSDF list, while Hiarcs is 2535 (or so).
I'd claim that both are 200 rating points too high, if you compare those
numbers to human numbers.  But the spread might remain constant no
matter what, and would continue to predict the same win/loss ratio...
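
A small sketch of why shifting both ratings leaves the prediction alone.  It again assumes the common logistic approximation of the Elo curve; the 2585 and 2535 figures are the approximate SSDF numbers quoted above, and the 200-point offset is my guess at the inflation, not a measured value.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B.

    Assumption: logistic approximation of the Elo curve, 400-point scale.
    Only the difference rating_a - rating_b enters the formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


fritz, hiarcs = 2585, 2535   # approximate SSDF list ratings
offset = 200                 # hypothetical inflation relative to human ratings

as_listed = expected_score(fritz, hiarcs)
shifted = expected_score(fritz - offset, hiarcs - offset)

print(f"as listed:     {as_listed:.3f}")
print(f"both less 200: {shifted:.3f}")
```

Both lines print the same value (about 0.57 for a 50-point spread), since the offset cancels out of the difference.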




Last modified: Thu, 15 Apr 21 08:11:13 -0700
