Computer Chess Club Archives



Subject: Re: A rating inquiry

Author: Moritz Berger

Date: 11:57:49 10/11/98



On October 11, 1998 at 09:44:23, Enrique Irazoqui wrote:

>On this list, Fritz 5 is between 30 and 70 points higher than
>all the other top engines, so you would expect Fritz to score about 55 to 60%
>against them, and this is not necessarily true.

On average, the percentage holds in all kinds of experiments I did without an
opening book (or even with the 1000-game Anand book I mentioned).
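For reference, the 55-60% figure quoted above follows from the standard logistic Elo expectancy formula; the SSDF-specific details discussed below are a separate question. A minimal sketch (the function name is my own, not from any chess program):

```python
# Expected score of the higher-rated player from an Elo difference,
# using the standard logistic Elo formula: E = 1 / (1 + 10^(-d/400)).

def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for the player rated `elo_diff` points higher."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# A 30-70 point edge corresponds to roughly a 54-60% expected score:
for diff in (30, 70):
    print(f"+{diff} Elo -> expected score {expected_score(diff):.1%}")
```

So over a long enough series of games, a 30-70 point rating edge and a 54-60% score are just two ways of stating the same thing.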

> If Fritz 5 plays 20 games long
>matches, it will get this score. If it plays 10 games matches, it won't.

From my matches, I cannot confirm this observation. I also have some hundred
games and didn't notice the phenomenon you describe. I will look again
specifically for the effect you described, but the last 40 games against R10
were fairly even right from the start, with a clean book at the beginning.

> If in
>tournaments it plays a different opponent every game, it won't get that score
>either. In fact, it doesn't. Then, the SSDF rating list is no indication of the
>score Fritz 5 will get in a future event, unless this event reproduces exactly
>the SSDF way to test. In other words, this Elo list defeats its own purpose of
>being able to predict performances.

No, it doesn't. At least, it is no less reliable for Fritz than for Hiarcs,
Rebel, M-Chess, Genius, Shredder, ... you name it.

>An example I already posted: in my tournament of 200 games at 40:2, Fritz 5
>scored 44% in the first half and 59% in the last half. In the SSDF games, 64% in
>the first half and 72% in the second half. If you play a tournament of 20 games
>matches, you will get a very different performance if Fritz 5 plays these 20
>games in a row or if you split these matches in two halves by exiting the
>program and restart it again for the last 10 games.

SSDF plays on different machines. On one machine, learning from match 1
(opponent A) will positively affect match 2 (opponent B). So the SSDF results
are not quite representative of what you or I would get when always using the
same machine, especially if, when interpreting their results as you did, you
consider that they presumably started to use the PowerBook from a certain point
onward (1st half of the match: fritz5.ctg, 2nd half: PowerBook ...).

> What's going to be the
>predicted performance after the Elo rating? It depends of how you make Fritz 5
>play. That's why I'm talking of an SSDF-specific rating.

Of course the rating is a direct result of the testing parameters. Certainly the
relative rating on the SSDF list doesn't hold 100% against humans. I fully agree
on this.

>Again: I think Fritz 5 is very strong and a tactical wonder. I think the SSDF is
>not to blame for distortions in their rating list.

Come on, now you have to clarify your terms: What exactly does "distortion" mean
to you? Cheating? Isn't the manual opening book preparation of all the other
programs much more "cheating", in the sense that the program doesn't develop its
own repertoire through its own playing strength and "understanding" of chess?
Interpreting a raw database of human games to still get a usable book seems to
be a greater obstacle than preparing books over decades of work, as Sandro
Necchi (M-Chess) described in his interesting article in CCR (available to all
readers here at the CCC resource centre).

> But these distortions are
>real. Learners can be SSDF specific, meaning: much more efficient in the SSDF
>way to play matches than in any other case, and this influences greatly this
>rating list.
>
>Enrique

Wasn't it you who fiercely advocated using learners on the SSDF list to overcome
the "killer book" problem and measure engine playing strength? My memory must be
dysfunctional if the Fritz learning mechanism doesn't perfectly fit your
prescribed solution to the mess.

Moritz




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.