Computer Chess Club Archives


Subject: Re: SSDF Rating list

Author: Dann Corbit

Date: 16:35:30 12/27/01


On December 27, 2001 at 19:06:38, Mark Young wrote:

>On December 27, 2001 at 18:36:39, Dann Corbit wrote:
>
>>On December 27, 2001 at 18:17:52, Mark Young wrote:
>>
>>>Nice to see that SSDF now shows what some of us have been showing in our own
>>>testing for some time now. ChessTiger 14 is the strongest of the Tiger programs
>>>ahead of GambitTiger 2.0, and also shows ChessTiger 14 to be the strongest
>>>program ahead of Deep Fritz.
>>
>>How did you arrive at those conclusions?
>
>What do you mean.... Just noting that SSDF now shows ChessTiger 14 to be the
>strongest program. That agrees with my testing, and what others have been
>saying. I thought it was clear from the post how I arrived at my conclusions.

 THE SSDF RATING LIST 2001-12-27   83119 games played by  229 computers
                                           Rating   +     -  Games   Won  Oppo
                                           ------  ---   --- -----   ---  ----
   1 Chess Tiger 14.0 CB 256MB Athlon 1200   2715   38   -36   378   66%  2600
   2 Deep Fritz 256MB Athlon 1200 MHz        2711   37   -35   390   63%  2618
   3 Gambit Tiger 2.0  256MB Athlon 1200     2696   40   -39   319   61%  2617
   4 Junior 7.0  256MB  Athlon 1200 MHz      2681   37   -36   377   59%  2619
   5 Shredder 5.32  256MB Athlon 1200 MHz    2664   34   -33   438   57%  2611
   6 Deep Fritz  128MB K6-2 450 MHz          2658   26   -25   773   64%  2558
   7 Gandalf 4.32h  256MB Athlon 1200 MHz    2647   35   -34   406   54%  2619

Notice the + and - columns after the rating column.  What these columns mean is
that the actual rating lies between rating + (amount) and rating - (amount) to
within a certain level of confidence (I am not sure whether it is 67% or 95%
here; probably 67% if it is one standard deviation).

Anyway, what it means is that Chess Tiger 14.0 CB is probably from:
2715 - 36 to 2715 + 38 = (2679, 2753) Elo in strength.  It isn't a point -- it's
a range.

And it means that Deep Fritz is probably from:
2711 - 35 to 2711 + 37 = (2676, 2748) Elo in strength.

As you can see, the top of Deep Fritz's range (2748) is well above the bottom of
Chess Tiger's range (2679), so the two ranges overlap and Deep Fritz could very
easily be the stronger program.  In fact, with the amount of data accumulated
and with the very high degree of similarity between the strengths of the top
programs, it will take literally millions of games before we have a strong idea
about which one really is strongest.
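
For anyone who wants to check the arithmetic, here is a minimal Python sketch of
the overlap test described above.  The names (Entry, interval, overlaps) are made
up for illustration and are not anything SSDF publishes; the numbers are copied
from the list.  Note that overlapping ranges do not prove the programs are equal,
only that the list cannot separate them.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    """One row of the rating list (hypothetical helper type)."""
    name: str
    rating: int
    plus: int   # the "+" column
    minus: int  # the "-" column, stored as a positive magnitude

    def interval(self) -> Tuple[int, int]:
        """Return (low, high) bounds of the rating range."""
        return (self.rating - self.minus, self.rating + self.plus)


def overlaps(a: Entry, b: Entry) -> bool:
    """True when the two rating ranges share at least one point."""
    low_a, high_a = a.interval()
    low_b, high_b = b.interval()
    return low_a <= high_b and low_b <= high_a


# Values taken from the 2001-12-27 list quoted above.
tiger = Entry("Chess Tiger 14.0 CB", 2715, 38, 36)
fritz = Entry("Deep Fritz", 2711, 37, 35)
gambit = Entry("Gambit Tiger 2.0", 2696, 40, 39)

for entry in (tiger, fritz, gambit):
    low, high = entry.interval()
    print(f"{entry.name}: {low} to {high}")

print("Tiger 14 vs Deep Fritz overlap:", overlaps(tiger, fritz))
print("Tiger 14 vs Gambit Tiger overlap:", overlaps(tiger, gambit))
```

All three ranges overlap one another, which is the point: on these numbers the
ordering of the top three could easily be different.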

>I also thought I had been clear in my postings that I thought ChessTiger 14 to
>be the strongest program, regardless of what SSDF was showing. If you remember,
>we thought ChessTiger 14 would top the last list, but it did not happen till
>now.

Keep in mind that your extrapolations might be spot on.  The only point I wanted
to make is that the current list does not give evidence to support that.  It
shows that the top programs are of the same strength, as far as the data can tell.

>>The SSDF data certainly does not support it.  The error bars are far in excess
>>of any difference in strength estimation.
>>
>>Not to say that the conclusions are not correct.  Only that the SSDF data does
>>not show this.  If anything, it shows that the top 3 programs are of the same
>>strength.



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.