Computer Chess Club Archives


Subject: Re: SSDF Rating List 2006-01-03

Author: Dann Corbit

Date: 15:36:09 01/05/06


On January 05, 2006 at 07:05:19, Joseph Ciarrochi wrote:

>Thanks for your comments.
>
>I should note that I teach statistics at the university level and certainly
>understand what you are saying.
>
>An additional interesting question is this: Is the difference between the CEGT
>Fritz and Fruit ratings significantly different from the difference between the
>SSDF Fritz and Fruit ratings? The CEGT has huge numbers and it is just possible
>that the difference between differences is significant.
>
>I guess my main question was, are there any differences in testing conditions
>that might lead to a significant difference in relative ranks?

Here are the CEGT conditions:
================================================================================
Conditions

Time control and hash:

CEGT games are medium time control games (40/40 repeated) and blitz time control
games (40/4 repeated). 40/40 means 40 moves in 40 minutes, another 40 minutes
for moves 41 to 80, and so on. CEGT Blitz 40/4 means 40 moves in 4 minutes,
repeated. Because testers use different hardware, we agreed to scale the time
controls to a 2 GHz Pentium CPU. Some examples: on an Athlon64 3500+ this comes
down to 40 moves in 18 minutes. A tester with a Pentium 800 MHz has to give a
full two hours for every 40 moves.
Hash given is usually 256 MB for each engine. The few testers who have less RAM
available are allowed to give 128 MB.
Deep versions: Deep Shredder 9, Deep Fritz 8, Deep Junior 9 and others are
tested on dual machines using 2 CPUs and 512 MB hash. There is an exception for
Junior 9.003, which uses only 256 MB because bugs seem to occur when it is given
512 MB.

Books:

In the first months of CEGT all Nunn Suite 1 and 2 positions were used and also
many from Noomen Select. Currently we use mainly books like 8move.ctg,
remis.ctg, Perfect books, Powerbooks, Master Elect and Arena books mainly by
Harry Schnapp. We have now started to use, to a greater extent, a test suite
with 220 positions by Harry Schnapp. Thanks to Harry for this one!

Tablebases:

Most Testers use 5 men EGTB. Some use only 4 men. Testers using 5 men give 32 MB
EGTB hash. Testers using 4 men give 16 MB EGTB hash.

GUIs:

All testers use one or more different GUIs. The most used are Shredder 9 GUI,
Arena and Shredder Classic GUI. Chess Partner GUI and Winboard can also be used.
Buggy GUIs such as Fritz 9 and Fritz 8 with server update, and known buggy
UCI.dlls, are not used.

Adjudications:
Testers and GUIs are allowed to adjudicate totally won or drawn games.
Benching:
To adapt the different hardware of CEGT testers to a standard (currently 40/40
and 40/4 on the P4 2 GHz reference machine from Uschi), we run a benchmark with
a Bryan Hofmann Crafty compile. Bryan Hofmann and Johan Havegheer also
calculated the corresponding table of time controls to use with different CPUs.
Charles sends both (compile and table) out to new testers. Just put the exe in
an empty folder (do not include a Crafty.rc or a book) and make sure that only
the necessary tasks are running in the background; it is best to reboot
beforehand. Then click the exe and type bench at the command line. Wait for
around 40 to 120 seconds. A logfile will be created in the folder. Among other
values at the bottom you will find the seconds needed, which you can look up in
the table to get the time control to use with your CPU. This is a repeated time
control! In ChessBase GUIs, for example, you can enter 0 (zero) for all values
of the second and third time controls, so that the first time control, for
example 40/24, is always repeated.
For Blitz 40/4 just divide by ten.

Table (time control in minutes/moves, followed by Crafty bench time in seconds):
70/40 182-186
68/40 177-181
66/40 172-176
64/40 167-171
62/40 162-166
60/40 157-161
58/40 152-156
56/40 147-151
54/40 142-146
52/40 137-141
50/40 132-136
48/40 127-131
46/40 122-126
44/40 117-121
42/40 112-116
40/40 107-111
38/40 102-106
36/40 97-101
34/40 92-96
32/40 87-91
30/40 82-86
28/40 77-81
26/40 72-76
24/40 67-71
22/40 62-66
20/40 57-61
18/40 52-56
16/40 47-51
14/40 42-46
12/40 35-41
10/40 32-36
8/40 27-31
6/40 22-26
4/40 17-21
2/40 11-16
0/40 6-11
================================================================================
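
As a rough illustration, the lookup that the CEGT benching notes describe could
be sketched in Python like this. It is only a sketch: the names CEGT_TABLE,
minutes_for_bench and blitz_minutes are made up for this example and are not
part of any CEGT tooling, and rounding the bench time to whole seconds is an
assumption on my part. The rows are transcribed from the table as given,
including the few ranges that overlap.

```python
# Rows transcribed from the CEGT table:
# (minutes per 40 moves, bench seconds low, bench seconds high).
CEGT_TABLE = [
    (70, 182, 186), (68, 177, 181), (66, 172, 176), (64, 167, 171),
    (62, 162, 166), (60, 157, 161), (58, 152, 156), (56, 147, 151),
    (54, 142, 146), (52, 137, 141), (50, 132, 136), (48, 127, 131),
    (46, 122, 126), (44, 117, 121), (42, 112, 116), (40, 107, 111),
    (38, 102, 106), (36, 97, 101), (34, 92, 96), (32, 87, 91),
    (30, 82, 86), (28, 77, 81), (26, 72, 76), (24, 67, 71),
    (22, 62, 66), (20, 57, 61), (18, 52, 56), (16, 47, 51),
    (14, 42, 46), (12, 35, 41), (10, 32, 36), (8, 27, 31),
    (6, 22, 26), (4, 17, 21), (2, 11, 16), (0, 6, 11),
]


def minutes_for_bench(bench_seconds):
    """Return the minutes per 40 moves for a Crafty bench time, or None.

    Assumption: the bench time is rounded to whole seconds before lookup.
    Where two ranges in the table overlap, the first match from the top
    (the longer time control) is returned.
    """
    seconds = round(bench_seconds)
    for minutes, low, high in CEGT_TABLE:
        if low <= seconds <= high:
            return minutes
    return None  # bench time falls outside the table


def blitz_minutes(minutes):
    """Blitz 40/4 control: 'just divide by ten'."""
    return minutes / 10


if __name__ == "__main__":
    for bench in (109, 54, 500):
        print(bench, "->", minutes_for_bench(bench))
```

A bench time of 109 seconds lands in the 40/40 row, which is the reference
setting, and a time in the 52-56 range gives the 18 minutes quoted for the
Athlon64 3500+.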

The SSDF does 40 moves in 2 hours, repeating. The time control does not change
from one set of hardware to the next, so they use standardized hardware, hash,
etc.

The SSDF has ponder on, so the engine is thinking even when the opponent is
thinking (that is to say, each chess engine gets its own dedicated CPU that is
allowed to think all the time).

The SSDF uses the standard features of the engine such as the engine's own
opening books, endgame tablebase files, GUI, etc.

The games are generally played using the serial port and AUTO232 interface.  If
such an interface is not available, then the games are played MANUALLY (e.g.
with the dedicated units or handheld units the moves are made by hand by a
person operating both systems).

Perhaps Tony Hedlund or Thoralf Karlsson or another of the SSDF people can add
more details about the conditions.
================================================================================

At any rate, the conditions are exceedingly different in terms of hardware
calibration, time control, the data resources that the engines have access to,
etc.

The lists answer different questions.  The fact that the lists tend to agree
almost 100% (according to a graph recently posted to CCC) is something of a
small surprise to me.  I would have thought (for instance) that the opening
books would have made a bigger difference.




Last modified: Thu, 15 Apr 21 08:11:13 -0700
