Author: Dann Corbit
Date: 15:36:09 01/05/06
On January 05, 2006 at 07:05:19, Joseph Ciarrochi wrote:

>Thanks for your comments.
>
>I should note that I teach statistics at the university level and certainly
>understand what you are saying.
>
>An additional interesting question is this: Is the difference between the CEGT
>Fritz and Fruit ratings significantly different from the difference between the
>SSDF Fritz and Fruit ratings? The CEGT has huge numbers and it is just possible
>that the difference between differences is significant.
>
>I guess my main question was, are there any differences in testing conditions
>that might lead to a significant difference in relative ranks?

Here are the CEGT conditions:

================================================================================
Conditions

Time control and hash:
CEGT games are medium time control games at 40/40 repeated and blitz time control games at 40/4 repeated. 40/40 means 40 moves in 40 minutes, then another 40 minutes for moves 41 to 80, and so on. For CEGT Blitz, 40/4 means 40 moves in 4 minutes, repeated. Given the testers' different hardware, we agreed to adapt to a 2 GHz Pentium CPU. Some examples: on an Athlon64 3500+ this comes down to 40 moves in 18 minutes. A tester with an 800 MHz Pentium has to give a full two hours for every 40 moves. Hash is usually 256 MB for each engine. The very few testers who have less RAM available are allowed to give 128 MB.

Deep versions:
Deep Shredder 9, Deep Fritz 8, Deep Junior 9 and others are tested on dual machines using 2 CPUs and 512 MB hash. There is an exception for Junior 9.003, which uses only 256 MB because bugs seem to occur when it is given 512 MB.

Books:
In the first months of CEGT, all Nunn Suite 1 and 2 positions were used, along with many from Noomen Select. Currently we mainly use books like 8move.ctg, remis.ctg, Perfect books, Powerbooks, Master Elect and Arena books, mainly by Harry Schnapp.
We have now started to make greater use of a test suite of 220 positions by Harry Schnapp. Thanks to Harry for this one!

Tablebases:
Most testers use 5-men EGTB. Some use only 4-men. Testers using 5-men give 32 MB EGTB hash. Testers using 4-men give 16 MB EGTB hash.

GUIs:
All testers use one or more different GUIs. The most used are the Shredder 9 GUI, Arena and the Shredder Classic GUI. Chess Partner GUI and Winboard can also be used. Buggy GUIs such as Fritz 9 and Fritz 8 with the server update, and known buggy UCI.dlls, are not used.

Adjudications:
Testers and GUIs are allowed to adjudicate totally won or drawn games.

Benching:
To adapt the different hardware of CEGT testers to a standard (currently 40/40 and 40/4 on the P4 2 GHz reference machine from Uschi), we perform a benchmark with a Bryan Hofmann Crafty compile. Bryan Hofmann and Johan Havegheer also calculated the corresponding table of time controls to use with different CPUs. Charles sends both (compile and table) out to new testers. Just put the exe in an empty folder (do not include a Crafty.rc or a book) and make sure that only the necessary tasks are running in the background; it is best to reboot beforehand. Then click the exe and type in bench at the command line. Wait for around 40 to 120 seconds. A logfile will be created in the folder. Among other values, at the bottom you will find the seconds needed, and you can look up the time you have to give with your CPU in order to adapt. This is a repeated time control! In ChessBase GUIs, for example, you can enter 0 (zero) for all values of the second and third time controls, so that the first time control, for example 40/24, is always repeated. For Blitz 40/4 just divide by ten.
table

Time control   Bench seconds
70/40          182-186
68/40          177-181
66/40          172-176
64/40          167-171
62/40          162-166
60/40          157-161
58/40          152-156
56/40          147-151
54/40          142-146
52/40          137-141
50/40          132-136
48/40          127-131
46/40          122-126
44/40          117-121
42/40          112-116
40/40          107-111
38/40          102-106
36/40          97-101
34/40          92-96
32/40          87-91
30/40          82-86
28/40          77-81
26/40          72-76
24/40          67-71
22/40          62-66
20/40          57-61
18/40          52-56
16/40          47-51
14/40          42-46
12/40          35-41
10/40          32-36
8/40           27-31
6/40           22-26
4/40           17-21
2/40           11-16
0/40           6-11
================================================================================

The SSDF plays 40 moves in 2 hours, repeating. The time control does not change from one set of hardware to the next, so they use standardized hardware, hash, etc.

The SSDF has ponder on, so the engine is thinking even when the opponent is thinking (that is to say, each chess engine gets its own dedicated CPU that is allowed to think all the time).

The SSDF uses the standard features of the engine, such as the engine's own opening books, endgame tablebase files, GUI, etc.

The games are generally played using the serial port and the AUTO232 interface. If such an interface is not available, then the games are played MANUALLY (e.g. with the dedicated units or handheld units, the moves are made by hand by a person operating both systems).

Perhaps Tony Hedlund or Thoralf Karlsson or another of the SSDF people can add more details about the conditions.
================================================================================

At any rate, the conditions are exceedingly different in terms of hardware calibration, time control, and the data resources, etc. that the engines have access to.

The lists answer different questions. The fact that the lists tend to agree almost 100% (according to a graph recently posted to CCC) is something of a small surprise to me. I would have thought (for instance) that the opening books would have made a bigger difference.
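For anyone who wants to automate the CEGT lookup quoted above, here is a rough Python sketch that maps a Crafty bench time to a time control. It is only an illustration, not anything CEGT distributes. The names CEGT_TABLE, minutes_per_40_moves and blitz_minutes_per_40_moves are made up for this example. I am assuming that the left column of the table is minutes per 40 moves, that the bench time is rounded to whole seconds, and that where two quoted ranges overlap the longer time control is chosen.

```python
# Rows copied from the quoted CEGT table, top to bottom:
# (minutes per 40 moves, bench seconds low, bench seconds high).
# Reading the left column as "minutes per 40 moves" is an assumption.
CEGT_TABLE = [
    (70, 182, 186), (68, 177, 181), (66, 172, 176), (64, 167, 171),
    (62, 162, 166), (60, 157, 161), (58, 152, 156), (56, 147, 151),
    (54, 142, 146), (52, 137, 141), (50, 132, 136), (48, 127, 131),
    (46, 122, 126), (44, 117, 121), (42, 112, 116), (40, 107, 111),
    (38, 102, 106), (36, 97, 101), (34, 92, 96), (32, 87, 91),
    (30, 82, 86), (28, 77, 81), (26, 72, 76), (24, 67, 71),
    (22, 62, 66), (20, 57, 61), (18, 52, 56), (16, 47, 51),
    (14, 42, 46), (12, 35, 41), (10, 32, 36), (8, 27, 31),
    (6, 22, 26), (4, 17, 21), (2, 11, 16), (0, 6, 11),
]


def minutes_per_40_moves(bench_seconds):
    """Return the minutes to give per 40 moves, or None if off the table."""
    seconds = round(bench_seconds)  # assumption: whole seconds are compared
    # Scan from the top; a few quoted ranges overlap, so the first
    # (longer) time control wins in those cases. That is also an assumption.
    for minutes, low, high in CEGT_TABLE:
        if low <= seconds <= high:
            return minutes
    return None


def blitz_minutes_per_40_moves(bench_seconds):
    """Blitz 40/4 equivalent: the medium time control divided by ten."""
    minutes = minutes_per_40_moves(bench_seconds)
    return None if minutes is None else minutes / 10


if __name__ == "__main__":
    for bench in (109, 54, 300):
        print(bench, minutes_per_40_moves(bench), blitz_minutes_per_40_moves(bench))
```

A bench time in the 107-111 second range comes back as 40 minutes per 40 moves, which matches the reference machine in the quoted text. Bench times outside the table (such as the 800 MHz Pentium case that needs two hours) return None, since the quoted table does not cover them.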
Last modified: Thu, 15 Apr 21 08:11:13 -0700