Computer Chess Club Archives

Subject: Re: Chess program similarity experiment (Results)

Author: Robert Hyatt
Date: 19:37:33 01/29/99
On January 29, 1999 at 10:45:51, Albrecht Heeffer wrote:

>On January 29, 1999 at 09:23:15, G.Mueller wrote:
>
>>On January 29, 1999 at 08:28:26, Robert Hyatt wrote:
>>
>>>On January 29, 1999 at 02:31:21, Albrecht Heeffer wrote:
>>>
>>>>On January 28, 1999 at 15:56:40, Robert Hyatt wrote:
>>>>
>>>>>note that some of the positions end up in tablebases.  I don't know what they
>>>>>used, but I used none, because of the format change from 15.20 to 16.0.  As a
>>>>>result, I ran with no tablebase hits possible...
>>>>
>>>>We did use the old tablebases (72). In the log files you can see the probes.
>>>>In fact we probably saved the Kallisto game from a draw by the tablebases.
>>>>The endgame was winning but Bionic did not play the best moves until
>>>>suddenly:
>>>>
>>>>               13     6.31  fhigh   Kd6!!
>>>>               13->   7.20   6.62   Kd6
>>>>               14     7.58  fhigh   Kd6!!
>>>>               14->  13.08   7.01   Kd6
>>>>               15    13.26  fhigh   Kd6!!
>>>>               15    17.08  Mat28   Kd6 Kg7 Ke6 f5 Bf3 Kh6 Kf6 Kh7 Bc6
>>>>                                    Kh6 Bd7 <HT>
>>>>              time=23.65  cpu=196%  mat=2  n=9446180  fh=98%  nps=399415
>>>>              ext-> checks=236796 recaps=13251 pawns=234693 1rep=184173
>>>>              predicted=37  nodes=9446180  evals=3370257
>>>>              endgame tablebase-> probes done=137165  successful=137159
>>>>              hashing-> trans/ref=111%  pawn=99%  used=w29% b37%
>>>>              SMP->  split=2503  stop=80  data=8/64  cpu=46.52  elap=23.65
>>>>
>>>>mate in 28 moves.
>>>>
>>>>>If someone has a PII/400 they can run on, I can run crafty on one processor at
>>>>>1 minute per position, and they could run bionic the same way.  A PII/400 and
>>>>>my xeon/400 are close to the same speed.  that would be the best comparison
>>>>>unless one person wants to run both bionic _and_ wcrafty_15.20.exe (from my
>>>>>ftp site) on the same box, which would be even better...
>>>>>
>>>>>I'd much prefer that to eliminate all variables.  the SMP code produces some
>>>>>odd results at times, and my linux version is 15% slower than the corresponding
>>>>>windows executable due to MSVC being a better compiler.  All in all, too many
>>>>>different things...
>>>>
>>>>The results so far do not seem to indicate that Crafty 15.20 plays all
>>>>the same moves as the Bionic games in the Dutch Open. I'm still wondering
>>>>how you were able to reproduce all the right moves in three games with
>>>>Crafty 16.1 on your hardware. Did you use SMP then?
>>>>
>>>>Albrecht Heeffer
>>>
>>>
>>>No.  What I did was search for 10 minutes per position, and if the
>>>programs matched after a minute or more, I counted it as a match.  I
>>>then looked at the very few where they didn't match.  In a couple of
>>>cases there were simple transpositions that evaluated to the same score
>>>and pre-processing could affect that by changing the root move order.
>>>If crafty's move and bionic's move had the same score, and roughly the
>>>same PV I counted those as matches.  Finally, in a couple of cases it
>>>was obvious that they had the same idea, just different ways to reach
>>>that...  The pv's were similar but in different order (notably in a
>>>couple of endgame positions where there were no real tactics to consider.)
>>>
>>>But notice that 'bionic' is not matching 'bionic-tournament' very well,
>>>when you think about it.  The web site version matched 77% of the moves,
>>>while version 15.20 matched 74%.  Everyone else is well back from that.
>>>And I was searching faster than the bionic in the tournament, and the
>>>one tested here, which is why I suggested someone run 15.20 and bionic
>>>on the _same_ hardware and do this test.  That would be better...
>>Hello Bob!
>>
>>You are right. I notice the same on my single Xeon overclocked to 504 MHz with
>>1MB of level-2 cache: the downloadable bionic is never the same version that
>>played in the Dutch Championship. It matches about 80%, Crafty 16.1 about 75%;
>>I did not test with TB. A download of the original Dutch version would be very
>>nice to answer all these difficult questions.
>>In dual use (I test on a quad Xeon with mt=2) Bionic seems evidently as fast as
>>Crafty 16.1; on a single CPU it is slower. I know a quad machine is not really
>>good for testing a dual machine, but bionic is compiled for two CPUs.
>>
>>Best wishes to you Bob!
>>G.Mueller
>
>After the tournament I burned a CD-R with the complete directory image
>of all sources, executables and log files. The version on the website
>'bionic41.exe' is ftp'd directly from the CD onto the web site. It _is_
>the same version as used during both weekends of the tournament, I
>can assure you. The SMP code must be very indeterministic if we can't
>reproduce the moves somehow.
>
>Albrecht Heeffer

You are going to have the following problems and there is _nothing_ that can
be done:

Some moves are going to be different because of transpositions.  Same score,
but different move first in the PV.  I catch these by 'hand' when I go thru
the comparison, since I know to expect this problem.
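
To illustrate the kind of case I mean, here is a minimal Python sketch of a
transposition check.  It is only an illustration, not the procedure I actually
use (I do this by hand): the function name, the centipawn scores, and the
four-ply window are all assumptions made up for the example.

```python
def is_transposition_match(score_a, pv_a, score_b, pv_b, plies=4):
    """Heuristic sketch (hypothetical helper, not part of Crafty).

    Two results count as a transposition match when the scores are equal
    and the first `plies` plies of each PV contain the same moves for each
    side, possibly in a different order.
    Scores are integers in centipawns; PVs are lists of move strings.
    """
    if score_a != score_b:
        return False
    head_a, head_b = pv_a[:plies], pv_b[:plies]
    if len(head_a) != len(head_b):
        return False
    # Even indices are the side to move, odd indices are the opponent.
    same_own_moves = sorted(head_a[0::2]) == sorted(head_b[0::2])
    same_reply_moves = sorted(head_a[1::2]) == sorted(head_b[1::2])
    return same_own_moves and same_reply_moves


# Same score, same moves, different order at the root: treated as a match.
print(is_transposition_match(31, ["Nf3", "d5", "g3", "Nf6"],
                             31, ["g3", "d5", "Nf3", "Nf6"]))
```

A check like this only flags candidates.  Two lines with the same moves in a
different order are not always a true transposition, so the flagged positions
still need a look by hand.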

Parallel speedup is a non-deterministic problem.  That is why when Vincent
originally asked me to check, I ran the test positions for 10 minutes which
I figured was longer than your searches by a good bit (I ran tests on my quad
P6 at the time).  Running longer gives a good chance to offset a bad parallel
search that might slow it down excessively on some random occasions.  This
test approach eliminates that.

A very few moves are _never_ going to be reproduced.  I'd expect maybe one such
move every 3-4 games...  maybe a little less frequently. But I had this problem
in Cray Blitz, and I see it in Crafty.  No solution...

But it would be a reasonable comparison to take the two programs on _one_ cpu to
get rid of the non-determinism, and run them both on the same input for up to 2
minutes or so, and then compare.  If they match (one finds X at 1 minute, the
other at 1:20) we'd say 'same'.  The extra minute allows for slight timing
differences in the two while holding depth to a 'similar' level...
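
As a rough sketch of that comparison rule in Python: assume each program's log
has been reduced to a chronological list of (seconds, best_move) pairs, one
entry each time the best move is reported.  That log format, the function
names, and the 120-second default are assumptions for the example, not
anything the programs actually emit.

```python
def settled_move(log, limit=120.0):
    """Return (move, since) for the best move held at `limit` seconds.

    `log` is a chronological list of (seconds, best_move) pairs
    (a hypothetical format).  `since` is the time at which the program
    last switched to that move.  Returns (None, None) for an empty log.
    """
    move, since = None, None
    for seconds, best in log:
        if seconds > limit:
            break  # ignore anything found after the time limit
        if best != move:
            move, since = best, seconds
    return move, since


def same_choice(log_a, log_b, limit=120.0):
    """True if both programs hold the same best move at the time limit."""
    move_a, _ = settled_move(log_a, limit)
    move_b, _ = settled_move(log_b, limit)
    return move_a is not None and move_a == move_b


# One program finds Kd6 at 1:00, the other at 1:20: counted as 'same'.
log_a = [(5.0, "Kd5"), (60.0, "Kd6")]
log_b = [(4.0, "Kd5"), (80.0, "Kd6")]
print(same_choice(log_a, log_b))
```

Comparing at a fixed time limit instead of a fixed depth is what lets one
program be a little slower than the other without counting as a mismatch.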
Last modified: Thu, 15 Apr 21 08:11:13 -0700