Computer Chess Club Archives

Subject: Re: Crafty 17-10 v Fritz 6a Nunn 1 @ 120'/40 + 60'/20 + 30'

Author: Robert Hyatt
Date: 05:42:02 04/28/00
On April 27, 2000 at 22:29:15, Mark Young wrote:

>On April 27, 2000 at 11:48:24, Enrique Irazoqui wrote:
>
>>On April 27, 2000 at 02:16:48, Chessfun wrote:
>>
>>>
>>>All games on one Cel 433.
>>>Ponder=off.
>>>Tablebases used at 25 and longer.
>>>Nunn 1 positions.
>>>No opening books are loaded.
>>>
>>> 1 min game    Fritz 6a 14.5 - 5.5 Crafty 17-10
>>> 2 min game    Fritz 6a 14.5 - 5.5 Crafty 17-10
>>> 3 min game    Fritz 6a 13.0 - 7.0 Crafty 17-10
>>> 5 min game    Fritz 6a 15.5 - 4.5 Crafty 17-10
>>>10 min game    Fritz 6a 13.0 - 7.0 Crafty 17-10
>>>25 min game    Fritz 6a 13.0 - 7.0 Crafty 17-10
>>>60 min game    Fritz 6a 15.5 - 4.5 Crafty 17-10
>>>Tourney times  Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>
>>>Once I figure out how to autoplay from the Nunn 1 opening
>>>positions I will play all games with ponder=on.
>>>
>>>Thanks.
>>
>>I find what you are doing very interesting, not so much concerning the relative
>>strength of Fritz and Crafty, which a series of matches between them can't
>>determine anyway, but regarding testing procedures.
>>
>>I ran 2 similar matches myself between Crafty 17.10 and Fritz 6a in the GUI of
>>Fritz 6a, with the General book of F6 (Alex Kure), tablebases on (2.6 GB of the
>>Turbo CDs) and 4MB hash for each engine. The first match was played on 2 P600E,
>>Game/1 minute, ponder on. The second, on a single P600E, ponder on and game/2
>>minutes, since each program is using 50% of the CPU.
>>
>>In the first match, Fritz 6a won 15-5, which is consistent with your 14.5-5.5
>>and seems to indicate that ponder off doesn't hurt Crafty more than Fritz.
>>
>>In the second match, Fritz 6a won also 15-5, which is consistent too, even on
>>one machine and ponder on.
>>
>>Enrique
>
>It has been clear for some time that Crafty is not hurt more than other programs
>when playing with ponder off. I know Bob is an expert, but data has never shown
>him to be correct in this matter, and I will always take data over opinion.


I didn't give opinion.  I gave _fact_.  It was fact discovered in several games
sent to me at least a year ago with the question "why did this happen?"  I
looked at the logs that were sent, and found that Crafty was getting into time
trouble on many occasions.  (The games were not ICC games; they were played at
traditional time controls, i.e. 40/120, 20/60.)  I didn't initially notice that
pondering was off, but when I skimmed through the logs the first thing I found
was that at some point the eval would plummet.  It generally happened right
around the 40-move point, or the 60-move point, etc.  And it happened because
Crafty would have very little time right at the end of a time control, and it
would do some very quick (and shallow) searches and make a mistake.  It didn't
happen in every game, but it happened in enough.

For example, take the current SSDF result between Fritz and Crafty, which was
16-12 last time I looked, and take the two games Crafty lost due to book busts.
If Crafty had been the 'buster' rather than the 'bustee' in those two games, the
match is suddenly dead even.  It doesn't take many 1's becoming 0's to _really_
swing a result.  I know this time trouble happens to Crafty.  It _may_ happen to
Fritz for all I know.  And if so, then it is pretty much a wash.  But against
the program I looked at, it was not happening.  (I believe it was either Hiarcs
or CSTal, but I do not remember for sure.)
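
The arithmetic behind "dead even" can be checked in a couple of lines.  This is
only a sketch: the function name flip_games is made up for illustration, and it
assumes the two book-bust games were decisive losses that would have become
wins.

```python
def flip_games(leader_score, trailer_score, games_flipped):
    """Return the match score after `games_flipped` decisive games change hands.

    Each flipped game takes one point from the leader and gives it to the
    trailer, so the margin moves by two points per game.
    """
    return leader_score - games_flipped, trailer_score + games_flipped


if __name__ == "__main__":
    # The 16-12 SSDF score quoted above, with the two book-bust games reversed.
    print(flip_games(16, 12, 2))
```

A four-point margin disappears after only two reversed games, which is why a
handful of results can swing a short match.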

So what I said is _not_ opinion.  It is demonstrable fact.  Just play Crafty in
a 40/2 game without pondering, but against an opponent that won't get killed,
so that on a few occasions Crafty will 'fail low' and have to search overtime to
find a better move.  Every fail low hurts it, because it burns time it has no
chance of getting back.  Yet the fail-low overtime is tuned with the idea that
it _will_ recover some of that time (by pondering), so that it can be more
aggressive in using more time to avoid what might turn into a quick loss when
it wasn't forced.
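
The effect described above can be illustrated with a toy model.  This is a
sketch under loudly stated assumptions and is _not_ Crafty's actual time
allocation code: the function search_times and every number in it (the boost,
the overtime factor, the moves on which a fail low occurs, the fraction of
search time refunded by pondering) are hypothetical values picked only to show
the shape of the problem.

```python
def search_times(moves=40, clock=120.0, boost=1.3, overtime=3.0,
                 fail_lows=(8, 15, 22, 30), ponder_savings=0.0):
    """Return the minutes searched on each move of one time control.

    All parameters are hypothetical:
      boost          - how much more than an even share the allocator targets,
                       on the assumption that pondering refunds the excess
      overtime       - multiplier applied to the target after a fail low
      fail_lows      - move indices (0-based) on which a fail low occurs
      ponder_savings - fraction of each search done on the opponent's clock
                       (0.0 models ponder=off)
    """
    searched = []
    for move in range(moves):
        moves_left = moves - move
        # Nominal target: an even share of the remaining clock, boosted because
        # the allocator expects to get some of the time back.
        target = clock / moves_left * boost
        if move in fail_lows:
            target *= overtime  # search longer to find a better move
        # Keep a small reserve so the remaining moves can still be played.
        reserve = 0.05 * (moves_left - 1)
        used = min(target, clock - reserve)
        # Only the part not covered by pondering comes off our own clock.
        clock -= used * (1.0 - ponder_savings)
        searched.append(used)
    return searched


if __name__ == "__main__":
    for label, savings in (("ponder on ", 0.25), ("ponder off", 0.0)):
        times = search_times(ponder_savings=savings)
        print(label, "first 5 moves: %.1f min, last 5 moves: %.1f min"
              % (sum(times[:5]), sum(times[-5:])))
```

In this model the ponder=off run has less time for every move after the first,
and the shortfall is largest for the moves just before the time control, which
is where the shallow searches in the logs showed up.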

There is a _big_ difference between opinion and fact.  My comments were based
on facts learned by actually looking at the output of the program, and getting
to the root cause of the problem I found.  Just because you play A vs B with
pondering on, and then A vs B with pondering off, and the match results are the
same, does _not_ mean that neither A nor B is hurt by pondering off.

_That_ is bad science.  Match results vary enough between two engines with no
changes at all.  Drawing conclusions like this is unsound.  It is highly
probable that ponder=off will actually win a game here and there, because
sometimes snap judgement is better than a deep search.  But pick the wrong set
of games and you could conclude that "we ought to play with ponder off all the
time because the program does better."
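
How much a short match can vary with no changes at all can be estimated with a
simple binomial model.  This is a rough sketch: it assumes every game is an
independent win or loss with a fixed win probability and ignores draws (which
would narrow the spread somewhat).  The function names and the 0.7 win
probability are illustrative assumptions, not measured values.

```python
import math


def match_score_stddev(games, win_prob):
    """Standard deviation, in points, of the score over `games` games."""
    return math.sqrt(games * win_prob * (1.0 - win_prob))


def prob_score_at_least(games, win_prob, wins):
    """Probability of scoring at least `wins` wins in `games` games."""
    return sum(
        math.comb(games, k) * win_prob ** k * (1.0 - win_prob) ** (games - k)
        for k in range(wins, games + 1)
    )


if __name__ == "__main__":
    games, win_prob = 20, 0.7  # hypothetical 20-game match, 70% expected score
    print("std dev of match score: %.2f points"
          % match_score_stddev(games, win_prob))
    print("P(16 or more wins): %.3f" % prob_score_at_least(games, win_prob, 16))
    print("P(12 or fewer wins): %.3f"
          % (1.0 - prob_score_at_least(games, win_prob, 13)))
```

Under these assumptions the standard deviation of a 20-game score is about two
points, so two matches that differ by a point or two say very little about
whether a setting such as ponder=off helped or hurt.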

Believe my comments or not.  That is your choice.  But whether you believe them
or not, they are _not_ "opinion".  They are based on facts learned by going over
about 30 log files.
Last modified: Thu, 15 Apr 21 08:11:13 -0700