Computer Chess Club Archives


Subject: Re: ponder_on ponder_off comparision

Author: Robert Hyatt

Date: 15:04:44 07/18/00


On July 18, 2000 at 04:51:41, Mark Young wrote:

>On July 17, 2000 at 16:12:44, Robert Hyatt wrote:
>
>>On July 17, 2000 at 14:50:22, Chessfun wrote:
>>
>>>On July 17, 2000 at 11:18:18, Volker Pittlik wrote:
>>>
>>>>I've played two little tournaments.
>>>>
>>>>Programs: AnMon506, Arasanx_53, Amyf_000420, Bringer16, Comet_B21, Crafty1710,
>>>>EXchess314, Gromit_30, LambChop_710, Phalanx22, SOS_991103, TCB0045
>>>>
>>>>round robin, 20 games each match, time control 60+2
>>>>System: 2*520 MHz Celeron, NT4.0
>>>>Setting: ~32 MB hash each program, learning on, books coming with the program,
>>>>all 4- and some 5-piece TBs
>>>>All games can be found in my games archive:
>>>>http://members.xoom.com/VolkerPi/gamesArchive/archive.htm
>>>>
>>>>Ponder=on
>>>>
>>>>No. Name          Win Draw Loss Unf.  Score Games       %    SOP
>>>>----------------------------------------------------------------
>>>>  1 Crafty1710   +140  =39  -41   *0  159.5   220   72.5% 5820.0
>>>>  2 AnMon506     +120  =36  -64   *0  138.0   220   62.7% 5470.0
>>>>  3 Amyf_000420  +111  =35  -74   *0  128.5   220   58.4% 4855.0
>>>>  4 Gromit_30    +106  =36  -78   *0  124.0   220   56.4% 6285.0
>>>>  5 TCB0045      +107  =34  -79   *0  124.0   220   56.4% 4520.0
>>>>  6 Comet_B21     +98  =42  -80   *0  119.0   220   54.1% 6080.0
>>>>  7 Phalanx22     +93  =38  -89   *0  112.0   220   50.9% 7140.0
>>>>  8 SOS_991103    +92  =36  -92   *0  110.0   220   50.0% 6925.0
>>>>  9 Bringer16     +85  =36  -99   *0  103.0   220   46.8% 5140.0
>>>> 10 EXchess314    +49  =45 -126   *0   71.5   220   32.5% 5140.0
>>>> 11 Arasanx_53    +48  =46 -126   *0   71.0   220   32.3% 3860.0
>>>> 12 LambChop_710  +43  =33 -144   *0   59.5   220   27.0% 4765.0
>>>>Total Games:    1320
>>>>White Wins:      596 (45.2%)
>>>>Black Wins:      496 (37.6%)
>>>>Draws:           228 (17.3%)
>>>>Unfinished:        0 (0.0%)
>>>>
>>>>
>>>>Ponder=off
>>>>
>>>>No. Name          Win Draw Loss Unf.  Score Games       %    SOP
>>>>----------------------------------------------------------------
>>>>  1 Crafty1710   +158  =36  -26   *0  176.0   220   80.0% 6105.0
>>>>  2 Gromit_30    +110  =42  -68   *0  131.0   220   59.5% 6460.0
>>>>  3 Amyf_000420  +106  =46  -68   *0  129.0   220   58.6% 4520.0
>>>>  4 AnMon506     +110  =37  -73   *0  128.5   220   58.4% 5475.0
>>>>  5 TCB0045      +100  =48  -72   *0  124.0   220   56.4% 4365.0
>>>>  6 Comet_B21     +94  =49  -77   *0  118.5   220   53.9% 6275.0
>>>>  7 Phalanx22     +96  =43  -81   *0  117.5   220   53.4% 7275.0
>>>>  8 SOS_991103    +96  =35  -89   *0  113.5   220   51.6% 6910.0
>>>>  9 Bringer16     +64  =43 -113   *0   85.5   220   38.9% 4945.0
>>>> 10 Arasanx_53    +69  =33 -118   *0   85.5   220   38.9% 3460.0
>>>> 11 EXchess314    +42  =30 -148   *0   57.0   220   25.9% 5500.0
>>>> 12 LambChop_710  +36  =36 -148   *0   54.0   220   24.5% 4710.0
>>>>Total Games:    1320
>>>>White Wins:      564 (42.7%)
>>>>Black Wins:      517 (39.2%)
>>>>Draws:           239 (18.1%)
>>>>Unfinished:        0 (0.0%)
>>>>
>>>>Conclusion
>>>>
>>>>The results of both tournaments are very similar. Tournaments with ponder=off
>>>>and ponder=on seem to give comparable results.
>>>>
>>>>Comments appreciated
>>>>
>>>>Volker Pittlik
>>>
>>>
>>>This is truly excellent stuff, Volker!
>>>I note Crafty scored approx. 10% worse with ponder=on.
>>>
>>>Was this a surprise to you?
>>>What did you think prior to running the tests:
>>>that ponder=on is a definite benefit to some programs?
>>>
>>>This naturally could be a result of a few things:
>>>Number of games.
>>>Other programs also being "tuned" for ponder=on.
>>>
>>>But I still think your conclusion is the correct one.
>>>
>>>Thanks.
>>
>>
>>The conclusion is that this is a random result.  If you play crafty vs the
>>commercial programs, which are tactically stronger than most 'freeware'
>>programs, crafty will find itself in more 'trouble'.  And _that_ will make
>>the ponder=off problem I have explained show up.  If it doesn't get into any
>>trouble in most games, this problem won't show up.
>
>Never say die!! I know you are dead wrong here, as I have played the games as
>well as others and the results show you dead wrong... I do however admire your
>force of will on this subject.:)


The problem is that I am _not_ dead wrong.  I know the code I wrote inside-out.
And I _know_ that it has problems allocating time properly when pondering is
disabled.  There is _no_ doubt about that.

What I don't know is how much this same problem exists in other programs, which
means that when you play a match, there is one extra variable in the equation.
In science, that nullifies the experiment.  Did A win because it was better, or
because it was better tuned for the oddball ponder=off match?  With ponder=on,
the better program wins and there is no extra degree of freedom in the match
result...

It is possible that I will win more with ponder=off against some opponents.
It is possible that I will win less with ponder=off against some opponents.
It is possible that I will win and lose the same number of games with ponder=off
against some opponents.  But it is _definite_ that Crafty plays worse with
ponder=off.  No doubt about it at all.   If everyone plays equally worse, then
no harm is done.  But adding the extra degree of freedom is simply adding more
noise to an already complex and unstable system of comparing chess engines.
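
To put a rough number on the noise in results like these, here is a minimal
sketch in Python. It takes Crafty1710's win/draw/loss counts from the two tables
quoted above and estimates the standard error of each score. The function name
score_stats is hypothetical, and the calculation assumes every game is an
independent draw from one fixed distribution. A round robin against eleven
different opponents, all of whose settings changed between the two runs, does
not really satisfy that assumption, so treat the output as a yardstick only.

```python
import math

def score_stats(wins, draws, losses):
    """Return (mean score per game, standard error of that mean).

    Assumes independent, identically distributed games, which is only
    an approximation for a round robin against mixed opponents.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of a result that is worth 1, 0.5 or 0 points.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return mean, math.sqrt(var / n)

# Crafty1710 counts copied from the two tournament tables.
on_mean, on_se = score_stats(140, 39, 41)    # ponder=on
off_mean, off_se = score_stats(158, 36, 26)  # ponder=off

diff = off_mean - on_mean
# Standard error of the difference between two independent estimates.
diff_se = math.sqrt(on_se ** 2 + off_se ** 2)

print(f"ponder=on : {on_mean:.3f} +/- {on_se:.3f}")
print(f"ponder=off: {off_mean:.3f} +/- {off_se:.3f}")
print(f"difference: {diff:.3f} +/- {diff_se:.3f} ({diff / diff_se:.2f} standard errors)")
```

The standard error here only covers game-to-game sampling noise. It says nothing
about the extra variable of how well each opponent handles ponder=off, which is
the part that cannot be separated out of a single match result.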

And that is fact, and not opinion...



Last modified: Thu, 15 Apr 21 08:11:13 -0700
