Subject: Re: OK, we can make a test ...

Author: Dieter Buerssner

Date: 18:06:00 08/01/01

On July 31, 2001 at 23:19:48, Robert Hyatt wrote:

>On July 31, 2001 at 21:38:58, Dieter Buerssner wrote:
>1.  ponder=off is a poor way to play games.

I disagree. I think it is the best thing that can be done when only one
computer is available for engine matches. I even think that results obtained
this way will be very similar to results obtained by using two computers. At
least, I have not yet seen any evidence that shows this is not the case.

>2. ponder=on makes more sense to me on a single machine, and I test like this
>all the time.

Not for me. It is too fragile, in my opinion. One has to check CPU usage very
carefully. On my computer, it often seems impossible to get a fair distribution
of computer resources. Sure, to take our engines as an example: under Winboard
and Xboard, I get fair usage of the resources in Crafty-Yace games. But this is
already not the case when I test Crafty (CB-native version) vs. Yace under the
Fritz 6 GUI. Nor does it hold for every pair of WB engines under Winboard. I
think I am able to detect the cases where something goes wrong. I believe this
is not true of many computer chess enthusiasts who have fun playing engine
matches on one computer. So I suggest using ponder off for this. Much less can
go wrong. Sure, if you show me evidence that ponder off hurts some engines
significantly more than others, I will reconsider.
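
To make "checking CPU usage" concrete, here is a minimal sketch of the kind of
check I mean. It assumes you have already read the accumulated CPU seconds of
both engine processes from your task manager or from top. The function names
and the 5% tolerance are my own assumptions for illustration, not part of any
GUI or engine.

```python
def cpu_shares(cpu_seconds_a: float, cpu_seconds_b: float) -> tuple[float, float]:
    """Return each engine's share of the CPU time the two engines used together."""
    total = cpu_seconds_a + cpu_seconds_b
    if cpu_seconds_a < 0 or cpu_seconds_b < 0 or total <= 0:
        raise ValueError("CPU times must be non-negative and not both zero")
    return cpu_seconds_a / total, cpu_seconds_b / total


def is_fair(cpu_seconds_a: float, cpu_seconds_b: float, tolerance: float = 0.05) -> bool:
    """True if neither engine deviates from a 50% share by more than `tolerance`.

    The default tolerance of 0.05 is an arbitrary assumption; pick your own.
    """
    share_a, _ = cpu_shares(cpu_seconds_a, cpu_seconds_b)
    return abs(share_a - 0.5) <= tolerance


if __name__ == "__main__":
    # Hypothetical readings after a ponder=on game, in CPU seconds.
    print(is_fair(100.0, 98.0))   # nearly equal shares
    print(is_fair(140.0, 60.0))   # one engine got 70% of the CPU
```

With ponder off, this check is not needed at all, because only one engine is
computing at any time.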

The other argument for ponder off is that computer resources are used more
effectively. Even with a ponder-hit rate of 80%, much of the other 20% is
wasted. Sure, the hash tables get filled. But that is certainly less effective
than always doing a real search for the right move.
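
A rough back-of-the-envelope model of this argument, as a sketch only. It
assumes one CPU that is split exactly 50/50 with ponder on, that both engines
use the same clock time per move, and that search done on a ponder miss is
worth nothing (in reality the filled hash tables recover some of it). The
function name is hypothetical.

```python
def useful_search_ponder_on(hit_rate: float) -> float:
    """Useful search per move with ponder on, relative to ponder off (= 1.0).

    Model assumptions (mine, for illustration):
      - one CPU, each engine gets exactly half of it all the time;
      - both sides spend the same clock time per move;
      - search on a ponder miss is counted as fully wasted.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    own_move = 0.5               # half speed while our own clock runs
    pondering = 0.5 * hit_rate   # half speed on the opponent's clock, useful only on a hit
    return own_move + pondering


if __name__ == "__main__":
    for rate in (0.5, 0.8, 1.0):
        print(f"hit rate {rate:.0%}: {useful_search_ponder_on(rate):.2f} of ponder-off search")
```

Under these assumptions an 80% hit rate gives about 0.9 of the useful search
that ponder off gives for the same clock time, and only a 100% hit rate draws
level.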

>Both engines are 100% compute-bound, which means each machine
>gets 1/2 the total processor time.  It is like using two machines, with each
>being 1/2 as fast as the actual machine.

In my experience, this is not always the case.

>Either way, you take a chance on producing results that are not comparable to
>the results produced on two separate machines.

You think they are not comparable. All the data I have seen (especially
Volker's) seems to indicate that they are comparable.

>>With the same right, the "one computer may get sensible results" fraction can
>>say, that you should show evidence, that engine matches on one computer, with
>>ponder off will show no reasonable results. So, they can ask you, to show an
>>example, where it really makes a difference. How to solve this question?
>I won't take the time to find an example, any more than I will take the time
>to carefully tune crafty's time allocation for ponder=off matches.  It is time
>that is wasted, and I don't have a lot of time to use in general....

Sure, I can respect this. But why are you so convinced that results are not
comparable, without having any data to support your claim?

>Two machines has _none_ of the above issues.

I totally agree. However, I don't have two machines (OK, I have: one is a
486-66MHz that just runs plain DOS and really is of no use here). I would
guess that most readers/members here are in a similar situation and don't have
two comparable computers.

>Why do you think all the commercial authors have machines
>connected doing auto play all day long?  Rather than using each machine to run
>both programs at the same time with either ponder=off or on?

Sure, if I had the computing resources of the commercial developers, I would run
tests on different machines for each opponent. I don't have the choice. So I run
on one machine, and I think I still get valuable results.

>Have you never modified the time allocation code after watching it get into a
>bit of time trouble with ponder=on?  Or after seeing it reach a time control
>with too much time left on the clock?

Actually, I don't think I have.

>How do you play games on the servers?  On or off?

I don't play on servers personally. Sure, the people who test Yace on servers
use ponder on.

>How do you play test games to observe the program?  On or off?

Off. Very rarely, I play 2 games under some GUIs with ponder on, just to see
whether basic functionality is working before I make a new release public.

My primary testing mode is ponder off on a computer with one CPU.

>I will bet the answers are yes, on and on... respectively.

1+ 1= 1-. I fear, you won't win the bet :-)


