Computer Chess Club Archives



Subject: Re: Deep Junior's Debut

Author: Vincent Diepeveen

Date: 19:11:04 12/28/99



On December 28, 1999 at 15:47:18, Robert Hyatt wrote:

>On December 28, 1999 at 11:34:49, Djordje Vidanovic wrote:
>
>>On December 28, 1999 at 10:59:03, Vincent Diepeveen wrote:
>>
>>>On December 28, 1999 at 10:20:42, Djordje Vidanovic wrote:
>>>
>>>>I got Deep Junior 6 several days ago and decided to find out more about it by
>>>>staging a tournament with my currently strongest engines under the Deep Junior
>>>>GUI.  I decided on a round robin with programs playing 4 games against each
>>>>other. Time controls were G/25 (game in 25 minutes, sudden death), which is the
>>>>level most commonly used in rapid chess and the one quite likely to be used by
>>>>computer chess fans when playing their programs.
>>>>
>>>>The roster included two SMP programs -- Deep Junior and Crafty 16.15, and two
>>>>other super strong programs, Fritz (test version 6.66) and Hiarcs 7.32.  The
>>>>venue was my dualboard PII/400 machine. Each program used 32MB hash, and the
>>>>Nimzo 7.32 opening book.  Pondering and learning were off.
>>>
>>>Are you telling here that you are running 2 programs at 2 cpu's,
>>>so junior at 2 processors and crafty at 2 processors sometimes have
>>>a big problem that another program is eating up cpu time, thereby
>>>locking the whole process?
>>>
>>>Or did you use the right method involving 2 computers:
>>>  - dual PII400
>>>  - single cpu computer
>>>
>>>How did you do the test?
>>>
>>>If you run a parallel program at 2 cpu's
>>>against another program at the same cpu's, then
>>>the dual version of that program is having major problems,
>>>as it cannot search on as a processor sometimes gets blocked by another
>>>process. Thereby reducing the nodes a second and plydepths a program
>>>running parallel gets.
>>>
>>
>>I tested the programs on a single computer, using a very simple method to ensure
>>that they can play in a more or less fair manner.  Pondering was off and I
>>checked the CPU utilization via the Task Manager in Win 2000.  There was no CPU
>>hogging, nor were the processes blocked.  The nodes were evenly distributed, and
>>reached the same heights as when I used only one SMP program. Of course, you
>>have a point that this is not the best way to test programs.  However, my other
>>computer is a single CPU comp and I could not test the SMP programs there.
>>
>>*** Djordje
>
>
>You had better check again.  Once Crafty starts an SMP search, it fires up a
>second process for the second CPU. _this_ process will _never_ wait or be
>idle.  It will be burning the CPU forever unless you can somehow make windows
>not give it any CPU time.
>
>From what I know, 1 cpu was toasted 100% of the time.  Of course, crafty vs a
>non-SMP program would be fine as the second cpu would be free for the other
>program.  But crafty vs deep junior would very definitely _not_ be ok as Crafty
>would eat 1/2 of one cpu while the other program is thinking.  Not a good idea.
>
>As always...  testing on one machine is simply bad.  In this case, unless you
>took specific actions to stop the spinning thread from Crafty, it is worse than
>bad...

Currently Diep is even worse than this, Bob:
Diep burns both CPUs to hell. Neither Windows NT nor Linux recognizes
the following loop as an idle loop:

  while( tree->idle[ThisProcess] ) {
  };

If I didn't use this loop but used the system functions instead, then
Diep would take at least about a second extra to start searching.
That means it would forfeit a lot of blitz games. Nearly all fast
blitz games, actually...
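To make the busy-wait concrete, here is a minimal sketch in C11. The names Tree, MAX_PROCESSES, spin_while_idle and spin_limit are made up for illustration and are not Diep's actual code. The flag is an atomic_bool because an optimizing compiler is allowed to hoist a plain non-atomic read out of an empty loop; the spin limit exists only so the sketch can be exercised from a single thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_PROCESSES 2   /* assumption: one search process per CPU */

/* Hypothetical stand-in for the shared search tree. */
typedef struct {
    atomic_bool idle[MAX_PROCESSES];
} Tree;

/*
 * Busy-wait while idle[proc] is set, polling at most spin_limit times
 * (spin_limit should be at least 1).
 * Returns the number of polls that found the flag still set.
 * The loop never sleeps or yields, so the OS sees a process that is
 * 100% busy even though it does no useful work.
 */
unsigned long spin_while_idle(Tree *tree, int proc, unsigned long spin_limit)
{
    unsigned long spins = 0;

    while (atomic_load_explicit(&tree->idle[proc], memory_order_acquire)) {
        if (++spins >= spin_limit)
            break;
    }
    return spins;
}
```

In a real engine another process would clear idle[proc] when work becomes available, and the waiting process would then start searching almost immediately, which is the whole point of spinning instead of sleeping.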

I don't doubt that the other program behaves the same way,
so that program will burn CPU time too while Crafty/Deep Junior
are searching.

That means all the locking goes wrong.

SMP is fun, unless you throw more realtime processes at the CPUs.

Suppose the OS blocks a process on CPU_1 for about 1/100 of a second:
it gives process SMPA1 CPU time, then process B (non-SMP) CPU time
for 1/100 of a second. CPU_2 is completely reserved for SMPA2.

Now the big problem of SMP programming for chess is that in order to
get a good speedup you regularly need to lock the root or a node.

Suppose that now SMPA1 and SMPA2 get CPU_1 and CPU_2 for 1/100 of a second,
both at the same time.

Now this is of course the 'ideal' case, which only lasts for 1/100 of a second.

The SMP program will not get blocked then. However, after 1/100 of a second
the OS decides that on CPU_1 process B gets some CPU time.

So SMPA1 gets blocked and B runs on that CPU.
However, after searching a few nodes, SMPA2 will find a split point, the root,
the hash or whatever locked. In most cases it will find a locked split point
somewhere. In a number of cases it will directly find, for example, the root
or something else locked, but it *definitely* will find a split point that is
locked. So it can't get any further. After how many nodes?
Well, that depends upon where the program had split. If it split near
the leaves, then you run into trouble within a few nodes. Now we must
keep in mind that the vast majority of split points in Crafty are quite
near the leaves. I don't doubt that the same thing happens in Junior.

So for nearly 1/100 of a second SMPA will not be able to search further.
Process SMPA2 must wait until SMPA1 unlocks, which normally
would have occurred a long time ago...
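The stall can be put into numbers with a toy model in C. Everything here is an assumption made for illustration: the function name smpa2_nodes, the strict alternation of SMPA1 and B on CPU_1, and the worst case that SMPA1 is always holding a lock at the moment it is descheduled. It is not a measurement of Crafty, Junior or Diep.

```c
#include <stdbool.h>

/*
 * Toy model of the scenario above.
 * CPU_1 alternates between SMPA1 (even quanta) and B (odd quanta).
 * CPU_2 always runs SMPA2.
 * While SMPA1 runs, SMPA2 searches a full quantum worth of nodes.
 * While B runs, SMPA2 searches only nodes_until_lock nodes before it
 * hits a lock held by the descheduled SMPA1, then spins uselessly.
 * Returns the number of nodes SMPA2 actually searches.
 */
unsigned long smpa2_nodes(unsigned quanta,
                          unsigned long nodes_per_quantum,
                          unsigned long nodes_until_lock)
{
    unsigned long total = 0;

    for (unsigned q = 0; q < quanta; q++) {
        bool smpa1_running = (q % 2 == 0);

        if (smpa1_running || nodes_until_lock >= nodes_per_quantum)
            total += nodes_per_quantum;   /* no stall in this quantum */
        else
            total += nodes_until_lock;    /* stalled for the rest */
    }
    return total;
}
```

With made-up figures of 2000 nodes per quantum and a lock hit after 20 nodes, smpa2_nodes(100, 2000, 20) gives 101000 nodes instead of the 200000 SMPA2 would search if it never stalled, so under these assumptions CPU_2 is wasted for almost half of the time.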

This was the problem simplified. Doing the calculation from the above, you
will say: "this is no problem, as effectively SMPA searches for 1/100 of a
second, then B searches for 1/100 of a second. This is exactly what I want!".

HOWEVER. This was simplified.

Now let's take a look at the OS, and at how the OS might divide the CPU time.
The OS simply sees 3 processes running:
    - A     (the non-SMP program)
    - SMPA  (first process of the SMP program)
    - SMPB  (second process of the SMP program)

Now despite all kinds of good stories about how well parallelism should work,
my personal opinion is that the OSes aren't that well designed for SMP.

The OS uses a simple FIFO way of dividing CPU time.
Let's try to calculate how the FIFO works with 3 processes and
2 processors.

Horizontally we have the processes that are blocked and running at the
same moment; vertically, the time units:

blocked  running  running
A        SMPA     SMPB
SMPA     A        SMPB
SMPB     A        SMPA
A        SMPB     SMPA

OK, now it's not difficult to see that in this model both SMPA and SMPB
are running only 1/3 of the time, which is the only state in which
there is no chance that things go wrong.
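The table can be reproduced with a few lines of C. This is only a sketch of the simple FIFO model above, not of how Windows NT or Linux really schedule; the names PROC_A, PROC_SMPA, PROC_SMPB and both_smp_running_units are invented for the example.

```c
enum { PROC_A, PROC_SMPA, PROC_SMPB, NUM_PROCS };

/*
 * Simple FIFO model: 3 processes, 2 CPUs, so exactly one process is
 * blocked in every time unit. fifo[0] is the blocked process; after
 * each time unit it goes to the back of the line and the next process
 * in line is blocked instead.
 * Returns in how many of the first time_units both SMPA and SMPB run,
 * i.e. how often A is the blocked one.
 */
int both_smp_running_units(int time_units)
{
    int fifo[NUM_PROCS] = { PROC_A, PROC_SMPA, PROC_SMPB };
    int count = 0;

    for (int t = 0; t < time_units; t++) {
        int blocked = fifo[0];

        if (blocked == PROC_A)
            count++;

        /* Rotate the line: the blocked process moves to the back. */
        for (int i = 0; i + 1 < NUM_PROCS; i++)
            fifo[i] = fifo[i + 1];
        fifo[NUM_PROCS - 1] = blocked;
    }
    return count;
}
```

The blocked process cycles A, SMPA, SMPB, A, ... exactly as in the table, so both_smp_running_units(300) returns 100, which is the 1/3 mentioned above.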

Vincent





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.