Computer Chess Club Archives


Subject: Re: Questions on dual machines

Author: Slater Wold

Date: 08:05:00 11/21/01


On November 21, 2001 at 09:26:11, Robert Hyatt wrote:

>On November 21, 2001 at 01:52:04, Slater Wold wrote:
>
>>On November 21, 2001 at 00:30:15, Robert Hyatt wrote:
>>
>>>On November 20, 2001 at 21:58:57, Slater Wold wrote:
>>>
>>>>On November 20, 2001 at 21:50:32, Uri Blass wrote:
>>>>
>>>>>On November 20, 2001 at 15:37:39, Slater Wold wrote:
>>>>>
>>>>>>On November 20, 2001 at 11:25:50, Gordon Rattray wrote:
>>>>>>
>>>>>>>
>>>>>>>[snip]
>>>>>>>
>>>>>>>Thanks for the helpful info!
>>>>>>>
>>>>>>>>This is the speedup I see:
>>>>>>>>
>>>>>>>>Crafty 1.89x
>>>>>>>>Junior 7 1.81x
>>>>>>>>Deep Fritz 1.31x
>>>>>>>>Deep Shredder 1.81x
>>>>>>>
>>>>>>>This is a surprising and disappointing efficiency for Deep Fritz.  So, when
>>>>>>>playing on ICC, do you consider Deep Junior 7 to be your strongest option?  I'm
>>>>>>>assuming that if you have, e.g. Gambit Tiger, then Junior's SMP capability will
>>>>>>>give it a significant edge when using your dual, since GT is non-SMP.
>>>>>>>
>>>>>>>Gordon
>>>>>>
>>>>>>I thought so too.  Deep Fritz SMP code is broken somewhere.  That's why I
>>>>>>laughed when I heard it was going to be on an 8-way box.  It would have run like
>>>>>>crap.  Unless Frans fixed it.
>>>>>
>>>>>The question for the match against Kramnik is the speedup that they get at long
>>>>>time controls, not in blitz.
>>>>>I do not know how people got the speedup numbers for Crafty, Fritz, Junior and
>>>>>Shredder.
>>>>
>>>>A 900MHz 8-way box is not going to be impressive with DF.  Not the NPS anyway.
>>>>
>>>>And those are all MY numbers.  Run on my 2x1.4GHz.
>>>>
>>>>>I think that the way to compare is comparing times and not nodes.
>>>>
>>>>I know it's not.  You can *NOT* compare solutions with SMP machines.  The
>>>>branching is SO random, and so unpredictable, that I have found solutions in 10
>>>>seconds and not been able to find the same solution in 10 hours.  It's the
>>>>beauty of SMP.
>>>>
>>>>>You need to take a test suite of positions where the program changes its mind
>>>>>after some minutes, and compare times.
>>>>
>>>>No.  Won't prove anything.  Say it takes 10 minutes to find on 1 CPU, it might
>>>>take 30 seconds to find on 2 CPUs.
>>>>
>>>>>If the numbers are not based on a similar test, then my opinion is that they mean
>>>>>nothing.
>>>>
>>>>They were based on something.  Program A does 1M nps with 1 CPU.  Program A does
>>>>1.81M nps with 2 CPUs.  That means Program A's speedup is 1.81.
>>>
>>>No..No..NOOOOO!
>>>
>>>NPS has _nothing_ to do with speedup.  Here are three example runs...
>>>
>>>
>>>1cpu:  time: 53  nps:  328K
>>>2cpu:  time: 28  nps:  626K
>>>4cpu:  time: 16  nps: 1162K
>>>
>>>if you use time to compute the 2/4 cpu speedup, you get
>>>
>>>1.89 for 2 processors
>>>3.31 for 4 processors
>>>
>>>if you use nps, you get
>>>
>>>1.90 for 2 processors
>>>3.54 for 4 processors
>>
>>Well, for all intents and purposes, that's not really a *major* difference.
>
>The 4 processor test is significant.  But even worse, I only ran one
>position one time for the above.  There are positions that produce almost
>4x the NPS but only a 2.5X speedup in terms of time.  This is called "search
>overhead" (searching nodes in the parallel search that would not be searched
>in the sequential search.)
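
[Editor's sketch, not part of the original post.] The "search overhead" described above can be estimated from the two ratios. Since time = nodes / NPS, the time speedup equals the NPS ratio divided by the node ratio, so the node ratio is the NPS ratio divided by the time speedup. The function name below is hypothetical, and the 4x / 2.5x figures are the approximate ones quoted in the paragraph, not new measurements.

```python
def search_overhead(nps_ratio, time_speedup):
    """Estimate how many times more nodes the parallel search visited.

    Derivation: time = nodes / nps, so
        time_speedup = nps_ratio / node_ratio
    and therefore
        node_ratio = nps_ratio / time_speedup.
    A result above 1.0 means extra nodes were searched in parallel.
    """
    if nps_ratio <= 0 or time_speedup <= 0:
        raise ValueError("both ratios must be positive")
    return nps_ratio / time_speedup


# Figures from the quoted paragraph: about 4x the NPS, about 2.5x faster by the clock.
overhead = search_overhead(4.0, 2.5)
extra_percent = (overhead - 1.0) * 100.0
print(f"node ratio: {overhead:.2f}x, extra nodes: {extra_percent:.0f}%")
```

Under these assumptions the parallel search would have visited roughly 1.6 times the nodes of the sequential one, which is why the NPS ratio overstates the real gain.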

But I have also found positions where the NPS is only 2.5x higher, yet it finds
the solution 4x faster than a single CPU.

Dann and I had this "super" linear discussion before.

Seems like it would even out, eventually.  But like I said, I believe you.  And
I'll do it to solution now.  (But of course, I'll still look at the NPS!)  :)
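
[Editor's sketch, not part of the original post.] The two ways of computing speedup discussed in this thread can be compared side by side using the three sample runs quoted below (1, 2 and 4 CPUs). The dictionary layout and function name are assumptions made for illustration. The post does not state the time units, and the NPS figures are rounded to thousands, so the NPS-based ratios are only approximate.

```python
# Sample runs quoted in the thread: cpus -> (time to solution, NPS in thousands).
# Time units are not stated in the post; only the ratios matter here.
RUNS = {
    1: (53, 328),
    2: (28, 626),
    4: (16, 1162),
}


def speedups(runs):
    """Return {cpus: (speedup_by_time, speedup_by_nps)} relative to the 1-CPU run."""
    base_time, base_nps = runs[1]
    result = {}
    for cpus, (time_taken, nps) in runs.items():
        if cpus == 1:
            continue
        # Time-based: how much sooner the same result is reached.
        by_time = base_time / time_taken
        # NPS-based: how much faster nodes are visited, useful or not.
        by_nps = nps / base_nps
        result[cpus] = (by_time, by_nps)
    return result


for cpus, (by_time, by_nps) in sorted(speedups(RUNS).items()):
    print(f"{cpus} CPUs: by time {by_time:.2f}x, by NPS {by_nps:.2f}x")
```

The gap between the two columns widens from 2 to 4 CPUs in this sample, which is the point being made about measuring to solution rather than by NPS.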

>
>
>
>>
>>But I believe you.  I will do it this way from now on.  :)
>>
>>>NPS is the wrong thing to use because an SMP program will _always_ search
>>>more nodes than a pure sequential program for a given position, except for
>>>rare anomalies.  E.g., on a machine with no memory bandwidth bottleneck, Crafty's
>>>NPS is roughly 4X faster with 4 cpus, but it averages just over 3x faster if
>>>you look at the clock to see how long it takes to find a key move at a specific
>>>depth, or how long it takes to complete a search to a specific depth.
>>>
>>>Ignore NPS and _only_ use the time to solution or time to depth to compare
>>>speeds.  Anything else will produce wildly wrong numbers.
>>>
>>>
>>>>
>>>>>Testing it takes time and you need at least some hours of testing before getting
>>>>>an estimate for the speedup (not in blitz).
>>>>
>>>>These were SEVERAL SEVERAL tests I ran.  Positions were usually looked at for no
>>>>less than 1 hour.
>>>>
>>>>>Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
