Computer Chess Club Archives



Subject: Re: DEEP BLUES AVERAGE PLY?

Author: Vincent Diepeveen

Date: 07:27:28 08/25/02



On August 24, 2002 at 18:31:07, Uri Blass wrote:

>On August 24, 2002 at 17:59:43, Robert Hyatt wrote:
>
>>On August 23, 2002 at 12:24:45, Vincent Diepeveen wrote:
>>
>>>On August 22, 2002 at 20:25:24, Robert Hyatt wrote:
>>>
>>>>On August 22, 2002 at 18:22:56, Uri Blass wrote:
>>>>
>>>>>On August 22, 2002 at 18:01:09, Robert Hyatt wrote:
>>>>>
>>>>>>On August 21, 2002 at 20:10:26, Mike S. wrote:
>>>>>>
>>>>>>>On August 21, 2002 at 11:07:58, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>(...)
>>>>>>>>1.  They reported depth as 11(6) for example.  According to the deep blue
>>>>>>>>team, and regardless of what others will say about it, this supposedly means
>>>>>>>>that they did 11 plies in software, plus another 6 in hardware.
>>>>>>>
>>>>>>>When I looked at some of the logs, I had the impression that "11(6)" was
>>>>>>>reported most often; IOW, we can probably say that it was the *typical* search
>>>>>>>depth reported (except additional extension depths we do not know), in the
>>>>>>>middlegame, 1997. Would you agree with that from your study of the logs?
>>>>>>>
>>>>>>
>>>>>>I thought so.  But since the paper quotes 12.2, that would mandate that 12
>>>>>>must come up more often than 11.  I haven't gone through each log in that kind
>>>>>>of detail as that is a recipe for a headache.  :)
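
A tiny back-of-the-envelope check on that point, with made-up move counts
rather than anything taken from the actual logs: for the average to come out
at 12.2, iterations of 12 or deeper have to show up quite a bit more often
than 11.

#include <stdio.h>

int main(void) {
    /* hypothetical distribution of reported software depths; these counts
       are invented for illustration, they are NOT from the DB logs */
    int depth[] = { 11, 12, 13 };
    int count[] = {  6, 14, 12 };
    int moves = 0, sum = 0;

    for (int i = 0; i < 3; i++) {
        moves += count[i];
        sum   += depth[i] * count[i];
    }
    printf("average depth = %.2f over %d moves\n", (double)sum / moves, moves);
    /* prints ~12.19: you only get near 12.2 when 12+ dominates 11, which is
       why "11(6) reported most often" and "12.2 average" sit uneasily together */
    return 0;
}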
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Another thing I'm not sure of is *when* it could relatively safely be claimed
>>>>>>>that DB's depth has been reached again:
>>>>>>>
>>>>>>>a) when a current prog reaches at least 16 plies as a typical middlegame depth,
>>>>>>>   because some search techniques used now (which DB didn't use) make up for
>>>>>>>   the missing ply (at least), or
>>>>>>>b) when 17 plies are reached, not earlier, or
>>>>>>>c) a program would have to reach more than 17 plies, because DB used much more
>>>>>>>   knowledge which current software probably does not yet use to that extent.
>>>>>>>
>>>>>>>I am looking for experts' opinions on *when* we can say something like "Yes, now
>>>>>>>with this specific performance [## plies etc.] we can safely say - as it's our
>>>>>>>*best guess*, since no direct head-to-head match is possible - that this new
>>>>>>>chess computer is better than Deep Blue was."
>>>>>>
>>>>>>I don't see any real way to do this.  IE take the following types of
>>>>>>programs and try to compare depths:
>>>>>>
>>>>>>1.  Junior, which uses a different definition of ply than everyone else.
>>>>>>They appear to search _much_ deeper than anyone else, based only on this,
>>>>>>but Amir has explained how he counts plies, and the bottom line is that
>>>>>>raw ply depth can't be compared.
>>>>>>
>>>>>>2.  Very dumb and fast program, with no q-search to speak of.  Since the
>>>>>>q-search is at _least_ 50% of the total tree search space, lopping that off
>>>>>>gets more depth.  But how to compare 14 plies with no q-search to 12 plies
>>>>>>with q-search?
>>>>>>
>>>>>>3.  Lots of selective search extensions.  This program might only search
>>>>>>9 plies deep on average, but it extends the _right_ moves at the right times,
>>>>>>so that even though it is only searching 9 plies deep, it beats the "22-ply
>>>>>>searching Junior program" handily.
>>>>>>
>>>>>>4.  Lots of other variations.  The bottom line is that depth is not an easy
>>>>>>way to compare programs.  Neither is NPS.  Unless you see some _real_ depth
>>>>>>that is way beyond everyone.  Or some real NPS that is way beyond everyone.
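
Two quick sketches on points 2 and 3 above. First, purely in node-budget
terms (ignoring speed differences and the tactical value of the q-search
itself), lopping off a q-search that eats at least half the tree buys well
under one full-width ply, so "14 plies without q-search" overstates the gap
against "12 plies with q-search". The effective branching factor below is
assumed, not measured from any real program.

#include <stdio.h>
#include <math.h>

int main(void) {
    double ebf = 3.0;   /* assumed effective branching factor, not measured */

    /* If the q-search takes at least 50% of the nodes, removing it frees at
       most a factor of 2 in node budget for the full-width part.  With
       nodes ~ ebf^depth, that factor of 2 is worth only: */
    double bonus = log(2.0) / log(ebf);
    printf("dropping the q-search buys ~%.2f extra plies at EBF %.1f\n",
           bonus, ebf);
    /* so the extra raw depth of a dumb/fast program has to come mostly from
       speed and a thin evaluation, not from skipping the q-search */
    return 0;
}

Second, a minimal skeleton of what "extending the right moves" looks like in
a plain negamax frame. The Position type and the helper routines are
placeholders, not code from Deep Blue or any other real engine.

/* skeleton only: the helpers below are assumed to exist elsewhere */
typedef struct Position Position;

extern int  evaluate(const Position *p);
extern int  generate_moves(const Position *p, int *moves);
extern void make_move(Position *p, int m);
extern void unmake_move(Position *p, int m);
extern int  gives_check(const Position *p, int m);

int search(Position *pos, int depth, int alpha, int beta) {
    if (depth <= 0)
        return evaluate(pos);   /* a real engine drops into q-search here */

    int moves[256];
    int n = generate_moves(pos, moves);

    for (int i = 0; i < n; i++) {
        /* the nominal depth stays small, but forcing moves are searched one
           ply deeper -- that is where "9 plies that play like more" comes from */
        int ext = gives_check(pos, moves[i]) ? 1 : 0;

        make_move(pos, moves[i]);
        int score = -search(pos, depth - 1 + ext, -beta, -alpha);
        unmake_move(pos, moves[i]);

        if (score >= beta)
            return beta;        /* cutoff */
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}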
>>>>>>
>>>>>>For example, we have had a couple of very fast/dumb programs compete over
>>>>>>the years, and they have managed to do very well, because their speed and
>>>>>>tactics overcame their lack of positional understanding, when playing the
>>>>>>opponents they drew in the ACM/WCCC events.  We have also seen very slow
>>>>>>programs out-play everyone.  But we are talking about programs that are
>>>>>>generally within an order of magnitude of each other.  Say 20K nodes per
>>>>>>second to 200K nodes per second.  If someone suddenly hits the scene going
>>>>>>200M nodes per second, then that is a serious number if it is real...  So
>>>>>>even though I generally say that comparing NPS is a bad idea unless you are
>>>>>>using the _same_ program, there are logical exceptions...
>>>>>>
>>>>>>>
>>>>>>>But the claim should be illustrated by somewhat convincing figures (node rate is
>>>>>>>not convincing enough IMO, although still impressive). Maybe the ply depth is; I
>>>>>>>know it's also not a perfect comparison though. But we probably don't have
>>>>>>>anything better. A few positions/moves to compare are not enough.
>>>>>>
>>>>>>I think you have to look at results above all else.  IE for IBM, deep thought
>>>>>>totally dominated computer chess for 10 years, losing one well-known game.  That
>>>>>>is tough to do if you are not far better than everyone else.  Since their last
>>>>>>computer event in 1995, they suddenly started going 100X faster.  So they have
>>>>>>a significant boost there, unless you do as some do and conclude that the
>>>>>>extra speed means nothing.
>>>>>
>>>>>I conclude that it was not 100 times faster.
>>>>>
>>>>>1) 200M nodes is wrong based on the paper of Hsu.
>>>>>2) They suffered from a lack of efficiency because they preferred
>>>>>to improve the evaluation rather than fix
>>>>>the efficiency problems.
>>>>>
>>>>>I will not be surprised if their nodes were equivalent to only
>>>>>20M on a single PC; that is also a very good achievement.
>>>>>
>>>>>I also believe that they were better than the programs
>>>>>of 1997 even if you use the hardware of today.
>>>>>
>>>>>Uri
>>>>
>>>>
>>>>I don't believe they were only equivalent to 20M nodes.  Simply because I
>>>>know how strong deep thought was from first-hand experience.  But I don't
>>>>have access to the machine to do the same kind of testing I can do with
>>>>Crafty.  I _know_ how much faster I run on my quad than I do on a single
>>>>cpu.  And _anybody_ can measure that if they have a quad handy since the
>>>>source for crafty is available.
>>>>
>>>>Unfortunately, we don't have that luxury with DB2.  But I find it very
>>>>difficult to believe that it was only a 20M machine effectively...
>>>>particularly considering that Hsu said more than once that he was driving
>>>>the chess processors at 70% duty cycle...
>>>
>>>If you look in the paper, their reported speedups were extrapolated.
>>>So they measured what 1 cpu did, compared that with a few processors,
>>>and then used that number for 480 processors instead of measuring all 480.
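
That extrapolation is exactly the weak spot. A small sketch with invented
small-machine measurements shows how sensitive it is: fit speedup = P^alpha
through two points and push it out to 480 processors, and a small difference
at the 4-cpu end becomes a big difference at the 480-cpu end. Both measured
values below are assumptions, not anyone's real data.

#include <stdio.h>
#include <math.h>

/* fit speedup(P) = P^alpha through (1,1) and one measured point,
   then extrapolate; the measured speedups here are invented */
static double extrapolate(double p_small, double s_small, double p_big) {
    double alpha = log(s_small) / log(p_small);
    return pow(p_big, alpha);
}

int main(void) {
    double cases[][2] = { { 4.0, 3.2 }, { 4.0, 3.0 } };  /* assumed measurements */

    for (int i = 0; i < 2; i++) {
        double s = extrapolate(cases[i][0], cases[i][1], 480.0);
        printf("measured %.1fx on %.0f cpus -> extrapolated %.0fx on 480 (%.0f%% efficiency)\n",
               cases[i][1], cases[i][0], s, 100.0 * s / 480.0);
    }
    /* a 0.2x difference in the 4-cpu measurement moves the 480-cpu estimate
       by dozens of "processors worth" of speed */
    return 0;
}

With real measurements at several machine sizes the fit would be tighter, but
the point stands: an extrapolated 480-processor number is a model output, not
a measurement.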
>>
>>Vincent, this has something to do with _that_ paper.  IE it should be
>>pretty obvious why they had to extrapolate at all.  All they have is
>>DB Jr to work with.
>>
>>Hsu did _lots_ of testing on the real DB machines when he had time.  And
>>he did _real_ speedup testing just like we do.  Don't confuse what was
>>in _that_ paper and assume that is _all_ they did.  It wasn't...
>>
>>I've seen some speedup stuff for DB1 in fact.  I saw a couple of test
>>positions where DB1 ran about 25 times faster with 200+ processors than
>>it did with just one.  I saw a couple of others where it was more like
>>50...  That isn't great, but it is _not_ "bad".  He gave me a number
>>of 30% way back, which I have quoted before.  IE with 200 processors
>>he said that 30% of that was a good estimate...  That was a number he
>>also mentioned in his dissertation...
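
For reference, parallel efficiency is just speedup divided by processor
count, so those figures translate roughly as below, using the 200-processor
count quoted above.

#include <stdio.h>

int main(void) {
    /* numbers taken from the discussion above: 25x and 50x speedups seen
       on 200+ processors, and the 30% figure Hsu quoted */
    double procs = 200.0;
    double speedups[] = { 25.0, 50.0 };

    for (int i = 0; i < 2; i++)
        printf("%.0fx on %.0f cpus  -> %4.1f%% efficiency\n",
               speedups[i], procs, 100.0 * speedups[i] / procs);

    printf("30%% efficiency on %.0f cpus -> %.0fx effective speedup\n",
           procs, 0.30 * procs);
    return 0;
}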
>
>I do not know what he told you, but I read an estimate of 8-12% efficiency for
>Deeper Blue.
>
>>
>>Most of us would _not_ be happy with 30%.  IE I am not really happy
>>with my current 70%+ numbers, since Cray Blitz could do significantly
>>better with four processors.  However, 30% is not a bad result when you
>>go to large numbers of processors... and perhaps I might be happy with
>>30% once I get to the 480 processor level, although I have not seen
>>anything that said DB2 stayed at 30% since it had 2x more processors.
>
>If we assume 30% is correct for DB1, then common sense says it is possible
>that he got less than 30% with more processors.
>
>The efficiency percentage goes down when you have more processors, and the Deep Blue
>team considered the evaluation more important, so they did not care about
>increasing that percentage.

You can't put that much knowledge into 66,000 gates at 0.60 micron in 1997.

Also, the 66,000-gate figure looks like a logical number when we see what
gets written in the paper about what is inside it: they clearly had
mobility and such, then some king safety patterns and some endgame
tuning, and you have 66,000 gates. Of course, 64 bonuses/penalties for each
pattern.

>Note that I do not assume that the 30% is correct for DB1, because if I
>understand correctly it seems to contradict the 8-12% efficiency for Deeper Blue:
>doubling the number of processors should not reduce the efficiency by a
>factor of more than 2.
>Uri
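
Uri's bound rests on the assumption that adding processors can never make the
absolute speedup worse, since in the worst case the extra ones could simply
sit idle; under that assumption eff(P2) >= eff(P1) * P1 / P2. With the
processor counts mentioned in this thread, the arithmetic looks like this:

#include <stdio.h>

int main(void) {
    /* figures from the discussion: ~30% efficiency claimed for DB1 on
       roughly 200 processors (the "200+" figure above), 8-12% reported
       for Deeper Blue on ~480 */
    double eff_db1 = 0.30, p_db1 = 200.0, p_db2 = 480.0;

    /* if speedup never decreases when processors are added:
       eff(P2) >= eff(P1) * P1 / P2 */
    double lower_bound = eff_db1 * p_db1 / p_db2;
    printf("lower bound on DB2 efficiency under that assumption: %.1f%%\n",
           100.0 * lower_bound);
    printf("reported range: 8-12%% -- at or below that floor, which is the tension\n");
    return 0;
}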

That's not a correct assumption, Uri. First of all, we are talking about 30
software processors which are supposed to keep 480 hardware processors busy.

That is NOT easy.

It means constantly checking whether a chess processor has timed out, and keeping it busy.

The focus was of course getting more nodes a second. That's what the boss,
IBM, wanted, and that's what Hsu delivered. Even today no one has reached
126 million nodes a second.
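
Just to put numbers on that fan-out, using only figures already in this
thread; nothing here comes from IBM's code or documentation.

#include <stdio.h>

int main(void) {
    /* figures from the thread: 30 software processors driving
       480 hardware chess processors, at 126 million nodes a second */
    int    sw_procs = 30, hw_procs = 480;
    double total_nps = 126e6;

    printf("chess processors per software node: %d\n", hw_procs / sw_procs);
    printf("search traffic each software node must keep fed: ~%.1fM nodes/sec\n",
           total_nps / sw_procs / 1e6);
    return 0;
}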





