Author: Uri Blass
Date: 15:31:07 08/24/02
On August 24, 2002 at 17:59:43, Robert Hyatt wrote:

>On August 23, 2002 at 12:24:45, Vincent Diepeveen wrote:
>
>>On August 22, 2002 at 20:25:24, Robert Hyatt wrote:
>>
>>>On August 22, 2002 at 18:22:56, Uri Blass wrote:
>>>
>>>>On August 22, 2002 at 18:01:09, Robert Hyatt wrote:
>>>>
>>>>>On August 21, 2002 at 20:10:26, Mike S. wrote:
>>>>>
>>>>>>On August 21, 2002 at 11:07:58, Robert Hyatt wrote:
>>>>>>
>>>>>>>(...)
>>>>>>>1. They reported depth as 11(6) for example. According to the deep blue
>>>>>>>team, and regardless of what others will say about it, this supposedly means
>>>>>>>that they did 11 plies in software, plus another 6 in hardware.
>>>>>>
>>>>>>When I looked at some of the logs, I had the impression that "11(6)" was
>>>>>>reported most often, IOW. we can probably say that it was the *typical* search
>>>>>>depth reported (except additional extension depths we do not know), in the
>>>>>>middlegame, 1997. Would you agree with that from your study of the logs?
>>>>>>
>>>>>
>>>>>I thought so. But since the paper quotes 12.2, that would mandate that 12
>>>>>must come up more often that 11. I haven't gone thru each log in that kind
>>>>>of detail as that is a recipe for a headache. :)
>>>>>
>>>>>
>>>>>
>>>>>>Another thing I'm not sure of is: *When* could relatively safely be claimed,
>>>>>>that DB.'s depth is reached again:
>>>>>>
>>>>>>a) when a current prog reaches at least 16 plies as a typical middlegame depth,
>>>>>> because some search techniques used now (which DB. didn't use), make up for
>>>>>> the missing ply (at least), or
>>>>>>b) when 17 plies are reached, not earlier, or
>>>>>>c) a program would have to reach more than 17 plies, because DB used much more
>>>>>> knowledge which current software probably does not yet use to that extent.
>>>>>>
>>>>>>I search for expert's opinions of *when* we can say something like "Yes, now
>>>>>>with this specific performance [## plies etc.] we can safely say - as it's our
>>>>>>*best guess*, since no direct head-to-head match is possible - that this new
>>>>>>chess computer is better than Deep Blue was."
>>>>>
>>>>>I don't see any real way to do this. IE take the following types of
>>>>>programs and try to compare depths:
>>>>>
>>>>>1. Junior, which uses a different definition of ply than everyone else.
>>>>>They appear to search _much_ deeper than anyone else, based only on this,
>>>>>but Amir has explained how he counts plies, and the bottom line is that
>>>>>raw ply depth can't be compared.
>>>>>
>>>>>2. Very dumb and fast program, with no q-search to speak of. Since the
>>>>>q-search is at _least_ 50% of the total tree search space, lopping that off
>>>>>gets more depth. But how to compare 14 plies with no q-search to 12 plies
>>>>>with q-search?
>>>>>
>>>>>3. lots of selective search extensions. This program might only search
>>>>>9 plies deep on average, but it extends the _right_ moves at the right times,
>>>>>so that even though it is only searching 9 plies deep, it beats the "22-ply
>>>>>searching Junior program" handily.
>>>>>
>>>>>4. Lots of other variations. The bottom line is that depth is not an easy
>>>>>way to compare programs. Neither is NPS. Unless you see some _real_ depth
>>>>>that is way beyond everyone. Or some real NPS that is way beyond everyone.
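[Editor's aside, not part of the original exchange: point 2 above can be given a rough number. If the q-search is at least half of the tree, dropping it saves about a factor of 2 in nodes; with an effective branching factor b the tree grows roughly like b^depth, so that saving is worth only about log(2)/log(b) extra plies of nominal depth. A minimal C sketch, using purely illustrative branching factors that are not measurements of any program mentioned here:

#include <math.h>
#include <stdio.h>

int main(void) {
    /* Assumed effective branching factors; illustrative only. */
    double branching[] = {2.0, 3.0, 4.0, 6.0};
    for (int i = 0; i < 4; i++) {
        double b = branching[i];
        /* Halving the node count buys log(2)/log(b) plies at branching factor b. */
        double extra_plies = log(2.0) / log(b);
        printf("b = %.1f : dropping a 50%% q-search buys about %.2f extra plies\n",
               b, extra_plies);
    }
    return 0;
}

For b between 2 and 4 this comes out to roughly 0.5-1.0 ply, which is one way to see why "14 plies with no q-search" and "12 plies with q-search" are not directly comparable figures.]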
>>>>>
>>>>>For example, we have had a couple of very fast/dumb programs compete over
>>>>>the years, and they have managed to do very well, because their speed and
>>>>>tactics overcame their lack of positional understanding, when playing the
>>>>>opponents they drew in the ACM/WCCC events. We have also seen very slow
>>>>>programs out-play everyone. But we are talking about programs that are
>>>>>generally within an order of magnitude of each other. Say 20K nodes per
>>>>>second to 200K nodes per second. If someone suddenly hits the scene going
>>>>>200M nodes per second, then that is a serious number if it is real... So
>>>>>even though I generally say that comparing NPS is a bad idea unless you are
>>>>>using the _same_ program, there are logical exceptions...
>>>>>
>>>>>>
>>>>>>But the claim should be illustrated by somewhat convincing figures (node rate is
>>>>>>not convincing enough IMO, although still impressive). Maybe the ply depth is; I
>>>>>>know it's also no perfect comparison though. But we don't have anything better
>>>>>>probably. A few positons/moves to compare are not enough.
>>>>>
>>>>>I think you have to look at results above all else. IE for IBM, deep thought
>>>>>totally dominated computer chess for 10 years, losing one well-known game. That
>>>>>is tough to do if you are not far better than everyone else. Since their last
>>>>>computer event in 1995, suddenly they started going 100X faster. So they have
>>>>>a significant boost there, unless you do as some do and conclude that the
>>>>>extra speed means nothing.
>>>>
>>>>I conclude that it was not 100 times fasters.
>>>>
>>>>1)200M nodes is wrong based on the paper of Hsu.
>>>>2)They suffered from lack of efficiency because they prefered
>>>>to improve the evaluation and not to fix
>>>>the efficiency problems.
>>>>
>>>>I will not be surprised if their nodes were eqvivalent only
>>>>to 20M on a single PC that is also very good achievement.
>>>>
>>>>I also believe that they were better than the programs
>>>>of 1997 even if you use the hardware of today.
>>>>
>>>>Uri
>>>
>>>
>>>I don't believe they were only equivalent to 20M nodes. Simply because I
>>>know how strong deep thought was from first-hand experience. But I don't
>>>have access to the machine to do the same kind of testing I can do with
>>>Crafty. I _know_ how much faster I run on my quad than I do on a single
>>>cpu. And _anybody_ can measure that if they have a quad handy since the
>>>source for crafty is available.
>>>
>>>Unfortunately, we don't have that luxury with DB2. But I find it very
>>>difficult to believe that it was only a 20M machine effectively...
>>>particularly considering that Hsu said more than once that he was driving
>>>the chess processors at 70% duty cycle...
>>
>>If you look in the paper their reported speedups were extrapolated.
>>So they measured what 1 cpu did and compared with a few processors,
>>then used that number for 480 processors instead of measuring 480.
>
>Vincent, this is something to do with _that_ paper. IE it should be
>pretty obvious why they had to extrapolate at all. All they have is
>DB Jr to work with.
>
>Hsu did _lots_ of testing on the real DB machines when he had time. And
>he did _real_ speedup testing just like we do. Don't confuse what was
>in _that_ paper and assume that is _all_ they did. It wasn't...
>
>I've seen some speedup stuff for DB1 in fact. I saw a couple of test
>positions where DB1 ran about 25 times faster with 200+ processors than
>it did with just one.
>I saw a couple of others where it was more like
>50... That isn't great, but it is _not_ "bad". He gave me a number
>of 30% way back, which I have quoted before. IE with 200 processors
>he said that 30% of that was a good estimate... That was a number he
>also mentioned in his dissertation...

I do not know what he told you, but I read an estimate of 8-12% efficiency for Deeper Blue.

>
>Most of us would _not_ be happy with 30%. IE I am not really happy
>with my current 70%+ numbers, since Cray Blitz could do significantly
>better with four processors. However, 30% is not a bad result when you
>go to large numbers of processors... and perhaps I might be happy with
>30% once I get to the 480 processor level, although I have not seen
>anything that said DB2 stayed at 30% since it had 2x more processors.

If we assume 30% is correct for DB1, then common sense says that it is possible that he got less than 30% with more processors. The efficiency percentage goes down when you have more processors, and the Deep Blue team considered the evaluation more important, so they did not care about increasing that percentage.

Note that I do not assume the 30% is correct for DB1, because if I understand correctly it seems to contradict the 8-12% efficiency for Deeper Blue: doubling the number of processors should not reduce the efficiency by a factor of more than 2.

Uri
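[Editor's aside: the arithmetic behind the last paragraph can be made explicit. Parallel efficiency is speedup divided by processor count, so if adding processors never makes the search slower, going from N to k*N processors can cut the efficiency by at most a factor of k. Using the figures quoted in this thread (roughly 200 processors and 30% for DB1, 480 processors and a reported 8-12% estimate for DB2), a minimal C sketch of the consistency check:

#include <stdio.h>

int main(void) {
    /* Figures quoted in the thread: ~200 processors and 30% efficiency for DB1,
       480 processors for DB2 with a reported 8-12% efficiency estimate.        */
    double db1_procs = 200.0, db2_procs = 480.0;
    double db1_eff = 0.30;

    /* Efficiency = speedup / processors, so DB1's speedup is about 60x. */
    double db1_speedup = db1_eff * db1_procs;

    /* Worst case for DB2: the extra processors add nothing and the speedup
       stays at DB1's level.  Efficiency can then fall no lower than this.  */
    double db2_eff_floor = db1_speedup / db2_procs;

    printf("DB1: %.0f processors at %.0f%% efficiency -> speedup about %.0fx\n",
           db1_procs, 100.0 * db1_eff, db1_speedup);
    printf("DB2 efficiency floor (if speedup does not drop): %.1f%%\n",
           100.0 * db2_eff_floor);
    printf("8-12%% on %.0f processors would mean only a %.0f-%.0fx speedup\n",
           db2_procs, 0.08 * db2_procs, 0.12 * db2_procs);
    return 0;
}

The floor works out to about 12.5%, so an 8-12% figure for DB2 would imply a smaller absolute speedup than DB1's roughly 60x; that is the apparent contradiction pointed to above, and it only holds under the assumption that the speedup does not drop when processors are added.]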