Author: Robert Hyatt
Date: 17:25:24 08/22/02
On August 22, 2002 at 18:22:56, Uri Blass wrote:

>On August 22, 2002 at 18:01:09, Robert Hyatt wrote:
>
>>On August 21, 2002 at 20:10:26, Mike S. wrote:
>>
>>>On August 21, 2002 at 11:07:58, Robert Hyatt wrote:
>>>
>>>>(...)
>>>>1. They reported depth as 11(6) for example. According to the deep blue
>>>>team, and regardless of what others will say about it, this supposedly means
>>>>that they did 11 plies in software, plus another 6 in hardware.
>>>
>>>When I looked at some of the logs, I had the impression that "11(6)" was
>>>reported most often, IOW, we can probably say that it was the *typical* search
>>>depth reported (except additional extension depths we do not know), in the
>>>middlegame, 1997. Would you agree with that from your study of the logs?
>>>
>>
>>I thought so. But since the paper quotes 12.2, that would mandate that 12
>>must come up more often than 11. I haven't gone thru each log in that kind
>>of detail as that is a recipe for a headache. :)
>>
>>
>>
>>>Another thing I'm not sure of is: *When* could it relatively safely be claimed
>>>that DB.'s depth is reached again:
>>>
>>>a) when a current prog reaches at least 16 plies as a typical middlegame depth,
>>>   because some search techniques used now (which DB. didn't use), make up for
>>>   the missing ply (at least), or
>>>b) when 17 plies are reached, not earlier, or
>>>c) a program would have to reach more than 17 plies, because DB used much more
>>>   knowledge which current software probably does not yet use to that extent.
>>>
>>>I search for experts' opinions of *when* we can say something like "Yes, now
>>>with this specific performance [## plies etc.] we can safely say - as it's our
>>>*best guess*, since no direct head-to-head match is possible - that this new
>>>chess computer is better than Deep Blue was."
>>
>>I don't see any real way to do this. IE take the following types of
>>programs and try to compare depths:
>>
>>1. Junior, which uses a different definition of ply than everyone else.
>>They appear to search _much_ deeper than anyone else, based only on this,
>>but Amir has explained how he counts plies, and the bottom line is that
>>raw ply depth can't be compared.
>>
>>2. Very dumb and fast program, with no q-search to speak of. Since the
>>q-search is at _least_ 50% of the total tree search space, lopping that off
>>gets more depth. But how to compare 14 plies with no q-search to 12 plies
>>with q-search?
>>
>>3. Lots of selective search extensions. This program might only search
>>9 plies deep on average, but it extends the _right_ moves at the right times,
>>so that even though it is only searching 9 plies deep, it beats the "22-ply
>>searching Junior program" handily.
>>
>>4. Lots of other variations. The bottom line is that depth is not an easy
>>way to compare programs. Neither is NPS. Unless you see some _real_ depth
>>that is way beyond everyone. Or some real NPS that is way beyond everyone.
>>
>>For example, we have had a couple of very fast/dumb programs compete over
>>the years, and they have managed to do very well, because their speed and
>>tactics overcame their lack of positional understanding, when playing the
>>opponents they drew in the ACM/WCCC events. We have also seen very slow
>>programs out-play everyone. But we are talking about programs that are
>>generally within an order of magnitude of each other. Say 20K nodes per
>>second to 200K nodes per second. If someone suddenly hits the scene going
>>200M nodes per second, then that is a serious number if it is real... So
>>even though I generally say that comparing NPS is a bad idea unless you are
>>using the _same_ program, there are logical exceptions...
>>
>>>
>>>But the claim should be illustrated by somewhat convincing figures (node rate is
>>>not convincing enough IMO, although still impressive). Maybe the ply depth is; I
>>>know it's also no perfect comparison though. But we don't have anything better
>>>probably. A few positions/moves to compare are not enough.
>>
>>I think you have to look at results above all else. IE for IBM, deep thought
>>totally dominated computer chess for 10 years, losing one well-known game. That
>>is tough to do if you are not far better than everyone else. Since their last
>>computer event in 1995, suddenly they started going 100X faster. So they have
>>a significant boost there, unless you do as some do and conclude that the
>>extra speed means nothing.
>
>I conclude that it was not 100 times faster.
>
>1) 200M nodes is wrong based on the paper of Hsu.
>2) They suffered from lack of efficiency because they preferred
>to improve the evaluation and not to fix
>the efficiency problems.
>
>I will not be surprised if their nodes were equivalent only
>to 20M on a single PC, which is also a very good achievement.
>
>I also believe that they were better than the programs
>of 1997 even if you use the hardware of today.
>
>Uri

I don't believe they were only equivalent to 20M nodes. Simply because I know
how strong deep thought was from first-hand experience. But I don't have access
to the machine to do the same kind of testing I can do with Crafty. I _know_ how
much faster I run on my quad than I do on a single cpu. And _anybody_ can
measure that if they have a quad handy since the source for crafty is available.

Unfortunately, we don't have that luxury with DB2. But I find it very difficult
to believe that it was only a 20M machine effectively... particularly
considering that Hsu said more than once that he was driving the chess
processors at 70% duty cycle...
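To put the 20M versus 200M disagreement in rough perspective, here is a minimal sketch of how a raw node-rate ratio might translate into search depth. It assumes the common rule of thumb that tree size grows as ebf**depth, so a speedup buys about log(speedup) / log(ebf) extra plies. The effective branching factors and the helper names (`extra_plies`, `table`) are assumptions for illustration only; nothing here is measured from Deep Blue or any real program, and as the thread notes, raw depth and NPS do not compare cleanly across different programs.

```python
import math

def extra_plies(speedup, ebf):
    """Plies gained from a raw speedup, ASSUMING tree size grows as ebf**depth."""
    return math.log(speedup) / math.log(ebf)

# Upper end of the 20K-200K nodes/second range mentioned in the thread.
BASELINE_NPS = 200_000

# The two figures argued over in the thread.
CLAIMS = {
    "200M nps (quoted figure)": 200_000_000,
    "20M nps (Uri's estimate)": 20_000_000,
}

# Hypothetical effective branching factors; these are assumptions, not data.
ASSUMED_EBFS = (3.0, 4.0, 6.0)

def table():
    """Return (label, ebf, extra plies over the baseline) for every combination."""
    rows = []
    for label, nps in CLAIMS.items():
        speedup = nps / BASELINE_NPS
        for ebf in ASSUMED_EBFS:
            rows.append((label, ebf, round(extra_plies(speedup, ebf), 2)))
    return rows

if __name__ == "__main__":
    for label, ebf, plies in table():
        print(f"{label:26s} ebf={ebf:.1f}  ~{plies} extra plies")
```

Under these assumed branching factors, the tenfold gap between the two estimates works out to roughly 1.3 to 2.1 plies, which is one way to see why the dispute matters for any depth-based comparison.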
Last modified: Thu, 15 Apr 21 08:11:13 -0700