Author: Robert Hyatt
Date: 08:22:08 08/25/02
On August 25, 2002 at 01:40:30, Uri Blass wrote:

>On August 25, 2002 at 00:41:55, Robert Hyatt wrote:
>
>>On August 24, 2002 at 18:31:07, Uri Blass wrote:
>>
>>>On August 24, 2002 at 17:59:43, Robert Hyatt wrote:
>>>
>>>>On August 23, 2002 at 12:24:45, Vincent Diepeveen wrote:
>>>>
>>>>>On August 22, 2002 at 20:25:24, Robert Hyatt wrote:
>>>>>
>>>>>>On August 22, 2002 at 18:22:56, Uri Blass wrote:
>>>>>>
>>>>>>>On August 22, 2002 at 18:01:09, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On August 21, 2002 at 20:10:26, Mike S. wrote:
>>>>>>>>
>>>>>>>>>On August 21, 2002 at 11:07:58, Robert Hyatt wrote:
>>>>>>>>>
>>>>>>>>>>(...)
>>>>>>>>>>1. They reported depth as 11(6) for example. According to the deep blue
>>>>>>>>>>team, and regardless of what others will say about it, this supposedly means
>>>>>>>>>>that they did 11 plies in software, plus another 6 in hardware.
>>>>>>>>>
>>>>>>>>>When I looked at some of the logs, I had the impression that "11(6)" was
>>>>>>>>>reported most often, IOW, we can probably say that it was the *typical* search
>>>>>>>>>depth reported (except additional extension depths we do not know) in the
>>>>>>>>>middlegame in 1997. Would you agree with that from your study of the logs?
>>>>>>>>>
>>>>>>>>
>>>>>>>>I thought so. But since the paper quotes 12.2, that would mandate that 12
>>>>>>>>must come up more often than 11. I haven't gone thru each log in that kind
>>>>>>>>of detail as that is a recipe for a headache. :)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>Another thing I'm not sure of is: *when* could it relatively safely be claimed
>>>>>>>>>that DB's depth is reached again:
>>>>>>>>>
>>>>>>>>>a) when a current prog reaches at least 16 plies as a typical middlegame depth,
>>>>>>>>>   because some search techniques used now (which DB didn't use) make up for
>>>>>>>>>   the missing ply (at least), or
>>>>>>>>>b) when 17 plies are reached, not earlier, or
>>>>>>>>>c) a program would have to reach more than 17 plies, because DB used much more
>>>>>>>>>   knowledge which current software probably does not yet use to that extent.
>>>>>>>>>
>>>>>>>>>I'm looking for experts' opinions on *when* we can say something like "Yes, now
>>>>>>>>>with this specific performance [## plies etc.] we can safely say - as it's our
>>>>>>>>>*best guess*, since no direct head-to-head match is possible - that this new
>>>>>>>>>chess computer is better than Deep Blue was."
>>>>>>>>
>>>>>>>>I don't see any real way to do this. IE take the following types of
>>>>>>>>programs and try to compare depths:
>>>>>>>>
>>>>>>>>1. Junior, which uses a different definition of ply than everyone else.
>>>>>>>>They appear to search _much_ deeper than anyone else, based only on this,
>>>>>>>>but Amir has explained how he counts plies, and the bottom line is that
>>>>>>>>raw ply depth can't be compared.
>>>>>>>>
>>>>>>>>2. A very dumb and fast program, with no q-search to speak of. Since the
>>>>>>>>q-search is at _least_ 50% of the total tree search space, lopping that off
>>>>>>>>gets more depth. But how to compare 14 plies with no q-search to 12 plies
>>>>>>>>with q-search?
>>>>>>>>
>>>>>>>>3. Lots of selective search extensions. This program might only search
>>>>>>>>9 plies deep on average, but it extends the _right_ moves at the right times,
>>>>>>>>so that even though it is only searching 9 plies deep, it beats the "22-ply
>>>>>>>>searching Junior program" handily.
>>>>>>>>
>>>>>>>>4. Lots of other variations. The bottom line is that depth is not an easy
>>>>>>>>way to compare programs. Neither is NPS. Unless you see some _real_ depth
>>>>>>>>that is way beyond everyone. Or some real NPS that is way beyond everyone.
>>>>>>>>
>>>>>>>>For example, we have had a couple of very fast/dumb programs compete over
>>>>>>>>the years, and they have managed to do very well, because their speed and
>>>>>>>>tactics overcame their lack of positional understanding, when playing the
>>>>>>>>opponents they drew in the ACM/WCCC events. We have also seen very slow
>>>>>>>>programs out-play everyone. But we are talking about programs that are
>>>>>>>>generally within an order of magnitude of each other. Say 20K nodes per
>>>>>>>>second to 200K nodes per second. If someone suddenly hits the scene going
>>>>>>>>200M nodes per second, then that is a serious number if it is real... So
>>>>>>>>even though I generally say that comparing NPS is a bad idea unless you are
>>>>>>>>using the _same_ program, there are logical exceptions...
>>>>>>>>
>>>>>>>>>
>>>>>>>>>But the claim should be illustrated by somewhat convincing figures (node rate is
>>>>>>>>>not convincing enough IMO, although still impressive). Maybe the ply depth is; I
>>>>>>>>>know it's also not a perfect comparison though. But we probably don't have
>>>>>>>>>anything better. A few positions/moves to compare are not enough.
>>>>>>>>
>>>>>>>>I think you have to look at results above all else. IE for IBM, deep thought
>>>>>>>>totally dominated computer chess for 10 years, losing one well-known game. That
>>>>>>>>is tough to do if you are not far better than everyone else. Since their last
>>>>>>>>computer event in 1995, suddenly they started going 100X faster. So they have
>>>>>>>>a significant boost there, unless you do as some do and conclude that the
>>>>>>>>extra speed means nothing.
>>>>>>>
>>>>>>>I conclude that it was not 100 times faster.
>>>>>>>
>>>>>>>1) 200M nodes is wrong based on the paper of Hsu.
>>>>>>>2) They suffered from a lack of efficiency because they preferred
>>>>>>>to improve the evaluation and not to fix
>>>>>>>the efficiency problems.
>>>>>>>
>>>>>>>I will not be surprised if their nodes were equivalent only
>>>>>>>to 20M on a single PC, which is also a very good achievement.
>>>>>>>
>>>>>>>I also believe that they were better than the programs
>>>>>>>of 1997 even if you use the hardware of today.
>>>>>>>
>>>>>>>Uri
>>>>>>
>>>>>>
>>>>>>I don't believe they were only equivalent to 20M nodes. Simply because I
>>>>>>know how strong deep thought was from first-hand experience. But I don't
>>>>>>have access to the machine to do the same kind of testing I can do with
>>>>>>Crafty. I _know_ how much faster I run on my quad than I do on a single
>>>>>>cpu. And _anybody_ can measure that if they have a quad handy since the
>>>>>>source for crafty is available.
>>>>>>
>>>>>>Unfortunately, we don't have that luxury with DB2. But I find it very
>>>>>>difficult to believe that it was only a 20M machine effectively...
>>>>>>particularly considering that Hsu said more than once that he was driving
>>>>>>the chess processors at 70% duty cycle...
>>>>>
>>>>>If you look in the paper, their reported speedups were extrapolated.
>>>>>So they measured what 1 cpu did and compared with a few processors,
>>>>>then used that number for 480 processors instead of measuring 480.
>>>>
>>>>Vincent, this has to do with _that_ paper. IE it should be
>>>>pretty obvious why they had to extrapolate at all. All they have is
>>>>DB Jr to work with.
>>>>
>>>>Hsu did _lots_ of testing on the real DB machines when he had time. And
>>>>he did _real_ speedup testing just like we do. Don't confuse what was
>>>>in _that_ paper and assume that is _all_ they did. It wasn't...
>>>>
>>>>I've seen some speedup stuff for DB1 in fact. I saw a couple of test
>>>>positions where DB1 ran about 25 times faster with 200+ processors than
>>>>it did with just one. I saw a couple of others where it was more like
>>>>50... That isn't great, but it is _not_ "bad". He gave me a number
>>>>of 30% way back, which I have quoted before. IE with 200 processors
>>>>he said that 30% of that was a good estimate... That was a number he
>>>>also mentioned in his dissertation...
>>>
>>>I do not know what he told you but I read an estimate for efficiency of 8-12% for
>>>Deeper Blue.
>>
>>
>>I think I posted that sometime back. The 30% was (as best I recall) for deep
>>blue 1. DB2 had a much different hardware search, where they added singular
>>extensions among other things. That would likely break the search load
>>balancing to an extent.
>>
>>But even if it is 10%, that is not bad. Nor 12%. IE the gross peak speed
>>was > one billion nodes per second. 12% would be 120M nodes per second, and
>>that includes the SMP loss. 120M is a terrific speed to search. For example,
>>my 1.5M on the quad is more like 1.1M due to smp search overhead.
>>
>>I think we have lots of different numbers being mixed up carelessly by the
>>DB team.
>>
>>1. NPS. The raw NPS searched in a game, such as the number reported by Crafty,
>>and the deep * programs. Unfortunately, these are inflated numbers because many
>>of those nodes would not be searched by a serial search and are therefore
>>worthless.
>>
>>2. NPS. The effective NPS searched in a game. For example, I could easily
>>calculate this for Crafty as follows: do a search to some target depth and
>>display the nodes needed. Then use 4 processors to see how long it now takes
>>to get to that same target depth. Say the answer for position X is 2.0. Then
>>the NPS for this position should really be displayed as 2.0 * serial NPS since
>>the search is effectively twice as fast. For position Y, it is 3.1X faster,
>>so the NPS should be 3.1 * serial NPS for this position.
>>
>>In many cases, Hsu used number 2, while the rest of us are blindly using
>>#1 because it is far easier to calculate this in a chess engine, since we
>>really have no idea how long a serial search would take without re-running the
>>game with one cpu to find out. So we display NPS #1 and go about our business.
>>
>>Hsu doesn't know the _real_ nps like we do, because the chess processors can't
>>count them. I asked Ken why once and he said (for Belle, which was the
>>predecessor of a single DB chip) "it takes as long to count a node as it does
>>to search it..."
>
>I do not understand it.
>Counting nodes is clearly easier than searching them and I see no reason why it
>would be different for Deeper Blue.

You don't understand it because you aren't a hardware designer. And that is
perfectly ok. How long does it take a traditional microprocessor to execute an
add operation? That involves a fetch from memory, an increment, and a memory
store. Deep Blue (and Belle and Bebe and other hardware machines) can search a
node in that same length of time... because it is being done in hardware. That
is the issue. Searching a node doesn't require a memory access for starters.
Nor does it require an adder tree to add a 64 bit number... that takes hardware
space, and time.

>
>If the hardware can give information to the software about the search, I do not
>see what the problem is with also giving information about the number of nodes.

All the hardware gives to the software is the best move and the score. Passing
back a node count would add another cycle and cost at least 50% overhead in
doing so. Remember that I have mentioned before that DB chess chips can't give
a PV. That is the reason why.

>
>I think that not knowing the number of nodes is a mistake because the number of
>nodes can be used for better move ordering or for better extension rules.

Not in hardware. You are stuck with what you have... When you _simulate_ the
chip, yes. But when you commit it to silicon, it is too late to think about
improvements then.

>
>>So to figure out NPS, they have to do a lot of work to measure
>>how long each chess processor is busy, multiply that by the known hardware
>>speed, and then sum that for all chess processors. Hsu frequently used a
>>"real (#2) NPS" because it was more honest.
>
>I think that it is better simply to give both numbers to avoid confusion.
>If somebody writes NPS and not effective NPS then I think that the meaning is
>simply the number of nodes.

I agree. Hsu started this with his dissertation. And it was confusing.

>
>Uri
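To put numbers on the "#1 versus #2" NPS distinction discussed above, here is a
minimal C sketch of the arithmetic being described: time the same position to
the same target depth with one cpu and with four, then scale the single-cpu NPS
by the observed speedup. The values in it are made-up placeholders rather than
Crafty or Deep Blue measurements; only the 480-processor count and the 70% duty
cycle are taken from the thread, and the per-chip peak speed is an assumption
used purely for illustration.

#include <stdio.h>

int main(void) {
  /* Effective ("#2") NPS from a speedup measurement.  Placeholder numbers. */
  double serial_nodes  = 50.0e6;  /* nodes searched to the target depth with 1 cpu */
  double serial_time   = 40.0;    /* seconds to reach that depth with 1 cpu        */
  double parallel_time = 13.0;    /* seconds to reach the same depth with 4 cpus   */

  double serial_nps    = serial_nodes / serial_time;   /* raw ("#1") 1-cpu NPS      */
  double speedup       = serial_time / parallel_time;  /* ~3.1x in this example     */
  double effective_nps = speedup * serial_nps;         /* "#2" NPS for this position */

  printf("serial NPS    : %.0f\n", serial_nps);
  printf("speedup       : %.2f\n", speedup);
  printf("effective NPS : %.0f\n", effective_nps);

  /* Hardware-side estimate in the same spirit: busy fraction of each chess
     processor times its known peak speed, summed over the processors (a plain
     product here because every chip is assumed identical).  The per-chip peak
     is an assumed figure, not a DB number. */
  int    processors = 480;    /* processor count mentioned in the thread */
  double duty_cycle = 0.70;   /* duty cycle mentioned in the thread      */
  double chip_peak  = 2.0e6;  /* assumed peak NPS per chess processor    */

  printf("estimated hardware NPS: %.0f\n", processors * duty_cycle * chip_peak);
  return 0;
}

The only point of the exercise is that the raw and effective figures can differ
by the whole SMP efficiency factor, which is exactly why mixing them up causes
confusion.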