Author: Robert Hyatt
Date: 19:52:41 08/27/01
On August 27, 2001 at 19:40:42, Tom Kerrigan wrote:

>Exactly. I've heard you say over and over that DB is vastly different from DT.
>And the code that's on Tim Mann's page is for DT. And it's a program for doing
>automatic tuning against GM games anyway, not the kind of tuning that was
>reportedly done for DB. Is it safe to assume that, because this is the best code
>you can produce, that you don't have _any_ actual DB-related code? And because
>you have to guess at the speed of DB code based on CB speeds, that you don't
>know _any_ specifics of the code they used? If that's the case, and it seems
>like it is, I don't see what business you have making the guesses you've been
>making and passing them off as informed estimates.

Nope. The DB guys have reported that they used the same approach. Obviously they had to add code to match what DB1 and then DB2 could evaluate, but I assumed that was pretty obvious. I see no reason that anybody would take any great pains to optimize something that is not time-critical in any way, when they had so much other work to do for the match. Some "informed guesses" plus direct information from some of the team members gives pretty good info. I'm not the _only_ one that has talked to them, asked them questions, then reported the results here...

>>There is no "knee-jerk". Hsu says "XXX". You say "I don't believe XXX". There
>>is little to justify that when _you_ don't _know_.
>
>I said "I don't believe this" to the idea that a software implementation of DB
>would be "so slow as to be worthless." When did Hsu say that a software
>implementation of DB would be so slow as to be worthless? In fact, when did Hsu
>say anything? I did some web searching and all I could find of his was some open
>letters about unrelated issues and an early paper on DB, with the estimate that
>a general purpose CPU would have to run at 10k MIPS to do what the DB chip does.
>Well, CPUs aren't THAT far away from 10k MIPS these days, so if you want to read
>anything into Hsu's words, it seems like he's siding with me.

Today's CPUs _are_ "that far away" from sustained 10 BIPS, which is what he said might be enough. "Might be." Because in some respects he is just like me: he hasn't given a lot of thought to how he would do something in a software program that he is currently doing in hardware. My vectorized attack-detection code used 2% of the execution cycles in Cray Blitz. On the PC this ballooned to 25%, even after we rewrote it to better suit the PC. Had you asked me how it would run prior to that, I would not have thought that a few dozen clock cycles on a Cray would turn into a few thousand cycles on a PC, because doing that never occurred to me until my chess partner asked, "Hey, I found a good FORTRAN compiler for the 386... can you compile the pure FORTRAN version of CB so I can run it at home?" He found it useless due to the incredibly slow speed.

>(BTW, if you're interested, the same paper says that the DB chip took three
>years to create. This is a far cry from the 9 months that you stated in another
>post.)

You are reading the wrong stuff. The _first_ DB chip took maybe 3 years, and if you had read everything he wrote, and attended a lecture or two, you would know why. There were some interesting problems he had to overcome that had nothing to do with chess: pads on the chip were too large; unexpected cross-coupling between signal lines on the chips required some cute hardware work-arounds; complete batches of chips were botched for various reasons. All you have to do is ask him... DB 2 was _definitely_ done in 9 months from concept to production. His book will tell the story if/when it is published.

>>>You may think the cost is too high, but I know for a fact that there are a ton
>>>of extremely strong programs out there that have these terms.
>>
>>Name that "ton". I've seen Rebel play.
>>It doesn't. I have seen most every
>>micro play, and fall victim to attacks that say "I don't understand how all
>>those pieces are attacking my king-side..."
>
>I won't name the programs because I don't know if the authors would want me to.
>And I wasn't thinking of Rebel.
>
>>What is there to understand? A potentially open file is a very concrete
>>thing, just like an open file or a half-open file is. No confusing definitions.
>>No multiple meanings.
>
>Okay, so what is it? Is it one with a pawn lever? Or one without a pawn ram?
>Seems like both of those could be considered potentially open files, and they
>aren't exactly expensive to evaluate.

Says the man that hasn't evaluated them yet. :) You have to see if the pawn can advance to the point where it can make contact with an enemy pawn without getting lost. It is definitely non-trivial. From the man that _does_ evaluate them now.

>>Not "difficult to do". I believe I said "impossibly slow". There _is_ a
>>difference. Everything they do in parallel, you would get to do serially.
>>All the special-purpose things they do in a circuit, you get to use lots of
>>code to emulate. I estimated a slow-down of 1M. I don't think I would change
>>this. Cray Blitz lost a factor of 7,000 from a Cray to a PC of the same
>>time period. Solely because of vectors and memory bandwidth. Crafty on a Cray
>>gets population count, leading zeros, all for free. Because there are
>>special-purpose instructions to do these quickly. DB was full of those sorts
>>of special-purpose gates.
>
>No, you're completely confusing the entire issue. Was DB written in Fortran, or
>Cray assembly? Did it run on a Cray? Does it have anything to do with a Cray?
>Does it even implement the same evaluation function? How about the same search?
>There are enough variables in your "estimation" here to make any legitimate
>scientist puke.

Only those that haven't done this. DB was written in C. Plus microcode for the chess processors (first version).
Plus evaluation tables. The issues are the same. Porting a program from one environment (hardware, or vector in my case) to another (software, or non-vector in my case) presents huge performance problems. And if the end result is not important, the "port" will be sloppy, because the goal is to get it done quickly, period. Not to make it efficient.

>>>You've spent years building up DB's evaluation function. Surely you can see some
>>>benefits (even aside from commercial) of having this thing run on widely
>>>available hardware.
>>
>>At 1/1,000,000th the speed of the real McCoy? Again, what would one learn from
>>such a thing? What could I learn from working with a 1nps version of Crafty,
>>when it is going to run at 1M nodes per second when I get ready to play a real
>>game?
>
>Again, assuming your 1M figure is anywhere near accurate. You're claiming that a
>DB node is worth about five thousand (5,000) (!!) "regular" PC program nodes.
>What on EARTH can POSSIBLY take 5,000 nodes worth of computation to figure out?
>You're going to have to do way better than your lame "potentially open file"
>thing to sell that to anyone.

I'm not saying any such thing. I simply said that they do a _bunch_ of things in their eval, in parallel, not to mention the mundane parts like maintaining the chess board. I consider their raw NPS to be 200X that of a traditional micro of today. I consider their effective NPS to be 5X more than that, based on the eval things they can do for nothing that we don't do because of the costs. That's all I have said, although I _have_ said it often. You are mixing that up with the emulation of their evaluation, which I say would be hugely slow on today's PCs. So, to be clear: the hardware they had was quite good. Any sort of software emulation would be highly ugly, because things done in hardware often don't translate "nicely" into software.
The special-purpose bit counting/finding instructions on the Cray are well-known examples: they take a clock cycle on the Cray, but dozens of clock cycles on a PC. I don't know how to explain it any better. Until you have done it, you might simply be unable to understand it. I'm not going to keep going over it, however.

>>We know how DB (single-chip) did when slowed to 1/10th its nominal speed
>>and played against top commercial programs. That was reported by me first,
>>then others asked about it at lectures by the DB team and we got even more
>>information from those reports.
>
>No, we don't "know" that. Where are the reports? Where are the game scores?

Someone here can give more information. I reported on the first 10-game match. We later found out there were 40 games; someone _else_ found this out at a lecture by Campbell. Since he said it, I feel confident that it happened. There are _some_ people that can be trusted to be honest.

>>I am _certain_ that taking DB from hardware to software would cost a lot.
>>You would lose a factor of 480 because of the chess chips. You would lose
>>a factor of 32 because of the SP. You would lose a factor of something due
>>to the cost of doing Make/UnMake/Generate/Evaluate in software during the
>>software part of the search, rather than getting to use the hardware they
>>had to handle these mundane parts of the software search. 32 X 500 is over
>>10,000 already. And it is only going to get worse.
>
>10k is a _really_ far cry from 1M.

I simply stopped at 10K. My bit-count instruction on the Cray is a couple of clock cycles; on the PC it is about a hundred. That is another factor of 50. Now we are at 500K. I have no idea what problems they would encounter that would match the problems I found in trying to do some things on a PC that were trivial on the Cray. IE, my mobility was murder on the PC. On the Cray, it was basically "free" (qualitative mobility, as I explained it to Vincent a year ago).
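To make the bit-count point concrete, here is a minimal sketch (not DB or Cray Blitz code; the function name and figures are mine for illustration) of what a PC of that era had to do in software for something the Cray did in a single instruction. Each call runs a loop of several instructions per set bit, which is where a "couple of cycles versus about a hundred" gap comes from:

```c
#include <stdint.h>

/* Software population count over a 64-bit bitboard.
   A Cray counted the bits in one hardware instruction;
   without such an instruction, a loop like this (or a
   lookup table) is needed, costing dozens of cycles
   per call in an inner loop like move generation. */
static int popcount64(uint64_t b) {
    int n = 0;
    while (b) {
        b &= b - 1;   /* Kernighan's trick: clear the lowest set bit */
        n++;
    }
    return n;
}
```

Modern CPUs close this particular gap with a hardware POPCNT instruction, which is exactly the kind of special-purpose support being discussed.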
>Besides, if you think that DB's algorithms
>are completely worthless if they aren't running on their fast hardware, why
>doesn't that apply to any other PC program? Are they all worthless because they
>don't search 200M NPS? Or because they can be run on slower PCs? Or because they
>will be run on faster PCs in the future? What you're saying is basically, "why
>have a chess program?" I'm surprised you haven't thought of any reasons by now.

I have absolutely no idea what you are rambling about. A chess engine designed to search on hardware that can do 1K nodes per second is a _far_ different chess engine than one designed to run on hardware that can search 1M nodes per second. Yes, the 1K program will be better at 1M. But the 1M program will be far worse at 1K than a program designed for 1K. Which simply means that part of the design process factors in the speed of the search. Or at least good programs do.

>>When your data is flawed, you need more. Crafty lost one game at a time
>>handicap. Ed then played more games with Crafty at the same time control,
>>but with Rebel at that time limit also. And the result was much different.
>>Which suggests that the first (and only) handicap game was a fluke, which
>>is certainly the most likely truth.
>
>Changing the experiment does not magically invalidate data. If you want to call
>all of your losses "flukes," fine.

One game is completely statistically invalid to predict _anything_, which is why we originally settled on 10 games. If you can draw conclusions from one game, feel free. I can't. I would prefer 100 or 1000 to get some statistical significance.

>>I won't try to speculate why they reported 200M. Hsu was a scientist. With
>
>Why is there any need to speculate? I think I posted a perfectly legitimate
>potential explanation for the number. There are probably more possible
>explanations. Why in the world do you refuse to take his number at face value?
>
>-Tom

I do. I have read everything he has written.
He gave the speeds of the two batches of chess processors. He gave the total number. He gave the 70% duty-cycle number. That comes to 700M. In yet another place (his Ph.D. thesis, I believe) he claimed 20-30% search efficiency for his two-level parallel search. All of those numbers, taken together, could be used to derive the 200M number in several different ways. I suspect my conclusion is closest to the truth. He has reported depths of 12 plies. When pressed, he then responded with "yes, we were doing 12 plies in software plus 5-7 in hardware." If no one thinks to ask about his numbers, and just takes them at face value, the conclusions can be wrong...
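As a back-of-envelope sketch of how those numbers could be combined (my reading of the figures mentioned in this thread, not official DB data: ~480 chess chips, a per-chip speed of roughly 2M positions per second, the 70% duty cycle, and ~30% parallel-search efficiency):

```c
/* Hypothetical reconstruction: raw NPS from chip count, per-chip
   speed, and duty cycle; effective NPS after parallel-search
   efficiency. With the assumed inputs, raw comes to ~672M and
   effective to ~202M, near the published 200M figure. */
static double raw_nps(int chips, double per_chip_nps, double duty_cycle) {
    return chips * per_chip_nps * duty_cycle;
}

static double effective_nps(double raw, double search_efficiency) {
    return raw * search_efficiency;
}
```

Different choices of per-chip speed or efficiency within the quoted ranges land in the same neighborhood, which is why the 200M number can be "derived in several different ways."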
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.