Computer Chess Club Archives


Subject: Re: (more, had to interrupt finishing previous post due to netscape prob

Author: Robert Hyatt

Date: 06:57:30 08/28/01


On August 28, 2001 at 09:50:37, Robert Hyatt wrote:

>On August 28, 2001 at 00:57:49, Tom Kerrigan wrote:
>
>>On August 27, 2001 at 22:52:41, Robert Hyatt wrote:
>>
>>>On August 27, 2001 at 19:40:42, Tom Kerrigan wrote:
>>>
>>>>
>>>>Exactly. I've heard you say over and over that DB is vastly different from DT.
>>>>And the code that's on Tim Mann's page is for DT. And it's a program for doing
>>>>automatic tuning against GM games anyway, not the kind of tuning that was
>>>>reportedly done for DB. Is it safe to assume that, because this is the best code
>>>>you can produce, you don't have _any_ actual DB-related code? And because
>>>>you have to guess at the speed of DB code based on CB speeds, that you don't
>>>>know _any_ specifics of the code they used? If that's the case, and it seems
>>>>like it is, I don't see what business you have making the guesses you've been
>>>>making and passing them off as informed estimates.
>>>
>>>Nope.  The DB guys have reported that they used the same approach.  Obviously
>>
>>Then why was I reading so many stories in Newsweek and whatnot about how some GM
>>was playing DB constantly and the programmers were changing the function to
>>avoid moves that he didn't like? How could they be doing this and simultaneously
>>relying on the automatic tuner? Why have the GM there at all? And what was the
>>GM playing before the chips were fabbed, if not a software version of the
>>algorithms? Doesn't make any sense.
>
>Again, why don't you _ask_ the guys that did this.  They explained this at
>one ACM panel discussion we had.  They apparently wrote a "tool" that they
>could use to adjust the evaluation.  But the tool was designed so that they
>could take a test set of positions and use those to "control" the changes to
>the eval, so that the eval would still favor the right moves in those test
>positions even though a specific term had been modified after a GM had found
>a weakness.
>
>This is all well-known.  And automatically tuning an evaluation against a test
>set of positions doesn't mean they can't respond to a GM comment.  They simply
>had to take the new position, plus the suggested correct move, and add that to
>their test suite and then run the auto-tune code once again.
>
>Pretty simple, once you actually think about it rather than trying to find
>excuses for why they couldn't really do it.
>
>
>
>
>>
>>>Today's cpus _are_ "that far away" from sustained 10 BIPS.  Which is what he
>>>said might be enough.  "might be".  Because in some respects he is just like
>>
>>Nope. The paper doesn't say "might." And today's processors can sustain more
>>than one instruction per cycle, meaning that cheap, readily available computers
>>are pushing 2k MIPS. If Hsu is to be believed, that means a fast PC would be
>>twice as fast as the single DB chip that took apart all those micro programs.
>>Surely that's strong enough to be interesting?
>
>If someone took the time to take the DB source code, plus the schematics for
>the DB chess processors, and turned that into a pure C program, it might well
>be interesting.  I can say with lots of confidence, however, that his original
>estimate is being twisted deeply by the way you are trying to interpret it.
>
>Do you _really_ think that a single CPU today can do 2.4 million nodes per
>second, doing _full_ endpoint evaluations, with an evaluation that they claim
>has over 8000 unique and different weights?  I don't.  It might be possible
>to get 240K nodes per second, with a _lot_ of work.  And by "lot" I mean
>man-years.  I wouldn't venture a guess as to how many man-years of effort went
>into the vectorized code in Cray Blitz.  And I certainly have no idea how
>many years it would take to re-invent all those algorithms in a form that
>would be reasonably fast on the PC.  The vector merge instruction did wonders
>for mobility calculations.  The PC has nothing similar, and it can't handle
>the memory bandwidth a simple software translation would require.
>
>I suspect that if Hsu tried to take the chess hardware and translate it all into
>C for a "production engine" that he would run into severe performance
>bottlenecks in some places.  And that he would have to spend significant amounts
>of time to find alternative approaches.  And he would probably scrap bits and
>pieces that are simply too costly to bear on a PC.
>
>Back to the original idea...  This is not just impractical.  It is much worse.
>
>
>
>
>
>>
>>>me...  He hasn't given a lot of thought to how he would do something in a
>>>software program that he is currently doing in hardware.  My vectorized attack
>>
>>Oh, I see. You treat every single word that dribbles out of his mouth as gospel,
>>unless he says something you don't care for. Then he just "hasn't given a lot of
>>thought" to it.
>
>
>Nope.  I'm just technically acute enough to recognize what is fact, and what
>is speculation.  If he had said "I have spent several months looking at what it
>would take to make this work on a PC..." then I'd take that as an accurate
>estimate.  When Gower asked me to port CB to the PC, I guessed it would be
>maybe 100X slower, if that much.  I was _badly_ wrong.  By almost a factor
>of 100x.  So there are estimates, and there are guestimates.  Guestimates
>usually happen when you really don't care about how accurate they are because
>you have no plans to really do the work and find out.
>
>
>
>
>
>>
>>>>(BTW, if you're interested, the same paper says that the DB chip took three
>>>>years to create. This is a far cry from the 9 months that you stated in another
>>>>post.)
>>>
>>>You are reading the wrong stuff.  The _first_ DB chip took maybe 3 years,
>>>and if you had read everything he wrote, and attended a lecture or two, you
>>>would know why.  There were some interesting problems he had to overcome that
>>>had nothing to do with chess.  Pads on the chip were too large.  Cross-coupling
>>>between signal lines on the chips was unexpected and required some cute
>>>hardware work-arounds.  Complete batches of chips were botched for various
>>>reasons.
>>
>>I don't see what this has to do with anything. I said it takes months or years
>>to make a chip. You say no, it took months. Now you say a different chip took
>>years. Either way, I'm right. You're sitting there working yourself into a
>>lather, trying (and failing) to contradict something that's only tangential to
>>my point.
>
>
>My original statement was that DB2 was designed, fabbed, tested, and used, all
>within roughly 9 months.  This is absolute fact and will make for interesting
>reading when his book comes out.  You then gave a quote about the original DB
>chip.  At that point in time, there was _no_ rush.  They were deciding what
>they wanted to do, and how they were going to do it.  The entire 3 years was
>not taken in the hardware design.  So I see no contradiction here at all,
>other than one you would like to imagine.
>
>
>
>
>>
>>>>Okay, so what is it? Is it one with a pawn lever? Or one without a pawn ram?
>>>>Seems like both of those could be considered potentially open files, and they
>>>>aren't exactly expensive to evaluate.
>>>
>>>Says the man that hasn't evaluated them yet.  :)
>>>
>>>You have to see if the pawn can advance to the point it can make contact
>>>with an enemy pawn without getting lost.  It is definitely non-trivial.
>>>From the man that _does_ evaluate them now.
>>
>>Bully for you. You have your idea of what a potentially open file is, and I have
>>other perfectly legitimate ideas. And you obviously don't have a clue what DB's
>>idea is, or you would have told us by now. And that means you don't have a clue
>>how expensive their term is. And that means I'm still not impressed.
>
>
>I _do_ know what they did.  In fact, I explained _exactly_ what a potentially
>open file is.  All you have to do is read it.  Or just read any good book on
>chess.  You'll find this discussed.  If you have "other perfectly legitimate
>ideas" about what a potentially open file is, charge full speed ahead.  And
>along the way re-define open and half-open to whatever you want as well.  But
>the terms _do_ have precise meaning.  The classic definition is "I have a
>potentially open file if I can force the file open whenever I want by moving
>pawns."  If my opponent can do that, then he has one too.  If either of us
>can do that, the file is potentially open for both of us.  Play some chess.
>You'll get it.
>
>
>
>
>>
>>>Only those that haven't done this.  DB was written in C.  Plus microcode for
>>>the chess processors (first version).  Plus evaluation tables.  The issues are
>>>the same.  Porting a program from one environment (hardware or vector in my
>>
>>If you want to think that porting your stupid population count function is in
>>any way even similar to porting DB's matrices of evaluation cells, fine. You're
>>flattering yourself.
>>
>
>It _is_ similar.  If Crafty were designed in hardware, I would not have to
>handle rotated bitmaps as I now do.  Because I could rotate the real bitmaps
>in hardware quite trivially.  And then when I tried to port that to software
>I would run into something that would be a _real_ problem.  A direct port
>would mean taking a standard bitmap and "rotating" it by moving each of the
>64 bits to a new position.  128 instructions on a 64-bit machine, nearly double
>that on a 32-bit machine.  And that is the kind of thing they would probably hit
>in several places, because hardware solutions are _different_ than software
>solutions.
>
>If you don't get that, there is little I can do about it other than to tell you
>to get some experience then come back when you are ready to talk about this in
>an informed way.  There are simply some things that are easy to do with
>special-purpose hardware, while these same things are incredibly inefficient in
>software.
>
>
>
>
>>>>Again, assuming your 1M figure is anywhere near accurate. You're claiming that a
>>>>DB node is worth about five thousand (5,000) (!!) "regular" PC program nodes.
>>>>What on EARTH can POSSIBLY take 5,000 nodes worth of computation to figure out?
>>>>You're going to have to do way better than your lame "potentially open file"
>>>>thing to sell that to anyone.
>>>
>>>I'm not saying any such thing.  I simply said that they do a _bunch_ of things
>>
>>Of course you're saying any such thing. You're saying DB would take a 1Mx perf
>>hit. It searches 200M NPS. Do the division, you get 200 NPS. PC programs search
>>1M NPS. Do more division. You get 5,000 PC program nodes searched per your
>>hypothetical DB node. Let's try this again, what on EARTH can POSSIBLY take
>>5,000 nodes worth of computation to figure out?
>>
>>>That's all I have said, although I _have_ said it often.  You are trying to
>>>mix up the emulation of their evaluation, which I say would be hugely slow
>>>on today's PCs.  So, to be clear, the hardware they had was quite good.  And
>>>any sort of software emulation would be highly ugly.  Because things done in
>>
>>I didn't bring up the idea of "emulation." I've been talking about
>>re-implementation of the algorithms. You don't have to emulate anything to do
>>that. You seem to think that software DB has to recognize a bishop pair by
>>having cells and pipeline stages and cascaded adders. Why not just count the
>>stupid bishops? Why is the idea of simply writing terms consistently escaping
>>you? Why do you keep talking about "emulation" and "porting"? Does the idea of
>>DB's eval running reasonably fast in software upset you so much that you're in
>>denial?
>>
>>I'm going home.
>>


No... your last paragraph says it all...  "I have no idea what hardware is all
about in this context."  Some terms _are_ simple to "write".  Others are not.

One example that you might follow:

Suppose I have a hardware engine that is based on Crafty.  I need a bitmap of
the occupied squares, but I want it rotated 90 degrees so that ranks become
files and files become ranks.  I can simply pass the original bitmap through a
set of gates that will shuffle the bits as needed and compute the rotation in
less than one hardware cycle.  Pretty easy to visualize that, I hope.  But now
I want to do this in software.  How are you going to do this?  With 64 ANDs to
isolate each source bit, 64 shifts to shift them to their pre-assigned
destination in the rotated bitmap, then 64 ORs to merge them together?  While
my hardware solution took zero time?
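
To make that concrete, here is a minimal C sketch of that bit-at-a-time
software rotation.  This is illustrative only (it is not Crafty's or DB's
actual code), and the square numbering shown is just one possible convention:

   #include <stdint.h>

   /* Rotate a bitmap so that ranks become files and files become ranks.
      Squares are numbered 0..63 as rank*8 + file, so the bit for
      (rank, file) moves to (file, rank).  Hardware does this with wiring
      alone; software has to isolate, shift, and merge every occupied bit,
      which is the 64 ANDs / shifts / ORs described above. */
   uint64_t rotate_bitmap(uint64_t board) {
     uint64_t rotated = 0;
     int sq;
     for (sq = 0; sq < 64; sq++) {
       if (board & (1ULL << sq)) {              /* isolate one source bit */
         int rank = sq >> 3, file = sq & 7;
         rotated |= 1ULL << (file * 8 + rank);  /* merge it into its new spot */
       }
     }
     return rotated;
   }

A loop like this costs on the order of a couple hundred instructions every
time the board changes, which is exactly the cost the hardware version never
pays.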

That is the kind of problem someone would have after spending 10 years to
discover hardware tricks to do time-critical things in special purpose hardware,
and then trying to convert it to a pure C program.  Things that are fast in
hardware can be slow in software, while the reverse is not true.  And that
makes this a serious problem.

I am finished with this discussion.  They always go nowhere.  You may have the
last word, whatever that may be.

>>-Tom


