Computer Chess Club Archives



Subject: Re: hardware math

Author: Vincent Diepeveen

Date: 10:04:11 10/11/02



On October 11, 2002 at 12:12:18, Robert Hyatt wrote:

>On October 11, 2002 at 11:07:52, Vincent Diepeveen wrote:
>
>>On October 11, 2002 at 10:38:12, Jeremiah Penery wrote:
>>
>>>Hmm, let's see.  If DB gets "upgraded to 2002 standards", that would mean they
>>>can make a fully custom .13 micron chip running at 300MHz, able to do a full
>>>evaluation every clock cycle.  It will also have 20GB/s memory bandwidth to
>>>256MB of RAM for the hash tables on the board.  So one single chip will search
>>>300M positions/second, and they can do whatever evaluation they want.  Yes, yes,
>>>obviously a 'complete joke'.
>>
>>I'm more afraid of Brutus on something like a 30 MHz FPGA than I am of a
>>Deep Blue at 0.13 micron.
>>
>>First of all, Deep Blue wasn't written in Verilog or any 'high level'
>>language. The logic was simply cut and pasted together.

>Vincent, Hsu used "Project MOSIS", funded by the NSF, to produce his chess chips
>for the first few versions.

>Please do a web search on MOSIS and look at their requirements as to how you
>submit something to them to fabricate.
>
>Then come back and say "my statement was stupid, I learned a new word 'verilog'
>and used it without knowing what I was talking about."  Because that statement
>is completely true.  You _must_ use software design tools to submit something
>to MOSIS.  You think you just give them a picture and say "fab me one of these?"

>Stick to an area where you have at least some small idea of what you are talking
>about.  If you want proof that they used MOSIS, read his book.  Or ask him
>directly.  But please stop spewing random noise.

What I meant is that he would need a complete redesign of the old chips
to get them onto 0.13 micron, and that is before we even talk about the
5 years or so needed to improve his evaluation function. From my draughts
program I know how slowly improving an evaluation function goes if you do
not know shit about the game in question, and this despite having someone
nearby (2 streets away: Marcel Monteba) who is very knowledgeable in draughts.

>>
>>So it would require an entirely new design to make something for 0.13
>>in Verilog or whatever.
>>
>>Secondly, that 0.13 process technology, including a big salary for Hsu,
>>would be an investment of around 20 million.
>
>No it wouldn't.  Hsu would not need to do anything other than re-do the design
>and submit it to an existing fab shop to produce the chips.  It wouldn't be
>cheap, but it wouldn't cost a fortune either.  The only cost would be Hsu's
>salary, and the fab cost for a run of N chips, where N would probably need to
>be at least 1000.  I don't claim to have an idea of what the cost would be, as
>IC fabrication is not something I follow closely.  But it _would_ be fast as
>all hell, because rather than 20 MHz they could go 100X faster with no problems
>at all, and probably do a better design since the DB chips had to make
>concessions for routing and gate delays that could be better handled today.

I'm not assuming the 'university' deal here, but a commercial release
of his CPU. Of course he would need to press something like 100000 CPUs.
Times say 50 dollars a CPU = 5 million dollars. To start with.

That's of course excluding paying for the machines, the expensive software
to recalculate all his stuff for the 0.13 process, his own salary for
5 years, and the many advisors and other idiots who want to eat from the
project.

You get to 10+ million directly and go to 20 million dollars pretty quickly.
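
As a rough check of that arithmetic, here is a minimal Python sketch. Only the chip count and the unit price come from the post; the 10 and 20 million totals are the post's own round figures, and the split between chip run and "everything else" is simply the difference, not a real budget.

```python
# Back-of-envelope check of the estimate above.
CHIPS = 100_000        # "like 100000 cpu's or so"
UNIT_COST_USD = 50     # "say 50 dollar a cpu"

def production_cost(chips: int, unit_cost_usd: int) -> int:
    """Cost of the chip run alone, in dollars."""
    return chips * unit_cost_usd

def non_chip_share(total_usd: int, chip_run_usd: int) -> int:
    """What is left for tools, machines, salaries and advisors."""
    return total_usd - chip_run_usd

chip_run = production_cost(CHIPS, UNIT_COST_USD)
print(f"chip run: ${chip_run:,}")  # chip run: $5,000,000

# The post's round totals; everything above the chip run is unitemized.
for total in (10_000_000, 20_000_000):
    print(f"total ${total:,}: ${non_chip_share(total, chip_run):,} not chips")
```

So the chip run is a quarter to a half of the claimed total, and the rest rests on costs the post does not itemize.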

On the other hand, instead of begging many companies for sponsorship each
year, he could also buy an FPGA board for a couple of thousand,
rewrite his stuff in Verilog or whatever language you want, and
he just needs half an hour to synthesize the code for the chip.

It's then ready to be tested within half an hour of a source code
change!

Way less expensive than the 0.13 option :)

So who is stopping Hsu from putting it into an FPGA?

If I understand correctly it's peanuts to get it to 30 MHz there and
not so expensive to buy a few more cards. Total budget perhaps 10000 dollars.

Why beg for so much money to make ASIC CPUs?

Let him first prove how the thing plays against other programs using
a single 20-24 or even 30 MHz FPGA CPU.

No need for 20 million for a 0.13 release.

>>This versus an FPGA board with some tools, which you can get for a couple of
>>thousand euros (1 euro = about 1 dollar at the moment).
>>
>>Further, Hsu would have to prove a number of things,
>>   such as being capable of implementing all kinds of things like
>>   nullmove, efficient move ordering, and a lot of evaluation
>>   terms in hardware. It's not trivial to add RAM to the
>>   chip, because fetching a single cacheline from RAM is a lot slower
>>   than processing a bunch of nodes in hardware. If you run at 300 MHz
>>   with say 10 clocks a node on average, you can achieve about
>>   30 million nodes a second.
>
>It is trivial to add RAM to a chip.  SRAM.  With today's densities, significant
>SRAM is common.  L1/L2/L3 cache comes to mind.
>
>
>>
>>   However, you can't do 30 million random word lookups a second in
>>   the RAM. Latency is too big for that. It's not trivial to combine
>>   the 2 things.
>>
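
To put numbers on the quoted argument, here is a small Python sketch. The clock rate and clocks per node are from the post; the 100 ns RAM latency is purely an illustrative assumption, not a measured figure for any real board, and the model assumes lookups are issued strictly one after another with no overlap.

```python
CLOCK_HZ = 300_000_000   # 300 MHz, from the post
CLOCKS_PER_NODE = 10     # "say 10 clocks a node on average"

def nodes_per_second(clock_hz: int, clocks_per_node: int) -> float:
    """Search speed if every node costs a fixed number of clocks."""
    return clock_hz / clocks_per_node

def time_budget_ns(lookups_per_second: float) -> float:
    """Time available per lookup when lookups are strictly serial."""
    return 1e9 / lookups_per_second

def max_serial_lookups(latency_ns: float) -> float:
    """Lookups per second when each one must finish before the next starts."""
    return 1e9 / latency_ns

nps = nodes_per_second(CLOCK_HZ, CLOCKS_PER_NODE)
print(f"{nps:,.0f} nodes/s")                      # 30,000,000 nodes/s
print(f"{time_budget_ns(nps):.1f} ns per probe")  # 33.3 ns per probe

ASSUMED_RAM_LATENCY_NS = 100  # hypothetical, for illustration only
print(f"{max_serial_lookups(ASSUMED_RAM_LATENCY_NS):,.0f} serial lookups/s")
```

One probe per node at 30 million nodes a second leaves about 33 ns per probe, so any memory with a longer random-access latency cannot keep up unless the probes overlap with the search.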
>
>
>
>No, but you _can_ do asynchronous lookups.  Start the probe, continue the
>search, and when the probe result is ready, it can either be ignored if it is
>worthless, or you can back up to the point you did the probe and use the info.
>Have you even read any of the literature on distributed hash tables in
>distributed chess engines?
>
>I didn't think so...

We discussed this extensively a year or so ago, in case you forgot.
My question to you then is: if hashtables slow down Crafty so much,
why aren't you doing this in Crafty?

Right: it combines the worst of both worlds :)
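
For readers who have not seen the idea, the asynchronous probe described in the quote can be sketched as a toy Python model. This is not Hsu's hardware, Crafty's code, or any published algorithm: the function name, the one-node-per-tick clock, and the `table` of hash hits are all assumptions made for illustration.

```python
from collections import deque

def search_with_async_probes(num_nodes, latency, table):
    """Walk nodes 0..num_nodes-1, one node per tick.

    A probe for node i is issued at tick i and answers `latency` ticks
    later. `table` maps node -> stored score for positions that would be
    hash hits. Returns a list of (event, node, tick) tuples.
    """
    pending = deque()  # (ready_tick, node), in issue order
    events = []
    for tick in range(num_nodes):
        node = tick
        # Handle every probe whose answer has arrived by this tick.
        while pending and pending[0][0] <= tick:
            _, probed = pending.popleft()
            if probed in table:
                # Useful answer: back up to the probed node and use it.
                events.append(("use", probed, tick))
            else:
                # Worthless answer: drop it and keep searching.
                events.append(("ignore", probed, tick))
        pending.append((tick + latency, node))  # start the probe
        events.append(("search", node, tick))   # search continues meanwhile
    return events

events = search_with_async_probes(num_nodes=5, latency=2, table={1: 42})
for event in events:
    print(event)
```

In this model every useful answer arrives `latency` nodes late, so the nodes searched in between are thrown away when the search backs up. Probes still pending when the loop ends are simply dropped.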

>
>
>
>
>
>>   In fact crafty with 1 million nodes a second can't even do all requests
>>   to a hashtable.
>
>What on earth does that mean?  It _does_ do them.  I've even been testing
>hashing in the q-search again, just to see if it is worthwhile after a few
>years of not doing it.  It slowed me down about 10%.  It made the tree about
>10% smaller.  On my quad xeon I went from 1.6M nodes per second to 1.4 roughly.
>
>So I have no idea what the above statement you made means...
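
The trade-off in those quoted figures can be worked out in a few lines of Python. This is a sketch under a simple assumption: time to finish a search is nodes divided by nodes per second, and nothing else changes. The quoted numbers are rounded ("about 10%", "1.4 roughly"), so the result is only indicative.

```python
def relative_time(tree_factor: float, nps_before: float, nps_after: float) -> float:
    """Time to finish the same search, relative to before (1.0 = no change).

    tree_factor is the new tree size as a fraction of the old one.
    """
    return tree_factor * nps_before / nps_after

# Using the quoted node rates: 1.6M -> 1.4M nps, tree about 10% smaller.
print(f"{relative_time(0.9, 1.6, 1.4):.3f}")  # 1.029

# Using the quoted "about 10%" slowdown instead of the node rates.
print(f"{relative_time(0.9, 1.0, 0.9):.3f}")  # 1.000
```

On these numbers hashing in the q-search comes out between break-even and about 3% slower, which fits the quote's description of it as an experiment.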

>Cray Blitz probed _everywhere_ and it had no problem running at 5-7M nodes per
>second on a T932...

Don't compare supercomputers with a program on a single cpu and a single
memory controller.

>
>
>
>>
>>An important point in the end is the price at which this all gets produced,
>>because you need to sell a bunch of these processors, or you won't get
>>back that $20 million of investment.
>>
>>And in the end, when the CPU hits the market after say 5 years or so,
>>then I'll have a 4-processor 10 GHz Intel/AMD machine delivering
>>millions of nodes a second for DIEP :)
>>
>>>>Of course it gets completely annihilated when appearing in 1997 standards.
>>>>
>>>>So if Hsu upgrades his chip to a single cpu chip with a new and better
>>>>evaluation (it's of course questionable whether he is capable of
>>>>managing that) then it will not search deeper than deep blue in 1997
>>>>of course, unless he adds nullmove and hashtables.
>>>
>>>The above paragraph has no basis in reality.




Last modified: Thu, 15 Apr 21 08:11:13 -0700
