Computer Chess Club Archives


Subject: Re: Status of Brutus?

Author: Vincent Diepeveen

Date: 04:54:12 07/27/03


On July 27, 2003 at 06:39:56, Olaf Jenkner wrote:

>On July 26, 2003 at 18:11:31, Slater Wold wrote:
>
>>On July 26, 2003 at 17:22:02, Russell Reagan wrote:
>>
>>>On July 26, 2003 at 16:25:37, O. Veli wrote:
>>>
>>>>Since it is hardware, can
>>>>we expect to be stronger than top software?
>>>
>>>I would expect it to be slower than top software, because cpu improvements
>>>happen so quickly, and FPGA programming (from what I've heard) is not a simple
>>>task. If he spends another two years working on it before releasing it (as
>>>Slater said), just imagine how much faster the cpus will be by then.
>>>
>>>If you're talking about something massively parallel like Deep Blue, that is one
>>>thing, but a single PCI card? I doubt that is going to do any better than break
>>>even with top of the line hardware, so why bother? IBM threw so much hardware at
>>>the problem that desktop cpu improvements wouldn't catch up for a LONG time, but
>>>a single PCI card doesn't seem to be worth the trouble of programming the thing,
>>>because desktop/server cpus will probably outperform it before too long.
>>
>>Chrilly already has a project with a university to do 100+ PCI Brutus cards.
>>
>>It's hard to 'see' what your HW is doing, although Chrilly and Hsu had/have
>>educated guesses.  Chrilly's latest 'guess' was 3M nps.  Junior & Fritz get more
>>than that on pretty standard PCs nowadays.  Even Crafty could get that nowadays.
>> I truly believe that Brutus' eval is not as complex or as refined as *any* of
>>those.
>>
>>But the same technology that makes CPUs faster makes FPGAs bigger & faster too.
>
>Note that with the FPGAs of the future the program can run in parallel on one chip.
>I.e. 10 times the size of the chip means 10 parallel units of Brutus.
>
>OJe

In theory yes, but it is theoretical bullshit of course to say so.

The problem is how inefficiently you search in hardware. Donninger explains
that he cannot actually search more than 2 ply in hardware. If he goes to 3 ply
it already gets very inefficient compared to software.

Not to mention the 4 ply that Deep Blue searched in hardware.

Note that Brutus seems to have killer moves; DB did not even have those.

If you put that in hardware in parallel, so you squeeze a chip or 2 onto 1 FPGA
card, then you have 2 problems:
  a) how good is the search efficiency?
  b) how are you ever going to get it to
     work efficiently in parallel?

I will explain b. In software, in Diep, the first few positions that get
divided take on average about 200 nodes of search per position.

At 10000 nodes a second per cpu, that's 100 us a node.

So 200 nodes are like: 200 * 100 us = 20 ms.

Given the latency of the memory, that's no problem. Getting a cache line
from TERAS takes about 6-7 us.

6-7 us is just a small part of 20 ms.

I need several of those, plus some synchronization, to get the thing searching.
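The software arithmetic above can be checked with a small sketch. The figures (10000 nps, 200 nodes per split, 6-7 us per cache line) are the ones from the post; the function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the software numbers quoted in the post.
# All names here are illustrative, not taken from Diep.

NODES_PER_SECOND = 10_000        # per-cpu search speed quoted in the post
NODES_PER_SPLIT = 200            # average nodes per divided position (post)
CACHE_LINE_LATENCY_US = (6, 7)   # remote cache line fetch on TERAS (post)


def time_per_node_us(nps):
    """Microseconds spent on one node at the given nodes-per-second rate."""
    return 1_000_000 / nps


def split_work_us(nps, nodes):
    """Microseconds of search work handed out at one split."""
    return nodes * time_per_node_us(nps)


node_us = time_per_node_us(NODES_PER_SECOND)
work_us = split_work_us(NODES_PER_SECOND, NODES_PER_SPLIT)
print(f"time per node : {node_us:.0f} us")
print(f"work per split: {work_us / 1000:.0f} ms")

# One cache line fetch as a fraction of the work per split.
for latency_us in CACHE_LINE_LATENCY_US:
    print(f"{latency_us} us latency is {latency_us / work_us:.3%} of one split")
```

Even with several fetches plus synchronization per split, the overhead stays a tiny fraction of the 20 ms of work.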

Now suppose hardware.

I bet Chrilly will get nearly 20 million nps per cpu (if the full cpu speed
gets used). A friend of mine told me that for his medical application he now
uses 133 MHz FPGA cards. It would be weird if Chrilly were going to use the old
30 MHz stuff (depending upon how long you compile, of course; if you compile for
a few days perhaps it can reach 33 MHz).

20 million nps means 50 ns a node, which at 133 MHz is about 6-7 clocks a node,
I guess (I have no real numbers here, only some optimistic figures Chrilly has
mentioned).

How are you going to split within a few clocks?
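A rough sketch of the hardware side of the comparison follows. The 20 million nps and 133 MHz figures come from the post. Applying a TERAS-like 6-7 us latency (midpoint 6.5 us) to the hardware case is purely an assumption to show the scale of the problem, and all names are made up for illustration.

```python
# Hardware-side arithmetic, using the optimistic figures from the post.
# Names are illustrative; nothing here describes the real Brutus design.

HW_NODES_PER_SECOND = 20_000_000   # speculated speed from the post
SW_NODES_PER_SECOND = 10_000       # software per-cpu speed from the post
FPGA_CLOCK_HZ = 133_000_000        # 133 MHz card mentioned in the post
ASSUMED_LATENCY_US = 6.5           # ASSUMPTION: midpoint of the 6-7 us figure


def ns_per_node(nps):
    """Nanoseconds per node at the given nodes-per-second rate."""
    return 1e9 / nps


def clocks_per_node(nps, clock_hz):
    """Clock cycles available per node."""
    return clock_hz / nps


def nodes_per_latency(latency_us, nps):
    """Nodes that could have been searched during one latency period."""
    return latency_us * nps / 1_000_000


print(f"hardware: {ns_per_node(HW_NODES_PER_SECOND):.0f} ns per node")
print(f"hardware: {clocks_per_node(HW_NODES_PER_SECOND, FPGA_CLOCK_HZ):.2f} "
      f"clocks per node")
print(f"software: one latency costs "
      f"{nodes_per_latency(ASSUMED_LATENCY_US, SW_NODES_PER_SECOND):.3f} nodes")
print(f"hardware: one latency costs "
      f"{nodes_per_latency(ASSUMED_LATENCY_US, HW_NODES_PER_SECOND):.0f} nodes")
```

Under that assumption, a delay that costs software a small fraction of one node would cost the hardware on the order of a hundred nodes, which is why a split would have to fit within a few clocks.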

Best regards,
Vincent



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.