Computer Chess Club Archives


Subject: Re: Status of Brutus?

Author: Vincent Diepeveen

Date: 05:06:19 07/27/03


On July 27, 2003 at 07:46:15, Joost Buijs wrote:

>On July 27, 2003 at 06:39:56, Olaf Jenkner wrote:
>
>>On July 26, 2003 at 18:11:31, Slater Wold wrote:
>>
>>>On July 26, 2003 at 17:22:02, Russell Reagan wrote:
>>>
>>>>On July 26, 2003 at 16:25:37, O. Veli wrote:
>>>>
>>>>>Since it is hardware, can
>>>>>we expect to be stronger than top software?
>>>>
>>>>I would expect it to be slower than top software, because cpu improvements
>>>>happen so quickly, and FPGA programming (from what I've heard) is not a simple
>>>>task. If he spends another two years working on it before releasing it (as
>>>>Slater said), just imagine how much faster the cpus will be by then.
>>>>
>>>>If you're talking about something massively parallel like Deep Blue, that is one
>>>>thing, but a single PCI card? I doubt that is going to do any better than break
>>>>even with top of the line hardware, so why bother? IBM threw so much hardware at
>>>>the problem that desktop cpu improvements wouldn't catch up for a LONG time, but
>>>>a single PCI card doesn't seem to be worth the trouble of programming the thing,
>>>>because desktop/server cpus will probably outperform it before too long.
>>>
>>>Chrilly already has a project with a university to do 100+ PCI Brutus cards.
>>>
>>>It's hard to 'see' what your HW is doing, although Chrilly and Hsu had/have
>>>educated guesses.  Chrilly's latest 'guess' was 3M nps.  Junior & Fritz get more
>>>than that on pretty standard PCs nowadays.  Even Crafty could get that nowadays.
>>> I truly believe that Brutus' eval is not as complex or as refined as *any* of
>>>those.
>>>
>>>But the same technology that makes CPUs faster makes FPGAs bigger & faster too.
>>
>>Note that with FPGAs of the future the program can run in parallel on one chip.
>>I.e. 10 times the size of the chip means 10 parallel units of Brutus.
>>
>>OJe
>
>I don't think Brutus actually runs on the FPGA chip. Probably Brutus runs on the
>PC and uses the FPGA for move generation and evaluation purposes only. The whole
>problem with this scheme is the low speed of the current PCI bus.

Brutus has the Paderborn cluster. So if he puts one card in each PC there and
then gets the software to work across the cluster, he's there.

Each FPGA card does a very inefficient 2- or 3-ply search at a time.

If one card runs at 133 MHz, then we are talking about a theoretical 20 million
a second per card. Chrilly is then forced to do inefficient 3-ply searches,
otherwise he gets too many searches a second.

With 2- to 3-ply searches at 33 MHz it was already around 90,000-100,000
searches a second at the 2002 world championship.

Now if the card gets 4 times faster, he will go beyond what the latency of the
PCI bus allows.

So let's say 4 ply in hardware and still 100k searches a second.
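
To make that arithmetic explicit, here is a rough sketch using only the figures quoted above. It assumes that each hardware search costs the same number of clock cycles on the faster card, and it reads the "20 million a second" as nodes per second; both are my assumptions for illustration, and the function names are made up.

```python
# Back-of-envelope scaling of the card figures quoted in the post.
BASE_CLOCK_MHZ = 33               # card clock at the 2002 world championship
NEW_CLOCK_MHZ = 133               # clock of the faster card
BASE_SEARCHES_PER_SEC = 100_000   # upper end of the quoted 90,000-100,000
PEAK_NODES_PER_SEC = 20_000_000   # "20 million a second", read as nodes (assumption)


def scaled_search_rate(base_rate, base_mhz, new_mhz):
    """Searches per second if each search costs the same number of clock cycles."""
    return base_rate * new_mhz / base_mhz


def nodes_per_search(nodes_per_sec, searches_per_sec):
    """Average nodes one hardware search must cover to satisfy both rates."""
    return nodes_per_sec / searches_per_sec


speedup = NEW_CLOCK_MHZ / BASE_CLOCK_MHZ
unchanged_depth_rate = scaled_search_rate(
    BASE_SEARCHES_PER_SEC, BASE_CLOCK_MHZ, NEW_CLOCK_MHZ
)
nodes_needed = nodes_per_search(PEAK_NODES_PER_SEC, BASE_SEARCHES_PER_SEC)

print(f"clock speedup: {speedup:.2f}x")
print(f"searches/s at unchanged depth: {unchanged_depth_rate:,.0f}")
print(f"nodes per search at 100k searches/s: {nodes_needed:,.0f}")
```

With these inputs the faster card would do roughly 400k searches a second at unchanged depth, so each search has to grow by about a factor of 4 to stay near 100k searches a second. That is the step from 3 ply to 4 ply in hardware.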

However, I doubt whether in Paderborn, despite 20 years of time to get it to
work, they can get a search running over a cluster at 100k searches a second
per machine.

I REALLY DOUBT IT.

Zugzwang effectively got around 2k-4k nps per CPU on a Cray T3E machine. That
machine has far better latency than the Paderborn cluster.

Only DIEP would work on their cluster, which needs about 16 microseconds to get
a single position. Probably more when all CPUs are in use and busy
communicating with each other.

Now in Paderborn the problem is of course that, in contrast to the TERAS
supercomputer and the T3E, they have that 16 microsecond latency to get a
single position even when running on just 2 nodes.
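
A quick sketch of what 16 microseconds per position means as a budget. It assumes a worst case where every lookup is a blocking, one-at-a-time remote fetch and where each hardware search triggers one such fetch; the post does not say how often remote positions are needed, so that ratio is purely hypothetical, as are the function names.

```python
# Latency budget for remote position lookups on one cluster node.
LATENCY_US = 16.0                   # quoted time to get a single position
TARGET_SEARCHES_PER_SEC = 100_000   # quoted searches a second per machine


def max_blocking_fetches_per_second(latency_us):
    """Upper bound on serialized remote fetches one node can complete per second."""
    return 1_000_000 / latency_us


def latency_seconds_per_second(fetches_per_sec, latency_us):
    """Wall-clock seconds spent waiting per second of search (above 1.0 is infeasible)."""
    return fetches_per_sec * latency_us / 1_000_000


ceiling = max_blocking_fetches_per_second(LATENCY_US)
# Hypothetical worst case: one blocking remote fetch per hardware search.
waiting = latency_seconds_per_second(TARGET_SEARCHES_PER_SEC, LATENCY_US)

print(f"max blocking fetches/s per node: {ceiling:,.0f}")
print(f"seconds of waiting per second at 100k fetches/s: {waiting:.1f}")
```

Under that worst-case assumption the node tops out at 62,500 fetches a second, below the 100k searches a second the cards would deliver, so the search has to avoid or overlap most remote lookups.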

Note that I am not sure they can get 100 or so of the 133 MHz cards, as
Wuellenweber told me back in July 2002.

In any case, within such a short period of time Paderborn University will not
manage to get 100 cards to work.

I found out myself that there are a few phases.

Phase 1 is to get it to work on a dual.
Phase 2 is to run on 4-16 processors.
Phase 3 is to run on 32-64 processors.
Phase 4 is to run on 128-500 processors.

It really depends on which algorithm you go for.

The parallel problems really only start at phase 3.

Best regards,
Vincent





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.