Computer Chess Club Archives



Subject: Re: HW based Crafty

Author: Robert Hyatt

Date: 11:37:35 03/30/02



On March 30, 2002 at 03:07:29, Slater Wold wrote:

>Dan Corbit once called Crafty the "N-Reactor of Chess Engines".  If this is
>true, I might be creating the world's largest N-Reactor Chess Program.
>
>
>In the coming months, I will be working with a few people to create a hardware
>based move generator for Crafty.  I myself have written my own chess program
>over the last few years, however find it inadequate for this project, mostly
>because it's too simple.  (Man, I am a glutton.)  A 10M nps (basic) alpha/beta
>search will prove nothing, while a "tried and true" engine like Crafty will
>truly show the power of nodes.  How does a 2M nps Crafty compare with a 10M nps
>Crafty?  Well, that's my question!
>
>The hardware will consist of a single FPGA on a PCI card that will be inserted
>into the host computer.  The FPGA will be used for move ordering (and returning
>those moves in a predefined order) and generating all legal moves and passing
>them back to the software.
>
>My goal in this project is to answer the age-old question, which is better,
>quality or quantity?
>
>I will be using the version of Crafty that is the newest release when I begin.  And
>all tests/comparisons will of course be done with the same version.  The true
>value of "hardware speedup" will be obvious here.
>
>My long-term goals are as follows:
>
>1.) to determine whether or not a significant nps increase will strengthen
>Crafty's performance by a considerable margin;
>
>2.) to determine the relation between Elo and nps;
>
>3.) to determine if greater nps actually makes an engine "smarter".
>
>
>
>When the time comes, if Crafty shows a performance gain of greater than 100 Elo
>points, I will investigate further by creating more nps.  For example:
>
>If the standard version of Crafty is running on today's top SMP machine (Dual
>AMD 1.73GHz), and is estimated at 2300 Elo at 1.8M nps, and the HW based Crafty
>is running on a single FPGA (HW based move generator), and is estimated at 2400
>Elo at 10M nps, then what if we speed it up to 100M nps?  What will the
>estimated Elo be then?
>
>
>While not everything I find in this test will be consistent with other engines,
>it should give a good idea of what's to be expected.  There will surely be
>people who disagree and/or contest my findings; therefore I will try my best to
>document everything I have to support my findings (games, test suites, etc.).
>
>In the spirit of Crafty, everything that will be done will be open source, and
>available to anyone on the Internet.  It is also my intention to create a
>website dedicated to this project.  It will contain
>games/suites/sources/findings and everything else from this project.  However, I
>will not make the source and/or program available until I deem them suited to be
>released (in other words, working).
>
>The timetable looks like this:
>
>
>~3 months:  Set up the hardware (write a device driver for the PCI card).  Also
>work on a GUI for Crafty.
>
>~6 months:  Design the move generator.
>
>~2 months:  Integrate HW with SW.  (Complete the GUI)
>
>
>Hopefully before Christmas 2002, the HW based Crafty will be playing on Internet
>servers and data will be available.
>
>
>Any questions/comments/ideas are welcomed.  Anyone willing to offer something to
>this project will be welcomed with a smile, and a psychiatrist.  ;)
>
>I want to thank Robert Hyatt for Crafty, and everything he has done for me, and
>the Computer Chess community.
>
>
>
>Slater Wold
>swold@swbell.net



I don't want to tell you how to do this, but here are some ideas that are
based on an early version of "Belle"...

Store the chess board (the bitboards) in the hardware.  Then make the FPGA
multi-purpose.  Design a "MakeMove()" chunk of hardware, an "UnMakeMove()"
chunk, and a "GenerateMoves()" chunk.  Don't forget the rotated bitmaps and
you will get a huge speed increase by driving the MakeMove/UnMakeMove/Movegen
parts of the execution time to near-zero...
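
To make the MakeMove()/UnMakeMove() idea concrete, here is a rough software
sketch of the state such hardware would have to keep.  It is only an
illustration, not Crafty's actual code: the dictionary of per-piece bitboards,
the piece names, and the functions bit(), make_move() and unmake_move() are
all hypothetical, and it assumes legal moves with no castling, en passant or
promotion.

```python
# Hypothetical layout: one 64-bit bitboard per piece type, a1 = square 0.
# This is NOT Crafty's real data structure, just a minimal illustration.

def bit(square):
    """Return a bitboard with only `square` (0..63) set."""
    return 1 << square

def make_move(boards, piece, src, dst):
    """Move `piece` from src to dst in place.

    Returns the name of the captured piece, or None, so the move
    can be undone later.  Assumes the move is legal.
    """
    captured = None
    for name, bb in boards.items():
        if name != piece and bb & bit(dst):
            captured = name
            boards[name] = bb ^ bit(dst)   # remove the captured piece
            break
    boards[piece] ^= bit(src) | bit(dst)   # clear src and set dst with one XOR
    return captured

def unmake_move(boards, piece, src, dst, captured):
    """Undo a move made by make_move(), restoring any captured piece."""
    boards[piece] ^= bit(src) | bit(dst)   # the same XOR moves the piece back
    if captured is not None:
        boards[captured] |= bit(dst)
```

Both operations reduce to a handful of XOR/OR steps on 64-bit words, which is
the kind of fixed, branch-light work that maps naturally onto hardware.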

You could then add an InCheck() chunk as well and drive that time to zero.
Doing the Search part is _very_ difficult, as would be the evaluation.  To
see how fast it could go, before jumping in head-first, do a profile.  Assume
anything you do in the FPGA goes to zero on the profile run.  If you dump 1/2
the work, it will only go twice as fast, so hitting 10M nodes per
second is going to be _very_ difficult without at least an FPGA evaluation
as well...
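
The arithmetic behind that estimate is Amdahl's law.  A small sketch, with
hypothetical function names and under the same assumption as above (whatever
moves into the FPGA costs zero time, with no PCI transfer overhead):

```python
def speedup(offloaded_fraction):
    """Overall speedup if `offloaded_fraction` of the profile drops to zero."""
    if not 0.0 <= offloaded_fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return 1.0 / (1.0 - offloaded_fraction)

def required_fraction(target_speedup):
    """Fraction of the profile that must drop to zero to reach a target speedup."""
    if target_speedup < 1.0:
        raise ValueError("target speedup must be at least 1")
    return 1.0 - 1.0 / target_speedup
```

With the figures quoted above, going from 1.8M nps to 10M nps is a speedup of
10/1.8, about 5.6x, and required_fraction(10 / 1.8) comes out at 0.82.  In
other words, roughly 82% of the profiled time would have to disappear into the
FPGA, and that is before counting any communication cost.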





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.