Author: Steffan Westcott
Date: 07:03:55 02/14/04
On February 14, 2004 at 00:24:33, Luis Smith wrote:

>Do you know what speed Brutus/Hydra runs on one of those FPGA
>cards compared to the 3.06 chip? I imagine the FPGA chip would be significantly
>faster than a Xeon processor.

Comparing the speed of an FPGA (or ASIC) application (e.g. chess) to that of a CPU running application software is not straightforward at all. Both contain the concept of a hardware clock (or clocks), but it is meaningless to compare clock speeds, as this does not measure the amount of useful work done per clock cycle. (Incidentally, FPGA applications usually run at a clock frequency of around 50MHz - 100MHz or so, but this is a gross generalisation.)

An area where this difference is most obvious is chess position evaluation. A CPU chess program would include some CPU instructions to evaluate properties of a chess position, and most likely produce a score, among other results. To 'add chess knowledge' to the CPU chess program, more detailed properties of the chess position are sought, so more CPU instructions are added to the program to evaluate them. These extra instructions mean more CPU clock cycles are needed to perform a full position evaluation, on average.

A hardware (FPGA, ASIC) chess application could be implemented in many ways. The evaluation portion satisfies the same requirement to evaluate properties of a chess position, produce scores and other results and so on, but it is not restricted to the CPU model of executing an instruction stream to achieve it.

One approach is to present the entire chess position as an input to a custom logic function, which produces its results in one clock cycle. This is not the most viable approach, however, as the logic function would be extremely large, complex and deep, with a low maximum clock frequency. Another approach is to pipeline the evaluation over 8 clock cycles, performing a file-wise sweep over the chess position.
This will reduce logic size, complexity and depth, and increase the maximum clock frequency, but it will also increase the number of clock cycles needed (and perhaps latency).

To 'add chess knowledge' to either of the hardware-based approaches, more logic terms are added to the logic function. These extra logic terms mean greater logic size, and perhaps a small impact on clock frequency, but no change in the number of clock cycles needed to perform a full position evaluation.

Given the above explanation that comparing clock speeds between CPU and FPGA/ASIC chess applications is not useful, it might be tempting to compare the rate of chess position evaluations instead (aka "nodes per second"). Unfortunately, this is not very helpful either, as chess programs in general vary in the quantity and quality of chess evaluations they perform.

Cheers,
Steffan
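
To make the cycle-count argument above a little more concrete, here is a rough toy model in Python. It is only a sketch under assumptions of my own: the board encoding (eight strings, one per file), the three evaluation terms and the function names are all hypothetical, and "one step per term per file" is a crude stand-in for real CPU instruction counts. Python of course runs everything sequentially, so the "cycles" are just counters illustrating the model, not measurements of any real engine or of Brutus/Hydra.

```python
from typing import Callable, List, Tuple

# Hypothetical encoding: one string per file (a..h), rank 1 first, '.' = empty.
# Upper case = white pieces, lower case = black pieces.
FILES: List[str] = [
    "R......r",  # a-file: white rook, no pawns at all
    ".PP...p.",  # b-file: doubled white pawns
    ".P....p.",
    ".P....p.",
    "KP....pk",
    ".P....p.",
    ".P....p.",
    "R.....pr",  # h-file: white rook, but a black pawn is still there
]

Term = Callable[[str], int]

# Illustrative evaluation terms, each looking at a single file.
def white_pawns(f: str) -> int:
    return f.count("P")

def doubled_white_pawns(f: str) -> int:
    return -max(0, f.count("P") - 1)

def white_rook_on_open_file(f: str) -> int:
    return 1 if "R" in f and "P" not in f and "p" not in f else 0

TERMS: List[Term] = [white_pawns, doubled_white_pawns, white_rook_on_open_file]

def cpu_evaluate(files: List[str], terms: List[Term]) -> Tuple[int, int]:
    """Sequential model: every term on every file costs one more step."""
    score, steps = 0, 0
    for f in files:
        for term in terms:
            score += term(f)
            steps += 1
    return score, steps

def file_sweep_evaluate(files: List[str], terms: List[Term]) -> Tuple[int, int]:
    """Hardware-style model: one file per clock cycle, with all terms
    imagined as parallel logic working within that same cycle."""
    score, cycles = 0, 0
    for f in files:
        score += sum(term(f) for term in terms)
        cycles += 1
    return score, cycles

if __name__ == "__main__":
    for n in (2, 3):  # evaluate with two terms, then 'add knowledge' with a third
        print(n, cpu_evaluate(FILES, TERMS[:n]), file_sweep_evaluate(FILES, TERMS[:n]))
```

Both functions return the same score for the same terms. Adding the third term raises the sequential step count from 16 to 24, while the file-wise sweep stays at 8 cycles. What the model leaves out is exactly the cost mentioned above: in real hardware the extra term means more logic, and perhaps a slightly lower maximum clock frequency.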
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.