Author: Steven Edwards
Date: 06:35:39 08/22/03
The CT toolkit has a CTMachine class that handles items specific to the host processor, including a mechanism for counting the number of machine clock cycles needed to execute a specified region of code. I have used this to compare the speed of the bitboard mode move path enumeration class on an Intel Pentium 3 (1.133 GHz, in a dual CPU Intel SCB2 rackmount) and a Motorola PowerPC G4 (800 MHz, in an Apple Macintosh notebook). The P3 machine is running Red Hat Linux 9 and the G4 machine is running OpenBSD, both with the latest g++ compiler.

The G4 wins the per-clock efficiency match over the P3 by a ratio of six to five. E.g., a bitboard enumeration node that takes 3000 P3 clock cycles needs only 2500 clocks on the G4.

Certainly there are other factors involved. But the above helps confirm my thought that the PPC architecture, being relatively free from the ball-and-chain of legacy CISC issues, beats the Intel/AMD designs cycle for cycle.

With the new PPC970 chips coming out, I expect the efficiency factor to increase. With thirty-two general purpose sixty-four bit registers per CPU, the chip is a natural for bitboard chess applications. An assembly language coder could keep a good sized chunk of the current position bitboard database in registers throughout the whole program. Also, each PPC970 has an integrated AltiVec unit with thirty-two 128-bit vector registers (512 bytes in all). These are normally used for parallel floating point vector operations, but it seems likely that the entire AttackFrSq[] (or AttackToSq[]) bitboard database could be held there, using the vector registers as an extremely fast cache.
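The figures quoted above can be checked with a little arithmetic. The following is only a sketch in C++, not code from the CT toolkit: the names (MicrosPerNode, kAttackTableBytes and so on) are made up for illustration, and it assumes that a bitboard is one 64-bit word and that AttackFrSq[] has one entry per square.

```cpp
#include <cstddef>
#include <cstdint>

// Figures quoted in the post.
constexpr double kP3ClockHz = 1.133e9;  // Pentium 3 at 1.133 GHz
constexpr double kG4ClockHz = 800.0e6;  // PowerPC G4 at 800 MHz
constexpr double kP3Cycles  = 3000.0;   // cycles per enumeration node (example)
constexpr double kG4Cycles  = 2500.0;   // cycles per enumeration node (example)

// Per-clock efficiency ratio: 3000 / 2500 = 1.2, i.e. six to five.
constexpr double kCycleRatio = kP3Cycles / kG4Cycles;

// Hypothetical helper: convert a cycle count to microseconds at a clock rate.
constexpr double MicrosPerNode(double cycles, double clock_hz) {
  return cycles / clock_hz * 1.0e6;
}

// Wall-clock time per node. The per-cycle advantage does not make up for
// the clock speed gap: roughly 2.65 us on the P3 against 3.125 us on the G4.
constexpr double kP3Micros = MicrosPerNode(kP3Cycles, kP3ClockHz);
constexpr double kG4Micros = MicrosPerNode(kG4Cycles, kG4ClockHz);

// Assumption: one 64-bit bitboard per square, 64 squares.
using Bitboard = std::uint64_t;
constexpr std::size_t kAttackTableBytes = 64 * sizeof(Bitboard);

// AltiVec register file: 32 registers of 128 bits (16 bytes) each.
constexpr std::size_t kAltiVecBytes = 32 * 16;

// Under the assumption above, the table exactly fills the vector registers.
static_assert(kAttackTableBytes == kAltiVecBytes,
              "64 bitboards occupy all 512 bytes of AltiVec registers");
```

Note that a full table would leave no vector registers free for anything else, and that registers cannot be indexed by a run-time square number the way an array can, so this is a capacity check rather than a design.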
Last modified: Thu, 15 Apr 21 08:11:13 -0700