Author: Sven Reichard
Date: 08:36:55 11/15/01
Although I have always favored a clear structure over raw speed, I have lately been
experimenting with bitboards. I came across a pipeline bottleneck that I would
expect today's compilers to avoid. However, this doesn't seem to be the case, at
least not for gcc.
Processor: K6-2
Compiler: g++ -O6
(I also tried -march=k6 and -march=pentium.)
The following routine puts a piece on a square, updating the bitboards. It also
keeps track of the first-order evaluation. I have a global variable (actually static
to the class):
signed short values[MaxPiece][64];
The straightforward implementation looked like this:
void setSquare(char sq, Piece p)
{
    // <snip> manipulate bitboards
    // and then
    material += values[p][sq];
}
Changing that to
void setSquare(char sq, Piece p)
{
    signed short value = values[p][sq];
    // <snip> manipulate bitboards
    material += value;
}
gave a fairly substantial increase in speed.
(The reason, I suspect, is that values is not found in the cache, so it has to be
fetched from memory, which takes a number of cycles. By issuing the load early, we
can use those cycles to do other work.)
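For concreteness, here is a minimal, self-contained sketch of the two variants side
by side. The Piece enum, the pieceBB array, and the one-line bitboard update are
placeholders made up for the sketch, not my real code; the only difference between
the two routines is where the load of values[p][sq] is placed.

#include <cstdint>

enum Piece { Empty, Pawn, Knight, Bishop, Rook, Queen, King, MaxPiece };

struct Board {
    static signed short values[MaxPiece][64];  // first-order piece-square values
    uint64_t pieceBB[MaxPiece] = {};           // one bitboard per piece type (placeholder)
    int material = 0;

    // Straightforward version: the load of values[p][sq] sits directly
    // before its use, so a cache miss stalls the addition.
    void setSquareNaive(char sq, Piece p) {
        pieceBB[p] |= uint64_t{1} << sq;       // placeholder for the bitboard updates
        material += values[p][sq];
    }

    // Hoisted version: issue the load first, do the bitboard work while the
    // (possibly missing) cache line is being fetched, then use the value.
    void setSquare(char sq, Piece p) {
        signed short value = values[p][sq];
        pieceBB[p] |= uint64_t{1} << sq;       // placeholder for the bitboard updates
        material += value;
    }
};

signed short Board::values[MaxPiece][64];      // zero-initialized here, for the sketch only

Since everything else is identical, any measured difference between the two routines
comes purely from the instruction scheduling around the load.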
Has anybody had a similar experience, or are there compilers out there that do
this kind of optimization automatically? Or is there some reason it can't
be done?
Thanks for your input,
Sven.