Computer Chess Club Archives


Subject: <slightly OT> Optimized Instruction Scheduling

Author: Sven Reichard

Date: 08:36:55 11/15/01


Although I have always favored clear structure over raw speed, I have lately
been experimenting with bitboards. I came across a pipeline bottleneck that I
would expect today's compilers to avoid. However, this doesn't seem to be the
case, at least not for gcc.

Processor: K6-2
Compiler: g++ -O6
(tried also -march=k6 and -march=pentium)

The following routine puts a piece on a square, updating the bitboards. It also
keeps track of the first-order evaluation. I have a global variable (actually
static to the class)

signed short values[64][MaxPiece];	// indexed as values[sq][p]

The straightforward implementation looked like

void setSquare(char sq, Piece p)
{
	// <snip> manipulate bitboards
	// and then
	material += values[sq][p];
}

Changing that to

void setSquare(char sq, Piece p)
{
	signed short value = values[sq][p];
	// <snip> manipulate bitboards
	material += value;
}

gave a (more or less substantial) increase in speed.
(The reason, I suspect, is that the needed entry of values is not in the cache,
so it has to be fetched from memory, which takes a number of cycles. We can use
those cycles to do other work.)
Has anybody had a similar experience, or are there compilers out there that do
this kind of optimization automatically? Or is there some reason why this can't
be done?
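
To make the comparison concrete, here is a minimal self-contained sketch of the two variants side by side. Everything beyond values, material, sq and p is an assumption made for illustration: the Board struct, the occupied and byPiece bitboards, MaxPiece = 12, and Piece as a plain int. The table is dimensioned [64][MaxPiece] here so that the values[sq][p] indexing used in the routines stays in bounds. This is not the actual engine code, only a stand-in for the snipped part.

```cpp
#include <cassert>
#include <cstdint>

constexpr int MaxPiece = 12;   // assumption: 6 piece types x 2 colours
using Piece = int;             // assumption: pieces are small integers

struct Board {
    static short values[64][MaxPiece];    // piece-square table, indexed [sq][p]
    std::uint64_t occupied = 0;           // hypothetical bitboard: all pieces
    std::uint64_t byPiece[MaxPiece] = {}; // hypothetical bitboards: one per piece
    int material = 0;

    // Variant 1: the table lookup happens after the bitboard updates.
    void setSquareLate(int sq, Piece p) {
        assert(sq >= 0 && sq < 64 && p >= 0 && p < MaxPiece);
        const std::uint64_t bit = std::uint64_t{1} << sq;
        occupied |= bit;
        byPiece[p] |= bit;
        material += values[sq][p];
    }

    // Variant 2: the lookup is written first, so the load can be issued
    // before the bitboard work and (possibly) overlap with it.
    void setSquareEarly(int sq, Piece p) {
        assert(sq >= 0 && sq < 64 && p >= 0 && p < MaxPiece);
        const short value = values[sq][p];
        const std::uint64_t bit = std::uint64_t{1} << sq;
        occupied |= bit;
        byPiece[p] |= bit;
        material += value;
    }
};

short Board::values[64][MaxPiece] = {};
```

Both variants leave the board in the same state; they differ only in where the load appears in the source. Whether that changes the generated code, or the running time, depends on the compiler, its scheduling options and the processor, so it would need to be measured on each setup.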

Thanks for your input,
Sven.




