Author: Robert Hyatt
Date: 08:48:19 08/24/05
On August 24, 2005 at 09:24:30, Tord Romstad wrote:

>On August 24, 2005 at 02:20:40, Joost Buijs wrote:
>
>>I used some positions from the "wm_test" and recorded how long it took to solve
>>them.
>
>I don't know much about parallel search, but intuitively I would expect test
>suites (especially tactical ones) to be a poor way to measure the speedup with
>multiple CPUs. The search tree in a typical test position will often look very
>different from a normal search tree, and it is possible that the parallel search
>efficiency in such positions is very different from the efficiency in average
>positions reached during normal games.
>
>Tord

I agree. There are many issues. Some positions, once you find the correct key move, produce nearly perfectly ordered trees. Parallel searches can usually eat those up with high efficiency. Other positions produce horribly ordered trees, and depending on the parallel search, they can either eat those up too or fall flat and produce horrible performance. The ones in the middle are harder, as move ordering is never perfect, and that can have a really bad effect on parallel search.

That's the reason that for the older DTS paper, I chose to use positions from a real game, which didn't all have a "tactical solution" to search for. I kept getting asked "how does your parallel search perform in a game, not just on a set of random test positions?" That's not an easy question to answer. And if you do answer it, someone will always criticize the result and ask "OK, but how does it perform on a set of unrelated test positions like Nolot or whatever?" :)

Also, as I have shown repeatedly over the past few years here, parallel performance has a _large_ standard deviation on many positions, which means that picking just one or two positions and running each one time with 1 cpu and one time with 8 cpus is a poor way to measure performance.
The data I will provide before long takes the same Cray Blitz game positions from the DTS paper and runs each of them 8 times, averaging the speedup. That means 1 run with one cpu, then 4 runs with 2 cpus, and 8 runs each with 4 and 8 cpus, to try to have enough data that the variance gets averaged out to something meaningful. That burns one heck of a lot of CPU time... I am also running these tests with slightly different internal tuning options, and if I just change two of the tuning parameters, with only two possible settings for each, I now have 4X as many runs to make. That's why this quad opteron is glowing orange in AMD's lab right now. :)
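The measurement scheme above amounts to computing per-run speedups against a single serial baseline and then reporting their mean and scatter. Here is a minimal sketch of that bookkeeping; the timing values are made up for illustration and do not come from the actual Cray Blitz data:

```python
import statistics

def speedup_stats(serial_time, parallel_times):
    """Speedup of each parallel run versus one serial baseline,
    plus the mean and sample standard deviation across runs."""
    speedups = [serial_time / t for t in parallel_times]
    mean = statistics.mean(speedups)
    stdev = statistics.stdev(speedups) if len(speedups) > 1 else 0.0
    return mean, stdev, speedups

# Hypothetical timings (seconds) for one position: one 1-cpu run,
# eight 8-cpu runs showing the run-to-run scatter described above.
serial = 400.0
eight_cpu_runs = [75.0, 90.0, 68.0, 110.0, 82.0, 71.0, 95.0, 79.0]

mean, stdev, runs = speedup_stats(serial, eight_cpu_runs)
print(f"mean speedup: {mean:.2f}  stdev: {stdev:.2f}")
```

Averaging over repeated runs matters because a single run's speedup here ranges from about 3.6x to 5.9x; only the mean over many runs is a stable number.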