Author: Vincent Diepeveen
Date: 17:43:30 03/26/01
On March 26, 2001 at 20:23:16, Dan Andersson wrote:
>I'm always interested to know more when people claim super-linear improvements;
>there are usually some hidden bottlenecks in those cases. What kind of metrics
>do you use when you claim a two-ply increase in search depth? A test suite,
>perhaps? Naked-eye measurements are notoriously inaccurate, especially when
>confronted with discrete changes. It's all too easy to round in favour of the
>hypothesis. Maybe it's that your program's model fits more easily into a
>resource-rich environment.
>
>Regards Dan Andersson
I have log files from the IPCCC2001.
If I search with the same version on my dual, then I'm exactly 2 times
slower on the dual PIII800 than on the quad I used for the IPCCC.
The arguments are:
a) on my dual I have a 150 MB hashtable, on the quad 350 MB
b) the Xeon cpu has 1 MB of L2 cache, the dual has 256 KB,
so let's zoom in on that 256 KB P3 cache. 2 PROCESSES must get loaded
into that poor small cache. So that's 128 KB for each cpu.
Now on the Xeon I have 256 KB per cpu. For DIEP this is important.
For multithreaded progs and progs that use little memory for their
search and eval this is no big deal.
c) linux is a very stable system. It is a pain to get something
parallel to work well on it because of the gcc compiler,
but it simply runs on and on and on.
d) memory on a quad is PARALLEL. Memory is NOT parallel on a dual.
Because latency stays the same as on a single cpu, memory is not
quite 4 times faster on a quad, but it is very close to 4 times
faster for my program.
e) none of the above applies to a dual, of course
But now let's simply look at the improvements in rating. My bottom
line here is that it's important to know what search depth you
reach. For me searching more than 12 ply usually makes little sense,
and nearly all tactical tricks are already found at that depth anyway.
However, on the dual I sometimes do not get further than 10 ply. The quad
for sure removes that bottleneck.
In important positions, and I consider every position important, the
thing is that if you search one move of the game to very little depth,
like 1 ply, and all other moves to 15 ply, then you might lose because
of that one 1-ply search.
I noticed on Bob's quad that the speedup in some positions is not too
brilliant, but that in difficult positions, where the program doubts,
the speedup is VERY good
(especially when the root score is dropping).
It is especially here that I see the 2 ply difference. TWO PLIES IS A LOT!
In other positions, like where you (re)capture a piece, the
speedup sure is less than 2 plies. A single cpu already goes deep then.
The same bottleneck also applies to searching dual versus single cpu.
So let's clearly discriminate between the following two things:
a) practical speedup with filled hashtables
b) theoretical average speedup with cleared hashtables
Do you see the difference between the 2?
For me only a) is important during a game. b) is completely insignificant
for me. FOR SURE I DO NOT GET A 4.0 speedup in the average game position
on a quad with cleared hashtables.
But the point is: I always start my search with FILLED hashtables.
Then splitting goes MUCH MUCH better. So the speedup is much much better!
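
To put a number on either kind of speedup, one common approach is time-to-depth: compare how long two runs of the same position need to finish each iteration. The sketch below is only an illustration of that idea. It assumes the log files have already been parsed into hypothetical `{depth: seconds}` dictionaries (the real log format is not given here), and the function names are made up for this example.

```python
from statistics import geometric_mean


def time_to_depth_speedup(base, fast):
    """Return {depth: base_time / fast_time} for depths both runs completed.

    `base` and `fast` are hypothetical {depth: seconds} mappings, e.g. the
    dual and the quad searching the same position with the same version.
    """
    common = sorted(set(base) & set(fast))
    return {d: base[d] / fast[d] for d in common if fast[d] > 0}


def mean_speedup(per_depth):
    """Geometric mean of the per-depth ratios (ratios average multiplicatively)."""
    return geometric_mean(list(per_depth.values()))
```

Running this once on logs taken with cleared hashtables and once on logs taken mid-game with filled hashtables would give separate figures for b) and a), which is the distinction being made here. Comparing per depth also avoids the rounding problem of eyeballing whole plies.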
Also IMPORTANT to realize is that the speedup is actually more than 4.0,
because it's 2 plies. For me a ply is not a factor 2.0 in time; it costs more.
Part of the reason is also the doubting. If my program doubts in a normal
single-cpu search at a depth of, say, 8, then I need a hell of a lot of nodes
for re-searches. I don't know how it is with you, but with me that
is the case.
Now with a quad you doubt less than with a single cpu. Of course this also
partly applies to a dual.
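
The ply arithmetic can be sketched as follows. If each extra ply multiplies search time by some factor (the effective branching factor, called `ebf` here), then gaining n plies in the same time corresponds to a speedup of `ebf ** n`. The `ebf` values below are assumptions chosen for illustration, not measurements from DIEP.

```python
import math


def speedup_from_plies(extra_plies, ebf):
    """Time-to-depth speedup implied by searching `extra_plies` deeper in equal time."""
    return ebf ** extra_plies


def plies_from_speedup(speedup, ebf):
    """Extra depth a given speedup buys, for an assumed cost factor per ply."""
    return math.log(speedup) / math.log(ebf)


# Assumed cost factors per ply; only 2.0 makes two plies equal to a 4.0 speedup.
for ebf in (2.0, 2.5, 3.0):
    print(f"ebf={ebf}: 2 plies = {speedup_from_plies(2, ebf):.2f}x")
```

With any cost per ply above 2.0, two extra plies correspond to more than a 4.0 speedup, which is the claim being made; how much more depends entirely on the real factor for the program and position.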
Anyway, the 2 plies speedup is not a fair comparison if you realize I compare
DIFFERENT SIZES OF HASHTABLES!
I compare 150 MB hash with 350 MB hash. Quite a difference with filled
hashtables!
Is it fair to compare the 2?
If not, is it fair to say a quad is not going to give a 2 ply speedup,
when in REALITY a quad always has more RAM than I can afford in my
dual cpu machine?
Now you'll say 512 MB of memory is very affordable today.
Well, you're right, but tomorrow's quad that I might use will most likely have
a couple of gigabytes of RAM.
Quads are always on the edge of what is affordable, so the comparison
is never fair of course. My wallet versus a big company / university.
I always lose of course!
So the theoretical speedup is indeed not so good. The practical speedup
is always going to be HUGE!