Author: Robert Hyatt
Date: 08:32:53 11/04/02
On November 03, 2002 at 10:41:17, Gian-Carlo Pascutto wrote:

>On November 03, 2002 at 10:07:07, Vincent Diepeveen wrote:
>
>>On November 02, 2002 at 00:10:17, Robert Hyatt wrote:
>>
>>At the P4 with 1 decoder, 12K i cache and just 8KB data cache
>>i could measure no speedup. Only slow downs if i tried to run
>>too many threads.
>>
>>Your claims with crafty proofs it fits within the trace cache somehow.
>
>It does almost 10x as much NPS as your thing.
>
>An obvious effect:
>
>It spends (relatively!) much more time waiting for
>transposition table entries to come out of main memory
>
>If it spends 20% of it's time for this (a realistic number
>on a high end P4) and the parallel speedup is 1.7 then it
>is going to run about 5% faster with SMT, roughly.

Where does that "math" come from? (5%)

I have seen a 30% improvement in NPS using hyper-threading on a 2.2ghz PIV.
That should translate into a roughly 20% improvement in search speed to a
specific depth. That seemed to be close to the numbers Eugene posted as well.

I can run some specific tests on the dual 2.2 if you want to see actual
results. It isn't so easy since it isn't in my office and I can't "telnet"
to it to run things, but I can walk over there and download the latest
executable.

Hyperthreading isn't great. But it is free and certainly better than nothing,
even if it does make a 4-processor machine look pretty strange with 8
processors. Once I have time to fiddle with the locks to add the pause, I
expect even better performance...

>
>Crafty doesn't fit in the trace cache - it's bitboards
>with not quite compact code. Inferior datastructure and
>all that :)
>
>--
>GCP
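
For anyone wondering what "add the pause" to the locks might look like: on a hyper-threaded P4 a thread spinning on a lock competes with its sibling logical processor for execution resources, and the PAUSE instruction is the hint that tells the CPU the loop is only a spin-wait. Below is a rough sketch of a test-and-test-and-set spin lock with that hint in the wait loop. It is not Crafty's actual lock code. The names demo_spinlock, cpu_relax, spin_init, spin_trylock, spin_lock and spin_unlock are made up for illustration, and the inline asm assumes a GCC or Clang style compiler (on other targets the hint compiles to nothing).

```c
#include <stdatomic.h>

/* Hypothetical helper: issue the x86 PAUSE hint inside a spin-wait loop.
   Assumes GCC/Clang inline asm; on non-x86 targets it is a no-op. */
static inline void cpu_relax(void) {
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("pause" ::: "memory");
#endif
}

/* Illustrative lock type (not from Crafty): 0 = free, 1 = held. */
typedef struct {
    atomic_int held;
} demo_spinlock;

static inline void spin_init(demo_spinlock *l) {
    atomic_init(&l->held, 0);
}

/* Try once; returns 1 if the lock was acquired, 0 if it was already held. */
static inline int spin_trylock(demo_spinlock *l) {
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

static inline void spin_lock(demo_spinlock *l) {
    for (;;) {
        /* Atomic exchange: seeing the old value 0 means we now own the lock. */
        if (atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0)
            return;
        /* Otherwise wait with plain reads, pausing on every iteration so the
           other logical processor gets more of the shared core. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
            cpu_relax();
    }
}

static inline void spin_unlock(demo_spinlock *l) {
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

Typical use would be spin_init() once, then spin_lock() and spin_unlock() around the shared data. Waiting on plain loads and only retrying the exchange once the lock looks free keeps the spinning thread from hammering the cache line; how much the pause helps on a given machine would have to be measured.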
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.