Author: Robert Hyatt
Date: 10:25:56 08/21/02
On August 21, 2002 at 11:32:17, Gian-Carlo Pascutto wrote:

>On August 21, 2002 at 11:21:18, Robert Hyatt wrote:
>
>>>They optimistically estimated their number of nodes at 126 million
>>>nodes a second. From that about 20% were *effective* the other 80%
>>>was lost by parallellism, as you can see in the document.
>>
>>That is also wrong. He said the search was 7% effective, not 20%. And
>>as I said, when I talked to him he used 7% as a number to be used against
>>the peak performance of the machine, which was 1B nodes per second based
>>on simple math. That turns into maybe 70M nodes per second in terms of
>>single-processor equivalent.
>
>This is new to me. You're saying it only did an effective 70Mnps?
>
>IIRC, Hsu's DBTE was _supposed_ to be >20% effective on 1024 cpus.
>
>--
>GCP

Hsu reported several measures. In a recent article he claimed 7% overall, which initially sounds bad, but in reality is not. There are two issues:

1. RAW NPS. As in Crafty's RAW NPS values. My NPS numbers generally scale right with the number of processors, i.e. 4 processors produce about 4X the RAW NPS number. DB hit about 20% of its peak based on numbers Hsu has published, i.e. he claimed they averaged about 200M for the 1997 match.

2. Effective NPS. This is harder. And this is where we came in with the discussion on speedup last week. For Crafty, 4 cpus is roughly 3x faster, for 75% efficiency there. Or, put another way, 25% of the parallel search on a 4-cpu machine (one cpu's worth) is search overhead - nodes that a serial search would not have to search.

Hsu obviously had this kind of overhead as well. The last paper I read claimed 7% _overall_ efficiency... which would be .07 * (240*2M + 240*2.4M) nps... or roughly as fast as a 74M nps search on a serial (single cpu) machine. 240*2M comes from 1/2 of the processors running at 20 MHz, for 2.0M nodes per second; 240*2.4M comes from the other 1/2 of the processors running at 24 MHz, for 2.4M nodes per second...

They do roughly 70M _effective_ NPS if you use those numbers. Crafty does maybe 1M on my quad xeon after factoring out that 25% loss. That gives some scale to pure NPS numbers, i.e. they are almost 100X faster than me on raw NPS, but then they are probably another 10x faster in their evaluation, doing things I can't afford. A pretty significant computational advantage, regardless of what _some_ might think...
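
The arithmetic above can be checked with a short sketch. The numbers are the ones quoted in this post (240 processors at 2.0M nps, 240 at 2.4M nps, 7% overall efficiency, and Crafty's roughly 3x speedup on 4 cpus); the function and variable names are purely illustrative and do not come from any Deep Blue or Crafty source.

```python
# Rough check of the effective-NPS arithmetic quoted in the post.
# All names here are hypothetical; only the numbers come from the post.

PROCESSORS_PER_HALF = 240   # half the chess processors at each clock speed
SLOW_NPS = 2.0e6            # 20 MHz processors, nodes per second each
FAST_NPS = 2.4e6            # 24 MHz processors, nodes per second each
DB_EFFICIENCY = 0.07        # the 7% overall figure attributed to Hsu


def peak_nps(per_half, slow_nps, fast_nps):
    """Theoretical peak: every processor searching at full speed."""
    return per_half * slow_nps + per_half * fast_nps


def effective_nps(raw_nps, efficiency):
    """Serial-equivalent speed after discounting parallel overhead."""
    return raw_nps * efficiency


def parallel_efficiency(speedup, n_cpus):
    """Fraction of the added hardware that turns into real speedup."""
    return speedup / n_cpus


db_peak = peak_nps(PROCESSORS_PER_HALF, SLOW_NPS, FAST_NPS)
db_effective = effective_nps(db_peak, DB_EFFICIENCY)
crafty_efficiency = parallel_efficiency(3.0, 4)

print(f"DB peak NPS:       {db_peak / 1e6:.0f}M")
print(f"DB effective NPS:  {db_effective / 1e6:.1f}M")
print(f"Crafty efficiency: {crafty_efficiency:.0%}")
```

The peak works out to a little over 1B nodes per second, and 7% of that is about 74M, which matches the "roughly 70M" serial-equivalent figure used in the discussion.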
Last modified: Thu, 15 Apr 21 08:11:13 -0700