Author: Vincent Diepeveen
Date: 01:05:46 07/08/05
On July 07, 2005 at 15:43:03, Dann Corbit wrote:
>On July 07, 2005 at 15:01:26, Dann Corbit wrote:
>
>>On July 07, 2005 at 14:51:56, Gian-Carlo Pascutto wrote:
>>
>>>On July 07, 2005 at 14:37:19, Dann Corbit wrote:
>>>
>>>>On July 07, 2005 at 14:14:36, Gian-Carlo Pascutto wrote:
>>>>
>>>>>On July 07, 2005 at 13:56:04, Dann Corbit wrote:
>>>>>
>>>>>>On July 07, 2005 at 05:05:50, Gian-Carlo Pascutto wrote:
>>>>>>
>>>>>>>On July 05, 2005 at 14:37:46, Dann Corbit wrote:
>>>>>>>
>>>>>>>>The logfile does not consider the depth on-chip at the leaves. About 6 plies
>>>>>>>>more. So consider it really to be 16-18 plies.
>>>>>>>
>>>>>>>This is quite simply completely wrong, and contradicts what Hsu and Campbell
>>>>>>>published.
>>>>>>>
>>>>>>>http://sjeng.org/ftp/deepblue.pdf
>>>>>>
>>>>>>I read the paper. I was referring to this:
>>>>>>"This typically results in 4- or 5-ply searches plus quiescence in middlegame
>>>>>>positions and somewhat deeper searches in endgames."
>>>>>>
>>>>>>I did not see the contradiction. Can you please point it out to me?
>>>>>
>>>>>The first number in the logs is the combined depth (excluding quiescence, but
>>>>>nobody counts that). The nominal depth was around 12 ply for the combined
>>>>>search, not 16-18.
>>>>
>>>>Then it represents the estimated maximum combined depth (last column of table
>>>>2)?
>>>
>>>No, that's another matter. Maximum depth is rather meaningless.
>>>
>>>Look at Page 5, 1)b) for the statement that the nominal depth is 12 ply on
>>>average. It's been a while since I read it but basically something like 12 (5)
>>>meant 12 - 5 = 7 ply software, 5 ply hardware, and then extensions and
>>>quiescence search.
>>
>>It makes me wonder why they got such excellent answers, then.
>
>If they could average 100M NPS, then a 3 minute search (40/2 average) would give
>18,000,000,000 {18 billion} nodes and 36 billion at 200 M (and I seem to recall
>a theoretical peak NPS rate of 1 billion).
>
>Since 6^12 = 2,176,782,336 [assuming a branching factor of 6 for pure
>alpha-beta with no pruning whatsoever, no null move, and with 36 moves average
>at each level] a 12 ply search should have taken only 21 seconds at 100 M NPS
>and 10.5 seconds at 200M.
>
>The math does not make sense to me.
The math does make sense.
Please calculate with me.
a) DB did do singular extensions
b) DB did no forward pruning anywhere
c) DB used NO killer moves in hardware
d) DB did 4 ply in hardware
e) DB had UGLY move ordering in hardware
f) DB did do SE at the first ply in hardware
g) DB did do checks in the qsearch
So please try a 4-ply search without hash tables, without nullmove, without
killer moves, and without searching all captures first: just a randomly
ordered 4-ply search in hardware, with an SE search on the first move as well.
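To get a feel for what that costs, here is a minimal sketch in C of such a
search: fixed depth, no hash table, no nullmove, no killer moves, and moves
searched in generation order. Position, Move, gen_moves(), make(), unmake()
and evaluate() are hypothetical placeholder names for a host engine's
interface, not anything from DB's actual code:

/* Minimal sketch: fixed-depth alpha-beta with no move ordering at all.
   Position, Move, gen_moves(), make(), unmake() and evaluate() are
   hypothetical placeholders for a host engine, not DB's interface. */
int search(Position *pos, int alpha, int beta, int depth)
{
    Move list[256];
    int i, n, score;

    if (depth == 0)
        return evaluate(pos);            /* leaf: static eval only */

    n = gen_moves(pos, list);            /* searched in generation order */
    for (i = 0; i < n; i++) {
        make(pos, &list[i]);
        score = -search(pos, -beta, -alpha, depth - 1);
        unmake(pos, &list[i]);
        if (score >= beta)
            return beta;                 /* cutoffs tend to come late here */
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}

With ordering this bad, the effective branching factor ends up well above the
roughly 6 that near-perfect ordering would give with 36 moves per position, so
even a "cheap" 4 ply in hardware costs many times the nodes of an ordered one.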
Chrilly concluded very soon that he could not do a big search in hardware; a
maximum of 2 ply was really the limit. Later on Chrilly added killer moves and
ordered captures first. As that still generated too many software nodes per
second, he had to go to 3 ply, and Chrilly added forward pruning in hardware.
Very ugly, and it makes the search tactically very weak.
The forward pruning he's doing is very crude. Something like:
if( eval >= beta ) return;
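Spelled out at the top of a node in the sketch above (illustrative only, not
Hydra's actual code), the prune amounts to this; the same idea is elsewhere
known as static nullmove or reverse futility pruning:

/* Crude forward prune: if the static eval already beats beta,
   give up on the entire subtree (illustrative, not Hydra's code). */
if (evaluate(pos) >= beta)
    return beta;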
This is why Hydra can do 2-3 ply (usually 2 ply) searches in hardware, with
nullmove. A 4-ply hardware search without nullmove and without forward pruning
is simply too expensive.
h) DB didn't get 200 million nps of course; later they estimated 130 million
   nps on average. I doubt that number. On such a slow, ugly cluster as they
   ran on, with their primitive way of searching, it is very hard to put 128
   processors to work, let alone 480.
i) The DB team had only 2 weeks to get it working in parallel with 480
   processors. That's very little time. Their main concern will have been to
   let n CPUs run without the other CPUs delaying the running CPUs.
   Clusters are really ugly when it comes to putting the other CPUs to work.
j) The DB team claimed 15% parallel efficiency when comparing 1 ASIC on
   1 processor with 32 ASICs on 1 processor.
   On top of that you must add another big loss when you move
   from 1 processor to 30 processors in software.
   Later on Hyatt posted here a figure of 5%, which the DB team admitted was
   their parallel speedup.
   Reality is worse, however: if you get 15% efficiency at 1 processor and
   another 15% again across 30 processors, the effective efficiency
   is 2.25% of the 130 million nps (see the worked arithmetic at the end of
   this post).
k) The SE depth reduction used by Diep is 3 ply. DB used 2 ply.
l) A simulation of their search has been done with Diep, with R=3 for the
   singular extensions. Note that I DID order the moves in the last 4 ply
   ideally with my move ordering; I just didn't use hash tables or killer
   moves in those last 4 ply. Diep then already needs 10 billion nodes to
   reach 10 ply (an effective branching factor of about 10, since 10^10 is
   10 billion).
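The arithmetic promised in j) above, spelled out in C (treating the two
efficiency losses as simply multiplying, as stated there; the 130 million nps
average is their own later estimate):

#include <stdio.h>

/* Sketch of the compounded-efficiency arithmetic from point j). */
int main(void)
{
    double asic_eff = 0.15;               /* 32 ASICs vs 1 ASIC on 1 cpu  */
    double smp_eff  = 0.15;               /* 30 cpus vs 1 cpu in software */
    double combined = asic_eff * smp_eff; /* 0.0225 = 2.25%               */
    double raw_nps  = 130e6;              /* estimated average nps        */

    printf("combined efficiency: %.2f%%\n", combined * 100.0);
    printf("useful nps: about %.0f of %.0f\n", raw_nps * combined, raw_nps);
    return 0;
}

That comes to roughly 2.9 million effectively useful nps out of 130 million.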