Author: Ernst A. Heinz
Date: 19:37:11 01/26/00
On January 26, 2000 at 21:57:18, Peter W. Gillgasch wrote:

>On January 26, 2000 at 21:11:25, Ernst A. Heinz wrote:
>
>>On January 26, 2000 at 20:03:56, Peter W. Gillgasch wrote:
>>>
>>>On January 26, 2000 at 16:26:08, Robert Hyatt wrote:
>>>
>>>>[...]
>>>>
>>>>This would change if some of this stuff backs up into the software part of
>>>>the search, of course... But we seem to be talking only about the q-search
>>>>as implemented in hardware, and every node saved is N nanoseconds saved,
>>>>period.
>>>
>>>Bob I really hate it when we share the same opinion 8^)
>>
>>No, we are not only talking about the quiescence search. We are talking
>>about the last full-width plies (without hash tables!) plus the quiescence
>>search. I do not really know how good the move ordering can be in this
>>setting.
>
>Funny that you mention this. It can be absolutely terrific without
>trans / ref and other memory intensive dynamic re-ordering means.

VERY interesting -- do you intend to share your wisdom with the rest of us?
Could it be SEE-augmented MVV/LVA ordering?

>I have a program that can be compiled to work without any trans/ref
>stuff and with the notable exception of endgames there is basically
>no difference in terms of the branching factor or depth reached. I
>was totally amazed by that. If they did things right the gains of
>a trans/ref are really modest, especially if you consider the
>vast cost of them.

Sounds like you are still really deep into programming washing-machine
CPUs ... :-)

>>In his IEEE Micro article Hsu mentions an average cycle count of 10 per
>>node.
>
>Hm, this is exactly in line with my 8 clocks for tree traversal plus
>the 2 clocks I estimated for the fast eval and the cutoff decision :)

Yes, the basic overall design pretty much resembles the predecessors.

>>He times the "slow" part of the evaluation at an additional 11 cycles
>>overall with a 3-cycle latency per column. It does not look like the "slow"
>>part of the evaluation is further overlapped with other stuff. So it
>>definitely hurts the NPS rate.
>
>This doesn't say anything about your argument that the fail low nodes
>matter. A "fast" fail low node costs 10 cycles to be traversed and
>rejected, a "slow" fail low node costs 21 cycles if your information
>is correct. If 20% of the nodes are of that nature and if I give you
>the benefit of the doubt that there are some root positions in which
>all nodes of that nature are "slow" and other positions where all nodes
>of that nature are "fast" then the difference in terms of nodal rate
>is in the 10% range in this worst case scenario since 20% of the
>nodes roughly double in execution time. If there is a flux of 50%
>between "fast" and "slow" evals at those nodes we are at 5%, which
>means that the average time to process a node would rise from 10
>clock cycles to 10.5 clock cycles. I would call that pretty much
>constant and "hurts" is probably too strong a word for that
>difference.

Okay, when it comes to discussing the relative percentages I agree. But
now we are back at arguing about by *how much* the NPS rate might vary.
My point was that it can vary *at all*.

Moreover, the "fast"/"slow" difference does not only apply with "stand
pat" *cutoffs* but also with the normal "stand pat" evaluation at
terminal nodes. This potential slowdown certainly adds even more variety
to the NPS rate of DB -- especially because Hsu writes the following in
his IEEE Micro article.

"The DB chess chips use a more elaborate evaluation scheme which
disables lazy evaluation when an unusual position occurs."

He goes on to name low-material endgame positions as belonging to this
"unusual" kind.

=Ernst=
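For readers who have not met the scheme mentioned above: MVV/LVA (most valuable victim / least valuable aggressor) orders captures by the value of the captured piece first and breaks ties with the cheapest attacker; a static exchange evaluator (SEE) can then demote captures that lose material. Whether this is what Gillgasch actually uses is not said in the post. A minimal sketch of the MVV/LVA score alone, with hypothetical piece codes and values of my choosing:

```c
#include <assert.h>

/* Hypothetical piece codes and relative values (pawn = 1, king huge). */
enum { PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING };
static const int val[6] = { 1, 3, 3, 5, 9, 100 };

/* MVV/LVA score: most valuable victim first, least valuable attacker
 * as the tie-breaker.  The factor 100 exceeds the largest attacker
 * value, so the victim always dominates.  Higher score = search first. */
int mvv_lva(int victim, int attacker)
{
    return val[victim] * 100 - val[attacker];
}
```

A move sorter would compute this score for every capture in the move list and try them in descending order, optionally skipping or postponing captures whose SEE result is negative.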
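The cycle arithmetic in the quoted paragraph (10-cycle "fast" nodes, 21-cycle "slow" nodes, a varying fraction of slow nodes) is easy to play with directly. A throwaway sketch, using only the numbers quoted above:

```c
/* Average cycles per node when a fraction `slow_frac` of the nodes
 * needs the slow evaluation.  Costs are the ones quoted in the post:
 * 10 cycles for a fast node, 21 for a slow one. */
double avg_cycles(double slow_frac)
{
    const double fast = 10.0, slow = 21.0;
    return (1.0 - slow_frac) * fast + slow_frac * slow;
}
```

Plugging in the post's fractions: with no slow nodes the average is 10.0 cycles, and with 20% slow nodes it comes out at 12.2 cycles, so the reader can judge the NPS swing between the two extremes for any assumed mix.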
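The "fast"/"slow" split Hsu describes is in essence lazy evaluation: a cheap score plus a safety margin settles most stand-pat decisions, and the expensive terms are computed only when the cheap score lands inside the alpha-beta window, or when laziness is disabled for an "unusual" position. A software sketch of that idea, not a description of DB's actual circuitry (the function and parameter names are mine):

```c
/* Lazy-evaluation sketch: decide on the fast score alone when it is
 * safely outside the (alpha, beta) window by at least `margin`;
 * otherwise, or when `unusual` disables laziness (e.g. a low-material
 * endgame), pay for the full evaluation. */
int evaluate(int fast_score, int slow_terms, int alpha, int beta,
             int margin, int unusual)
{
    if (!unusual) {
        if (fast_score - margin >= beta)
            return fast_score;       /* certain fail high: stay fast */
        if (fast_score + margin <= alpha)
            return fast_score;       /* certain fail low: stay fast */
    }
    return fast_score + slow_terms;  /* full ("slow") evaluation */
}
```

The margin must bound how far the slow terms can move the score; if it does not, the lazy cutoff can return a wrong value, which is presumably why DB turns the mechanism off in positions where the slow terms dominate.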
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.