Computer Chess Club Archives


Subject: Re: The man behind Rybka and his thoughts as posted here on CCC.

Author: Stuart Cracraft

Date: 11:40:02 01/01/06

On January 01, 2006 at 04:31:46, Vasik Rajlich wrote:

>On December 31, 2005 at 19:24:30, Mike Byrne wrote:
>
>>Vasik Rajlich, chess programming superstar in the making, has offered many of
>>his ideas on chess programming here on CCC - prior to his release of Rybka.
>>
>>What follows are his words, extracted from the CCC archives. Interesting stuff, as
>>it provides clues about how he programmed Rybka.
>>
>>=========================================================================
>>"Interesting. You could also prepare some data structures in the positional
>>search and use them in the tactical search, a sort of root-node pre-processing
>>repeated in mid-search.
>>
>>What I'm curious about is, how expensive is a really heavy evaluation? I've put
>>into my engine a rather bloated eval and some conservative rules for doing only
>>lazy eval and still the eval takes up barely over 50% of the total search time.
>>This number seems hardly worth reducing - even reducing it to zero would only
>>double the nps, which is less than half an extra ply. It seems much more
>>important to improve the eval and to guide the search better than to worry about
>>a factor of 2 in the nps.
>>
>>Maybe I underestimate the size of a really good eval. (Mine is about 4 months
>>old.)
>>
>>Vas
>>"
>>==============================================================
>>"
>>This appears to be true for shallow searches. I've noticed that with very
>>shallow searches (like four or five ply) my program wants to play all sorts of
>>positionally weak moves which it doesn't want to play any more at seven or eight
>>ply. It seems that this also applies to extending searches deeper, according to,
>>for example, the SSDF results for different hardware. Clearly, a higher nps is
>>good.
>>
>>I am just wondering where exactly the line lies between maximizing speed and
>>adding knowledge. Consider the following hypothetical engines:
>>
>>engine A - spends 10% of its time in eval
>>engine B - spends 67% of its time in eval
>>
>>Engine A will have a 3x higher nps, so it will do slightly more than 1/2 ply
>>extra. Engine B will be spending 20x longer evaluating each position. It seems
>>that engine A will gain around 50 rating points from its deeper searches (at
>>least at time controls long enough that engine B can also make it to 8 or 9
>>ply). It's hard to believe that a few accurate eval computations couldn't
>>compensate for this.
>>
>>This would suggest that generally speaking it's quite acceptable to have some
>>expensive computations in the eval, provided you stay within some sort of an
>>acceptable limit, say not more than two-thirds of total search time.
>>
>>Vas
>>"
>>======================================================================
>>
>>"True, an evaluation which is both accurate and fast is ideal
>>
>>It would be interesting to know the exact profile of the time spent in eval of
>>the top programs. As I remember, Amir said that Junior spends "less than 10% in
>>most positions". Does this mean a lot of lazy evals and some very big evals? Or
>>is it a truly small evaluation which pinpoints the most important features of
>>the position and ignores the rest?
>>
>>In many cases, you arrive in eval needing only to show that the non-material
>>factors can't add up to more than X. (ie X = material score - alpha, or beta -
>>material score.) As X gets higher and higher, more and more things could be
>>ignored. Maybe some sort of a pinpointed lazy eval is the answer.
>>
>>It seems to me that the areas where an engine can get the most benefit are:
>>
>>A) Guiding search (ie reducing/extending): very very high
>>B) Quality of eval: very high
>>C) Overall speed, including speed of eval: medium/low, unless very big factors
>>are involved, such as building specialized hardware or getting a chance to run
>>on a big mp machine
>>
>>On the other hand, "A" and "B" are quite vague; you never really know if what
>>you did was an improvement, while "C" is very easy to measure.
>>
>>Vas
>>"
>>
>>==============================================================
>>"
>>Hello,
>>
>>It seems to me that there is a way to deal with this.
>>
>>Let's say you're at node A going to node B. You can statically evaluate node A,
>>statically evaluate the move (A->B) in the context of node A, apply the
>>appropriate reduction/extension, and enter node B with the already
>>extended/reduced search depth. In this case, node B will go into the hash table
>>with that new remaining search depth.
>>
>>If I understand the issue correctly, you'd also like to statically evaluate node
>>B before deciding on the reduction/extension for move (A->B). However, I am not
>>sure that that is necessary.
>>
>>Let's consider the typical node which is a candidate for reduction. It should
>>have one of two qualities: A) it is extremely static, or B) it looks
>>unpromising, ie its score is somewhat below alpha.
>>
>>In the case of very static nodes, it's pretty clear that you shouldn't need to
>>also look at the next move/node. In the case of an unpromising node, I was
>>thinking that you could do something like identify the few factors which could
>>change the evaluation. For example, if you're down two pawns but there are some
>>vague attacking prospects, then king safety would be that factor, and moves that
>>attack the king would be extended/not reduced, while moves which attack pawns on
>>the queenside are reduced. If you have an advanced passed pawn, then pushing
>>that pawn or supporting its advance would be the one way to bring the score
>>back, and moves which support that would be extended/not reduced. In all of
>>these cases, it seems to me to be enough to only consider the original node
>>(node A) and the moves themselves. The static evaluation of the next node can
>>then be done already knowing the new depth.
>>
>>Unfortunately I haven't started to implement this, so there might be some issues
>>here.
>>
>>At any rate, any reports on your experience in implementing this are welcome.
>>There appears to be almost nothing in the chess literature about this. The best
>>I found was Ed Schroeder's description of the pruning in Rebel, but he
>>concentrates on pruning positions with very bad scores rather than on pruning
>>irrelevant moves.
>>
>>It does also seem to me that this is "the key" to a good program, or at least
>>that it could be "a key".
>>
>>Vas
>>"
>>======================================================================
>>"
>>I don't know how much luck you had with the initial debugging of your program.
>>In my case I had to do some really tedious debugging involving stepping through
>>the tree node-by-node. One quite funny bug I had occurred in the search
>>after 1. e4 e5. My program was analyzing the nonsense variation 2. Nf3 Nc6 3. d4
>>Bb4+ 4. Bd2 Nf6 5. Nxe5 Nxe4 6. Nxc6 Nxd2 7. Nxd8 Nxf1+ - and crashing.
>>Actually, from a computer chess point of view, this is a perfectly normal
>>variation, I think; I doubt that there is any sense in trying to prune it.
>>
>>Basically, if you're going to search several million nodes, you will look at a
>>lot of junk. The point of a smart search, as I see it, is to try to somehow
>>increase the relevance of the variations very very slightly, maybe into the 0.1%
>>range, rather than 0.05%. This means that you can invest some of your available
>>nodes into various "lottery ticket" variations, ie ones that probably won't
>>work, but you never know.
>>
>>Re. Qxf7, as you pointed out, the investment won't be so high because you will
>>be failing low throughout the entire resulting search. It will be even cheaper
>>than that because:
>>
>>1) It will fail low right after an immediate null move.
>>2) It removes material from the board, further reducing the branching factor for
>>that variation.
>>3) It creates a big material imbalance so you may be able to A) use lazy eval in
>>the subtree B) make some reductions later in the tree once things quiet down.
>>(For example, a quiet move with a big material deficit can be a reduction
>>criterion.)
>>4) There are only two replies, again reducing the cost.
>>
>>It seems that a reduction/extension mechanism shouldn't be trying to figure out
>>which moves are actually good. It should be trying to figure out which moves are
>>worth searching, based on:
>>
>>1) The very small chance that those moves will eventually be in the 0.001% of
>>moves which are either part of the PV or force a change in the PV, rather than
>>one of the 99.999% which could have been skipped.
>>2) The expected cost of the search.
>>
>>Cheers,
>>Vas
>>"
>>========================================================================
>>"I more or less agree with this, but there is one caveat: when you first begin,
>>you have to decide whether to use bitboards or not. If you don't, you'll never
>>switch to them without a complete rewrite. Based on my experience so far I would
>>strongly recommend using bitboards. They are not that complicated, they were not
>>so difficult to debug (as I had slightly feared), and quite a number of
>>evaluation terms can be very elegantly & efficiently expressed with them.
>>
>>For example, I use a bunch of precomputed masks to test for various pawn & piece
>>configurations around the king. I have offline code which precomputes each of
>>these masks, and creates a .h file of "const BitBoard weak_sq_g4_p = 0x..", and
>>when each of these is used in eval the compiler will substitute it at compile
>>time.
>>
>>Vas
>>"
>>=========================================================================
>>"I have the theory that the greater your search resources (ie combination of
>>time
>>and hardware), the less important is the search, and the more important is the
>>evaluation. I have played around quite a bit with various extensions recently
>>and they usually are quite good in testsuites, but not at all clear in real
>>games. It's very easy to have side effects from extensions. At short time
>>controls, you need to use extensions and reductions in order to pay special
>>attention to tactics, to avoid getting killed tactically. At some point, though,
>>you can search enough nodes that the tactics are naturally taken care of by a
>>plain search, which also naturally takes care of good positional play.
>>
>>Of course a good clever (ie extending & reducing) search must always be better.
>>Just something to keep in mind - in the future, hardware will improve, and the
>>most important games will always have longer time controls and bigger hardware.
>>
>>Vas
>>"
>>=====================================================================
>>
>>There's much more there too -- I just selected the more interesting (for me)
>>tidbits.
>>
>>Best,
>>
>>Michael
>
>Michael,
>
>aahhhh - it's always interesting to dig up old notes.
>
>Just a little disclaimer - posting here is a form of thinking. In other words,
>what is posted here is from people who feel that they don't yet understand the
>topic. As soon as they do, the posting stops :-)
>
>Vas

Sounds like you are speaking for yourself and some others, and I understand
that.

But certainly this won't apply to some people like Bob Hyatt, Uri Blass,
Vincent Diepeeven and others who understand and continue posting to help
others.

It's a tradeoff of course in time and, for now, you ride the crest of the
wave at the top. Savor your experience first. But I doubt you have
time!!!

Enjoy,

Stuart


