Computer Chess Club Archives




Subject: Re: The man behind Rybka and his thoughts as posted here on CCC.

Author: Vasik Rajlich

Date: 01:31:46 01/01/06

Go up one level in this thread

On December 31, 2005 at 19:24:30, Mike Byrne wrote:

>Vasik Rajlich , chess programming superstar in the making, has offered many of
>his ideas on chess programming here on CCC - prior to his release of Rybka.
>What follws are his words extracted from the CCC archives.  Interesting stuff as
>it provides clues into how he programmed Rybka.
>"Interesting. You could also prepare some data structures in the positional
>search and use them in the tactical search, a sort of root-node pre-processing
>repeated in mid-search.
>What I'm curious about is, how expensive is a really heavy evaluation? I've put
>into my engine a rather bloated eval and some conservative rules for doing only
>lazy eval and still the eval takes up barely over 50% of the total search time.
>This number seems hardly worth reducing - even reducing it to zero would only
>double the nps, which is less than half an extra ply. It seems much more
>important to improve the eval and to guide the search better than to worry about
>a factor of 2 in the nps.
>Maybe I underestimate the size of a really good eval. (Mine is about 4 months
>This appears to be true for shallow searches. I've noticed that with very
>shallow searches (like four or five ply) my program wants to play all sorts of
>positionally weak moves which it doesn't want to play any more at seven or eight
>ply. It seems that this also applies to extending deeper searches, according to
>for example the SSDF results for different hardware. Clearly, a higher nps is
>I am just wondering where exactly the line lies between maximizing speed and
>adding knowledge. Considering the following hypothetical engines:
>engine A - spends 10% of its time in eval
>engine B - spends 67% of its time in eval
>Engine A will have a 3x higher nps, so it will do slightly more than 1/2 ply
>extra. Engine B will be spending 20x longer evaluating each position. It seems
>that engine A will gain around 50 rating points from its deeper searches (at
>least at time controls long enough that engine B can also make it to 8 or 9
>ply). It's hard to believe that a few accurate eval computations couldn't
>compensate for this.
>This would suggest that generally speaking it's quite acceptable to have some
>expensive computations in the eval, provided you stay within some sort of an
>acceptable limit, say not more than two-thirds of total search time.
>"True, an evaluation which is both accurate and fast is ideal
>It would be interesting to know the exact profile of the time spent in eval of
>the top programs. As I rememember Amir said that Junior spends "less than 10% in
>most positions". Does this mean a lot of lazy evals and some very big evals? Or
>a truly small evaluation which pinpoints the most important features of the
>position and ignores the rest.
>In many cases, you arrive in eval needing only to show that the non-material
>factors can't add up to more than X. (ie X = material score - alpha, or beta -
>material score.) As X gets higher and higher, more and more things could be
>ignored. Maybe some sort of a pinpointed lazy eval is the answer.
>It seems to me that the areas where an engine can get the most benefit are:
>A) Guiding search (ie reducing/extending): very very high
>B) Quality of eval: very high
>C) Overall speed, including speed of eval: medium/low, unless very big factors
>are involved, such as building specialized hardware or getting a chance to run
>on a big mp machine
>On the other hand, "A" and "B" are quite vague, you never really know if what
>you did was an improvement, while "C" is very easy to measure.
>It seems to me that there is a way to deal with this.
>Let's say you're at node A going to node B. You can statically evaluate node A,
>statically evaluate the move (A->B) in the context of node A, apply the
>appropriate reduction/extension, and enter node B with the already
>extended/reduced search depth. In this case, node B will go into the hash table
>with that new remaining search depth.
>If I understand the issue correctly, you'd also like to statically evaluate node
>B before deciding on the reduction/extension for move (A->B). However, I am not
>sure that that is necessary.
>Let's consider the typical node which is a candidate for reduction. It should
>have one of two qualities: A) it is extremely static B) it looks unpromising, ie
>score somewhat below alpha.
>In the case of very static nodes, it's pretty clear that you shouldn't need to
>also look at the next move/node. In the case of an unpromising node, I was
>thinking that you could do something like identify the few factors which could
>change the evaluation. For example, if you're down two pawns but there are some
>vague attacking prospects, then king safety would be that factor, and moves that
>attack the king would be extended/not reduced, while moves which attack pawns on
>the queenside are reduced. If you have an advanced passed pawn, then pushing
>that pawn or supporting its advance would be the one way to bring the score
>back, and moves which support that would be extended/not reduced. In all of
>these cases, it seems to me to be enough to only consider the original node
>(node A) and the moves themselves. The static evaluation of the next node can
>then be done already knowing the new depth.
>Unfortunately I haven't started to implement this, so there might be some issues
>At any rate, any reports on your experience in implementing this are welcome.
>There appears to be almost nothing in the chess literature about this. The best
>I found was Ed Schroeder's description of the pruning in Rebel, but he
>concentrates on pruning positions with very bad scores rather than on pruning
>irrelevant moves.
>It does also seem to me that this is "the key" to a good program, or at least
>that it could be "a key".
>I don't know how much luck you had with the initial debugging of your program.
>In my case I had to do some really tedious debugging involving stepping through
>the tree node-by-node. One quite funny bug I had was occurring in the search
>after 1. e4 e5. My program was analyzing the nonsense variation 2. Nf3 Nc6 3. d4
>Bb4+ 4. Bd2 Nf6 5. Nxe5 Nxe4 6. Nxc6 Nxd2 7. Nxd8 Nxf1+ - and crashing.
>Actually, from a computer chess point of view, this is a perfectly normal
>variation I think, I doubt that there is any sense in trying to prune it.
>Basically, if you're going to search several million nodes, you will look at a
>lot of junk. The point of a smart search, as I see it, is to try to somehow
>increase the relevance of the variations very very slightly, maybe into the 0.1%
>range, rather than 0.05%. This means that you can invest some of your available
>nodes into some various "lottery tickets" variations, ie. probably won't work,
>but you never know.
>Re. Qxf7, as you pointed out, the investment won't be so high because you will
>be failing low throughout the entire resulting search. It will be even cheaper
>than that because:
>1) It will fail low right after an immediate null move.
>2) It removes material from the board, further reducing the branching factor for
>that variation.
>3) It creates a big material imbalance so you may be able to A) use lazy eval in
>the subtree B) make some reductions later in the tree once things quiet down.
>(For example, a quiet move with a big material deficit can be a reduction
>4) There are only two replies, again reducing the cost.
>It seems that a reduction/extension mechanism shouldn't be trying to figure out
>which moves are actually good. It should be trying to figure out which moves are
>worth searching, based on:
>1) The very small chance that those moves will eventually be in the 0.001% of
>moves which are either part of the PV or force a change in the PV, rather than
>one of the 99.999% which could have been skipped.
>2) The expected cost of the search.
>"I more or less agree with this, but there is one caveat: when you first begin,
>you have to decide whether to use bitboards or not. If you don't, you'll never
>switch to them without a complete rewrite. Based on my experience so far I would
>strongly recommend using bitboards. They are not that complicated, they were not
>so difficult to debug (as I had slightly feared), and quite a number of
>evaluation terms can be very elegantly & efficiently expressed with them.
>For example, I use a bunch of precomputed masks to test for various pawn & piece
>configuration around the king. I have offline code which precomputes each of
>these masks, and creates a .h file of "const BitBoard weak_sq_g4_p = 0x..", and
>when each of these is used in eval the compiler will substitute it at compile
>"I have the theory that the greater your search resources (ie combination of
>and hardware), the less important is the search, and the more important is the
>evaluation. I have played around quite a bit with various extensions recently
>and they usually are quite good in testsuites, but not at all clear in real
>games. It's very easy to have side effects from extensions. At short time
>controls, you need to use extensions and reductions in order to pay special
>attention to tactics, to avoid getting killed tactically. At some point, though,
>you can search enough nodes that the tactics are naturally taken care of by a
>plain search, which also naturally takes care of good positional play.
>Of course a good clever (ie extending & reducing) search must always be better.
>Just something to keep in mind - in the future, hardware will improve, and the
>most important games will always have longer time controls and bigger hardware.
>There's much more there too -- I just selected teh more interesting (for me)


aahhhh - it's always interesting to dig up old notes.

Just a little disclaimer - posting here is a form of thinking. In other words,
what is posted here is from people who feel that they don't yet understand the
topic. As soon as they do, the posting stops :-)


This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.