Computer Chess Club Archives



Subject: Re: DB will never play with REBEL, they simply are afraid not to do well

Author: Robert Hyatt

Date: 18:11:05 10/14/99



On October 14, 1999 at 20:07:08, Ratko V Tomic wrote:

>>That is simply _totally_ wrong.  Using micros, we decide what we want to eval,
>>we try it, and decide whether the gain from the knowledge is worth the cost in
>>search speed/depth.  Hsu does the same, except rather than choosing whether
>>to use it based on speed, he decides whether to use it or not based on
>>whether he wants to design the hardware to handle it.
>>
>>_EXACTLY_ the same issue, just in a slightly different context.
>
>Well, they are the same if you look from high enough up and see only the
>generic stages: choices, prototyping and decision, implementation.  That far,
>they're similar.  But micro programmers have a more stringent reality check in
>the five or ten other guys, as smart and creative or better, competing on the
>same hardware with the same tools.  So, if they're to be the best, their
>choices have to be the best.  The DB team has no real competition, i.e. if
>they lose to the human champion, they can just say: well, we did the best with
>the given means (and no genuine test exists to tell us whether that was so,
>it's only their perfectly expected belief), but the champion was much too
>strong.  For a micro guy, there is always a genuine test; falsity or
>less-than-top quality will fail in competition on an equal footing.
>


The DB guys competed with everyone else from 1986 on, until the ACM events
ended in 1995.  They were keenly aware of what everyone was doing, and of what
they had to do to stay on top.  They did...


>The DB team's situation is more like the former Soviet industry's (not so
>surprising for something out of IBM): no genuine competition, no real check
>other than a self-imposed one, as it suits them (note especially the
>restrictions on DB or DB Jr. play against other programs; the general
>avoidance of genuine tests).  We all know how Soviet industry & economy
>eventually fared when faced with a genuine reality check.
>


Exactly _what_ would you call two matches against the strongest human playing
chess?  "No real check"?  That statement totally goes over my head.  They had
a better 'check' than anyone else.  I.e., all we have for the commercial
programs is the inbreeding of ratings from computer vs. computer play on the
SSDF list, which is the most quoted source of commercial program strength
measurements.  The DB guys went _far_ beyond that...


>I do think that if the top commercial programs were to get the same-speed
>hardware as DB, at least a couple of them (Hiarcs and Rebel, at least) would
>be stronger against the top humans than DB.  Unfortunately, this is at present
>an untestable belief, but if DB ever gives micros a match chance, we may be
>able to extrapolate some strength relation adjusted for the hardware.


You simply don't understand the issue.  They could _not_ be "as fast" as
deep blue unless they _became_ deep blue.  DB's eval is in hardware, to keep
the speed up; to get that speed, a micro would have to use DB's eval.  DB's
search is in hardware, to keep the speed up; to get that speed, a micro would
have to use DB's search.  What do you end up with?  DB.

The speed issues controlled a lot of what went into deep blue.  The only way
any other program could go that fast would be to wait through log2(2000)
'doublings' of hardware speed, giving them credit for being 200x faster in raw
nps, and 10x more complex in their eval, than any other program.  log2(2000)
is about 11, which is a good long while.  If we see 2x per year, another
eleven years must pass.  So far we aren't doubling every year, more like every
18 months, and it has slowed over the past year or two to more like every 2
years or so; that pushes it out to roughly 16-22 years.  All you have to do is
wait that long and we might be able to catch up.  Of course, they could also
re-do their hardware in 20 years and then they would be right back at 2000
times faster.  An endless circle...
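
For concreteness, a minimal sketch of that arithmetic in C.  The 2000x figure
is just the 200x-nps-times-10x-eval estimate above, not a measurement:

    /* Catch-up arithmetic: how many hardware doublings, and how many
       years, to close an assumed 2000x effective speed gap. */
    #include <stdio.h>
    #include <math.h>

    int main(void) {
      double gap = 200.0 * 10.0;              /* 200x nps * 10x eval   */
      double doublings = log(gap) / log(2.0); /* log2(2000), about 11  */
      double years_per_doubling[] = { 1.0, 1.5, 2.0 };
      int i;

      printf("doublings needed: %.1f\n", doublings);
      for (i = 0; i < 3; i++)
        printf("at one doubling per %.1f years: %.0f years\n",
               years_per_doubling[i], doublings * years_per_doubling[i]);
      return 0;
    }

Run it and you get roughly 11 doublings, i.e. about 11, 16, and 22 years at
the three doubling rates.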



>
>
>> If you really
>>believe that others are more creative than the DB team, you are _sadly_
>>mistaken.
>>
>
>Well, if, say, you take Ed vs. Hsu, we can say, without knowing the best
>ideas in their work, that there is a 50% a priori chance Ed is more creative
>(in his chess programming work).  And the same goes for all the top chess
>programmers.  So the odds that Hsu is more creative than, say, each of the
>top 5 micro programmers are 1 in 32, i.e. 3.1%.  In other words, the odds
>that at least one top micro programmer is more creative than Hsu are about
>97%, which supports my hypothesis well (arrived at another way).
>



Except your original assumption is a fallacy: 50% isn't a reasonable
estimate, and there is no 'probability' in the equation at all.
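
For reference, here is the arithmetic being disputed, spelled out under the
coin-flip assumption made above (each pairwise comparison an independent
50/50 event), which is exactly the premise I don't accept:

    /* (1/2)^5 = 1/32: where the quoted 3.1% / 97% figures come from,
       under the independence assumption disputed above. */
    #include <stdio.h>

    int main(void) {
      int i, rivals = 5;
      double p_none = 1.0;   /* chance no rival is "more creative" */

      for (i = 0; i < rivals; i++)
        p_none *= 0.5;       /* one independent fair coin per rival */

      printf("P(beats all %d rivals)  = %.4f (1 in %d)\n",
             rivals, p_none, 1 << rivals);
      printf("P(at least one better) = %.1f%%\n",
             100.0 * (1.0 - p_none));
      return 0;
    }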



>
>>
>>A good eval doesn't have opening positions that it can't handle.  My program
>>now plays fianchetto openings quite well, yet it didn't 3 years ago, because
>>it now has a better understanding of what is required.  My favorite GM is
>>trying to break it on both sides of the King's Indian, to see what it can't
>>follow very well.  So far, as black, it is crushing him every game.
>>
>Does version 16.6 (included as an engine with CB programs) have this code?  I
>might try its fianchetto handling.  Among other fianchetto problems, Fritz
>and Hiarcs, for example, will give up their fianchetto bishop for a knight
>and a pawn early in the middlegame.  Sometimes they'll even exchange it for
>the opponent's centralized bishop.
>

Exchanging the fianchettoed bishop for a knight + pawn is not a bad idea;
exchanging it for a knight alone is not a good idea.  But yes, the code has
been in Crafty for at least 3 years...
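
Crafty's source is public, but rather than quote it, here is a minimal
hypothetical sketch of the kind of term being talked about.  The Board
struct, square indices, piece codes, and the 25-centipawn value are all
illustrative, not Crafty's actual code:

    /* Hypothetical eval fragment: penalize losing the fianchettoed
       bishop while the g3 pawn shell is still present, since the
       long-diagonal holes around the king stay weak without it.  */
    #include <stdio.h>

    enum { EMPTY, W_PAWN, W_BISHOP, W_KING };

    typedef struct { int sq[64]; } Board;   /* a1 = 0 ... h8 = 63 */

    #define F1  5
    #define G1  6
    #define G2 14
    #define G3 22

    #define FIANCHETTO_BISHOP_GONE 25   /* centipawns, illustrative */

    int white_fianchetto_penalty(const Board *b) {
      /* King castled short with a pawn on g3: a fianchetto shell. */
      if (b->sq[G1] == W_KING && b->sq[G3] == W_PAWN)
        /* The bishop belongs on g2 (or still at home on f1).      */
        if (b->sq[G2] != W_BISHOP && b->sq[F1] != W_BISHOP)
          return -FIANCHETTO_BISHOP_GONE;
      return 0;
    }

    int main(void) {
      Board b = {{0}};
      b.sq[G1] = W_KING;
      b.sq[G3] = W_PAWN;     /* shell intact, bishop traded away */
      printf("penalty: %d\n", white_fianchetto_penalty(&b));  /* -25 */
      return 0;
    }

A real term would also cover the black side and queenside fianchettoes, and
look at whether the opponent still has the pieces to exploit the holes.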




>
>>>
>>>Another problem that opening may have for a program is that there may be
>>>some variation a few moves beyond the book where the deeper (but by
>>>necessity inexact) evaluation misleads the program.  It is an analog of the
>>>opening traps in human games, except that here the poisoned gift isn't an
>>>offer of a seemingly free piece one or two moves ahead, but may be some
>>>tricky combination 14 plies deep that nobody ever thought of.  In
>>>human-oriented opening theory much of such stuff is invisible (unknown even
>>>to specialists), like underwater rocks, waiting to sink a brute searcher
>>>which discovers the "gaining" combination.
>>
>>hmmm... I have seen 14-ply searches from Crafty in the middlegame.  I have
>>seen DB go incredibly deep, with PVs well over 40 plies long.  I doubt that
>>will be a serious problem for them.
>>
>>
>I think there was a bit of misunderstanding here.  I wasn't saying that
>programs won't see something 14 plies deep, but that they _will_ see some
>apparent 14-ply gain which is actually a poisoned gain (of the same kind as
>the more obvious gains in the well-known opening traps), one that was always
>in that opening but that human opening theory never warned about, since
>humans normally wouldn't see some long tricky combination to "win" a poisoned
>pawn.  So these are traps hidden from the human eye, waiting for the programs
>to step into them.
>
>>>searcher performs systematically worse, the deeper it goes.  Some examples
>>>of such "pathology" (as it is called) are given in the book "Heuristics" by
>>>Judea Pearl, 1984, Addison-Wesley, ISBN 0-201-05594-5 (in chapters 10.1 and
>>>10.2, pages 346-360).  The main cause of the pathology in the example
>>>(Nau's board-splitting game) was evaluation error propagation, which made
>>>the error larger as the values were backed up the tree.
>>>
>>
>>
>>I don't think you will find that true of current programs.  Maybe a rare
>>exceptional position here and there, but _very rare_.  And very fixable when
>>we find them.
>>
>>
>Well, in a match between two identical programs set to search depths N and
>N+1, the shallower one will win some good percentage of games, maybe 20-30%.
>So there must have been a good number of positions where _ultimately_ the
>deeper search found a substantially worse move (a losing move) than the
>shallower one did.  Namely, it may be true that the evaluation will say such
>positions are rare, but the ultimate judge is not the (fundamentally
>inaccurate) evaluation function but the final outcome.  So there is no
>question that such positions exist, and their number is not at all
>vanishingly small.  Therefore there may be a general strategy (including
>openings) which could guide present-day alpha-beta searchers into a sample of
>games biased in favor of such positions.  Our present-day chess strategy is
>simply not geared toward this objective, and programs are appearing stronger
>than they will actually turn out to be.
>
>
>>
>>that is why we have the quiescence search... to remove much of the dynamic
>>stuff so that the position _is_ quiet when we evaluate it.  This doesn't seem
>>to be much of a problem today.
>>
>
>The quiescence search does bring the conditions of evaluation closer to each
>other across different nodes, but it still doesn't resolve the problem of
>estimating the evaluation error and its dependence on the position.  The
>dynamic element isn't nearly exhausted by the immediate captures and checks.
>The expected evaluation-function error would be a function of other
>evaluation parameters, plus other position parameters related to dynamic
>elements (even after adjustment for a given type of quiescence search), which
>may not be relevant (or useful enough in an evaluation role to be widely
>used) for the evaluation function itself.
>
>As I described in another note in this thread, it may be useful to obtain an
>error estimate by collecting statistics as a byproduct of the search, e.g. if
>the node search order happens to be "wrong" several times (as it always
>does), that produces several full values (as opposed to bounds) for those
>nodes, and the spread of such values is an indicator of the error margin and
>the volatility of the position.
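
That last idea is easy to prototype.  A minimal sketch, assuming a plain
fail-hard negamax over a toy random tree (everything below is illustrative;
none of it is from a real engine): a root move whose score comes back
strictly inside the (alpha, beta) window is an exact value rather than a
bound, and the spread of those exact values is the volatility statistic
described above.

    /* "Spread of exact values" statistic: search a toy tree with
       negamax, record every root score that falls strictly inside
       the window (exact, not a bound), and report the spread.    */
    #include <stdio.h>

    #define BRANCH 4
    #define DEPTH  6

    /* Toy leaf evaluation: a deterministic pseudo-random score. */
    static int leaf_eval(unsigned id) {
      id = id * 2654435761u + 12345u;
      return (int)(id % 201) - 100;          /* range [-100, 100] */
    }

    static int negamax(unsigned id, int depth, int alpha, int beta) {
      int i, score;
      if (depth == 0)
        return leaf_eval(id);
      for (i = 0; i < BRANCH; i++) {
        score = -negamax(id * BRANCH + i + 1, depth - 1, -beta, -alpha);
        if (score >= beta)
          return beta;                       /* fail high: bound only */
        if (score > alpha)
          alpha = score;
      }
      return alpha;
    }

    int main(void) {
      int i, score, n_exact = 0, lo = 0, hi = 0;
      int alpha = -1000, beta = 1000;

      for (i = 0; i < BRANCH; i++) {
        score = -negamax((unsigned)(i + 1), DEPTH - 1, -beta, -alpha);
        printf("root move %d: score %d\n", i, score);
        if (score > alpha && score < beta) { /* exact value, not bound */
          if (n_exact == 0 || score < lo) lo = score;
          if (n_exact == 0 || score > hi) hi = score;
          n_exact++;
          alpha = score;  /* "wrong" ordering: each new best is exact */
        }
      }
      if (n_exact > 1)
        printf("%d exact values, spread = %d (volatility estimate)\n",
               n_exact, hi - lo);
      return 0;
    }

The more often the move ordering is "wrong" (the more distinct exact values
show up), and the wider their spread, the more volatile the position.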


