Computer Chess Club Archives



Subject: Re: Tiger against Deep Blue Junior: what really happened.

Author: Robert Hyatt

Date: 06:00:30 07/27/00



On July 27, 2000 at 02:51:09, Christophe Theron wrote:

>On July 26, 2000 at 22:27:05, Robert Hyatt wrote:
>
>>On July 25, 2000 at 15:24:04, Alvaro Polo wrote:
>>
>>>On July 25, 2000 at 06:23:51, Ralf Elvsén wrote:
>>>
>>>>On July 25, 2000 at 05:05:44, blass uri wrote:
>>>>
>>>>>>
>>>>>>Alvaro
>>>>>
>>>>>Your calculation is wrong because of diminishing return from speed.
>>>>>
>>>>>Uri
>>>>
>>>>Right or wrong belongs to pure mathematics. Here we need an estimation
>>>>of the uncertainty. If a result is in the right neighbourhood
>>>>it's usable.
>>>>
>>>>Ralf
>>>
>>>I am going to modify my "I am wrong" assessment. DBJ was making 750,000 nodes
>>>per search, and CT 375,000 nodes per search, but DBJ was using only 1 second and
>>>CT 22 secs per search. This difference compensates for the weak CPU being used
>>>by CT. I hence believe that this is equivalent to DBJ against CT (under a powerful
>>>P3) if both were using the same time per search (DBJ using equal time
>>>compensates the P3-Pentium 150Mhz difference). Then the full DB, at 200Mnps
>>>rather than 750Knps would be about 560 Elo higher than CT on a modern machine,
>>>assuming that diminishing returns don't affect comp-comp matches, something, on
>>>the other hand, that has never been proven wrong.
>>>
>>>Alvaro
>>
>>I don't want to go deeper into the argument, but I can offer better numbers.
>>
>>WebDB was supposedly using one chip according to Hsu.  Which would probably
>>be one of the later chips at 2.4M nodes per second.  At 1/4 of a second for
>>downloading eval terms, that leaves .75 * 2.4M = 1.8M nodes per second.
>>
>>He said other things were intentionally broken (no repetition as it had to
>>be stateless to work on the web) and many extensions were 'effectively'
>>turned off as they don't enable them for very quick searches for whatever
>>reason...
>>
>>But I'd definitely go with 1.8M nodes per second as an upper bound, 1.5M
>>as a lower bound (assuming a 20MHz chess chip).
>
>
>
>That makes the picture only worse for poor DBJr.
>
>Chess Tiger was computing on average 375,000 nodes each time it had to play a
>move.
>

DB _only_ looked at 1.5M nodes _total_ for each move it played.  I thought
you were searching much longer.

Also, remember Hsu's words about "extensions were effectively disabled" due
to the very short search times.  I know how _my_ program would play with
all the extensions turned off.  I have done this in testing.
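A hedged sketch of what "extensions effectively disabled" means in practice: extensions grant extra plies in forcing lines (a check extension is the classic case), and turning them off makes the search shallower exactly where depth matters most. This is illustrative Python, not DB's or Crafty's actual search code:

```python
def child_depth(depth, in_check, extensions_enabled=True):
    """Remaining depth for a child node in a searcher with a check extension.

    With extensions disabled, a checking line gets no extra ply -- roughly
    the handicap Hsu describes for very short searches.
    """
    ext = 1 if (extensions_enabled and in_check) else 0
    return depth - 1 + ext

print(child_depth(6, in_check=True, extensions_enabled=True))   # 6: check keeps full depth
print(child_depth(6, in_check=True, extensions_enabled=False))  # 5: extension turned off
```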




>DBJr was computing 1.5 MILLION nodes (or more) each time it had to play a move
>(your own numbers).
>
>That means that DBJr had a computational advantage of 4 times, at least, in
>number of nodes.
>

Yes.. but didn't you use more than 1 second?  It only used 3/4 second of
computation for each move it played.  I thought you were using 30 seconds
or some such?
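The node-budget arithmetic behind these numbers can be checked directly, using the figures quoted earlier in the thread (one later-generation chip at 2.4M nodes per second, 1/4 of the 1-second move time lost to downloading eval terms):

```python
# Upper-bound node budget for WebDB (DBJr) per move, per the figures above.
chip_nps = 2_400_000        # nodes/sec, one later chess chip (Hsu's figure)
move_time = 1.0             # seconds per move
overhead = 0.25             # seconds lost downloading eval terms

dbjr_nodes = chip_nps * (move_time - overhead)   # .75 * 2.4M = 1.8M nodes
tiger_nodes = 375_000                            # Tiger's average per move

print(dbjr_nodes)                 # 1800000.0
print(dbjr_nodes / tiger_nodes)   # 4.8 -- "4 times, at least"
```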




>We have heard many times that DB's evaluation was made of thousands of terms
>(wasn't it 8,000 terms or so?), that this evaluation has been tuned by
>grandmasters, and that, as it was built in the chips, they could compute
>everything they wanted for free.
>
>So the microcomputer programs are supposed to have an inferior evaluation
>function, at least that is what the propaganda machinery told us.
>
>If that is really the case, with a computational advantage of 4 times in number
>of nodes and your superior evaluation function, then you are supposed to CRUSH
>your micro opponent, aren't you?


Who knows.  It _did_ in 1997.  Hsu said WebDB was crippled in several ways.
Which makes it hard for me to evaluate its play at all.


>
>You crush it because:
>1) in every position you look at your evaluation is better than your opponent's
>2) you can look at many more positions than your opponent
>
>But it simply did not happen. There is the 1.5-1.5 result against Tiger (you can
>even count a 2-1 victory for DBJr if you want). And don't forget that, with a
>faster notebook, Rebel won 3-0.
>
>
>So where is the bug?
>
>
>Using the excuse that the evaluation function was weaker than the real DB is
>just an insult to us chess programmers. Maybe the excuse works for people that
>have no chess programming experience, but it does not work with me.
>


Can't do anything about that.  I posted a direct quote from Hsu that he wrote
when I asked him about the thing playing via the web.  You saw the answer.  If
you choose to not believe him, there is nothing I can do.  Personally, having
known him for (now) 13 years or so, I trust him.



>You can do the following experiment: take your chess program and weaken its
>evaluation by just taking into account the material balance, the pawn structure,
>and centralization of pieces (with very simple piece square tables). Allow this
>weakened version to compute 4 times the number of nodes that the normal version
>is allowed to use, and let them play against each other. The "weak" version will
>win by a large margin, because the 4 times handicap in number of nodes is simply
>overwhelming.
>


That isn't true at all.  We have had such programs in the ACM events for
many years.  They generally did "OK" but _never_ came close to beating the
best programs nor winning the tournaments.  A factor of 10 might make it
more interesting.  And a factor of 100 might _really_ get to be serious..
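Alvaro's 560-Elo figure earlier in the thread corresponds to a rule of thumb of roughly 70 Elo per doubling of speed; the same rule puts the factor-of-10 and factor-of-100 handicaps mentioned here at about 230 and 465 Elo. The 70-Elo constant is an assumption, and diminishing returns (as Uri notes) likely shrink it at high speeds:

```python
import math

ELO_PER_DOUBLING = 70  # assumed rule of thumb; ignores diminishing returns

def elo_gain(speed_factor):
    """Estimated self-play Elo gain from a pure speed advantage."""
    return ELO_PER_DOUBLING * math.log2(speed_factor)

print(round(elo_gain(200e6 / 750e3)))  # 564: close to the 560 quoted above
print(round(elo_gain(10)))             # 233: "a factor of 10"
print(round(elo_gain(100)))            # 465: "a factor of 100"
```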







>What I'm saying is that weakening the evaluation function is not enough to spoil
>a 4 times advantage in number of nodes, unless of course you tell your program
>that a rook is worth 2 pawns and a queen is worth 0.5 pawn.


There we have to agree to disagree.  I can easily run the experiment by putting
a return; at the right place in my eval.  I have done this more than once over
the past 5 years, just for fun..  I never had the 'dumb' version win a match.
Nor many games.
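The "return; at the right place" experiment is easy to sketch: gate every positional term behind a flag so the crippled version scores material only. This is an illustrative Python sketch under assumed piece values, not Crafty's actual evaluation code:

```python
# Sketch of the "early return" eval experiment: with DUMB_EVAL set, the
# function returns a material-only score, skipping every positional term --
# mimicking a "return;" placed early in eval().
DUMB_EVAL = True

PIECE_VALUES = {'P': 100, 'N': 300, 'B': 300, 'R': 500, 'Q': 900}

def positional_terms(board):
    return 0  # stand-in for pawn structure, king safety, mobility, ...

def evaluate(board):
    """board: dict of square -> piece code; uppercase = side to move."""
    score = sum(PIECE_VALUES.get(p.upper(), 0) * (1 if p.isupper() else -1)
                for p in board.values())
    if DUMB_EVAL:
        return score                    # material only: the crippled version
    return score + positional_terms(board)

print(evaluate({'e4': 'P', 'd5': 'p', 'a1': 'R'}))  # 500
```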



>
>So I'm not going to buy the "weak evaluation" excuse.
>
>The same for being stateless. This is a problem for evaluating draws by
>repetition and the 50-move rule, but it is not relevant in the games I have
>posted.
>
>Or the problem is somewhere else. I don't think not doing extensions is the
>problem (especially with the number of nodes advantage) but probably not doing
>any forward pruning is not a bright move...
>
>
>But anyway... As I said, I have just seen 2 chess programs fighting. I did not
>notice that one of them was so much smarter than the other one.
>
>My conclusion is that the only advantage of Deep Blue is its speed. I don't
>believe its evaluation is superior.
>
>And that the speed superiority is somehow spoiled by the lack of serious pruning
>schemes and by the use of the useless flashy "singular extension".
>
>Microcomputer programs have serious chances against one DB chip.
>
>Of course, if Hsu produced new chips with state of the art technology, they
>would run much faster and the situation would be changed. But is this going to
>happen?
>
>
>
>    Christophe





Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.