Computer Chess Club Archives



Subject: Re: Tiger against Deep Blue Junior: what really happened.

Author: Robert Hyatt

Date: 12:09:01 07/27/00



On July 27, 2000 at 14:40:49, Christophe Theron wrote:

>On July 27, 2000 at 09:00:30, Robert Hyatt wrote:
>
>>On July 27, 2000 at 02:51:09, Christophe Theron wrote:
>>
>>>On July 26, 2000 at 22:27:05, Robert Hyatt wrote:
>>>
>>>>On July 25, 2000 at 15:24:04, Alvaro Polo wrote:
>>>>
>>>>>On July 25, 2000 at 06:23:51, Ralf Elvsén wrote:
>>>>>
>>>>>>On July 25, 2000 at 05:05:44, blass uri wrote:
>>>>>>
>>>>>>>>
>>>>>>>>Alvaro
>>>>>>>
>>>>>>>Your calculation is wrong because of diminishing returns from speed.
>>>>>>>
>>>>>>>Uri
>>>>>>
>>>>>>Right or wrong belongs to pure mathematics. Here we need an estimate of
>>>>>>the uncertainty. If a result is in the right neighbourhood, it's usable.
>>>>>>
>>>>>>Ralf
>>>>>
>>>>>I am going to modify my "I am wrong" assessment. DBJ was searching 750,000
>>>>>nodes per move and CT 375,000 nodes per move, but DBJ was using only 1
>>>>>second per move and CT 22 seconds. That difference compensates for the weak
>>>>>CPU CT was using. I therefore believe this is equivalent to DBJ against CT
>>>>>(on a powerful P3) if both were using the same time per move (giving DBJ
>>>>>equal time compensates for the P3 vs. Pentium 150MHz difference). Then the
>>>>>full DB, at 200M nps rather than 750K nps, would be about 560 Elo stronger
>>>>>than CT on a modern machine (200M/750K is roughly 267, about 8 doublings of
>>>>>speed at roughly 70 Elo per doubling), assuming that diminishing returns
>>>>>don't affect comp-comp matches, an assumption which, on the other hand, has
>>>>>never been proven wrong.
>>>>>
>>>>>Alvaro
>>>>
>>>>I don't want to go deeper into the argument, but I can offer better numbers.
>>>>
>>>>According to Hsu, WebDB was using one chip, which would probably be one of
>>>>the later chips at 2.4M nodes per second.  With 1/4 of each one-second move
>>>>spent downloading eval terms, that leaves .75 * 2.4M = 1.8M nodes per move.
>>>>
>>>>He said other things were intentionally broken (no repetition detection, as
>>>>it had to be stateless to work on the web) and many extensions were
>>>>'effectively' turned off, as they don't enable them for very quick searches,
>>>>for whatever reason...
>>>>
>>>>But I'd definitely go with 1.8M nodes per move as an upper bound, and 1.5M
>>>>as a lower bound (assuming a 20MHz chess chip at 2M nps: .75 * 2M = 1.5M).
>>>
>>>
>>>
>>>That only makes the picture worse for poor DBJr.
>>>
>>>Chess Tiger was computing 375,000 nodes on average each time it had to play
>>>a move.
>>>
>>
>>DB _only_ looked at 1.5M nodes _total_ for each move it played.  I thought
>>you were searching much longer.
>
>
>
>I already took into account my NPS and average search time.
>
>Chess Tiger 11.9 (Paderborn version) on the P150 was computing 25,000 nodes
>per second. It was allowed to use 15 seconds per move on average (that's how
>my time allocation works in this case).
>
>
>  15 x 25,000 = 375,000
>
>
>
>
>
>
>>Also, remember Hsu's words about "extensions were effectively disabled" due
>>to the very short search times.  I know how _my_ program would play with
>>all the extensions turned off.  I have done this in testing.
>
>
>
>And I know how my program would play without extensions. It would never spoil
>a 4-times advantage in number of nodes.


I would take that bet.  A year ago I ran a big experiment, since my search
extensions are all tunable.  I ran matches with every extension set to zero,
then raised each one separately in increments of .25, and I played everybody
against everybody.  The 0/.../0 extension program got totally destroyed.  It
was even beaten by versions that were way over-extending.  It fell for too
many tactical traps...
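
A minimal sketch of what "tunable extensions" can look like, assuming a
0.25-ply granularity; the names and the knob set here are hypothetical
illustrations, not Crafty's actual code:

    /* Sketch of tunable fractional-ply extensions.  One ply is split into
       PLY internal units, so an increment of 1 equals 0.25 ply, and a knob
       set to 0 disables that extension entirely.  All names and values
       here are hypothetical. */

    #define PLY 4                    /* 4 units == one full ply           */

    static int check_ext     = 0;    /* bonus when a move gives check     */
    static int recapture_ext = 0;    /* bonus for an immediate recapture  */
    static int passer_ext    = 0;    /* bonus for a passed-pawn push      */

    static int extension(int gives_check, int recapture, int passer_push)
    {
        int e = 0;
        if (gives_check) e += check_ext;
        if (recapture)   e += recapture_ext;
        if (passer_push) e += passer_ext;
        return e > PLY ? PLY : e;    /* cap at one full ply per node      */
    }

    /* In the search, each move normally costs one ply:
     *
     *     new_depth = depth - PLY + extension(...);
     *
     * With every knob at 0 the program always subtracts a full PLY, so
     * forcing lines get cut off early and it falls into tactical traps. */

Playing "everybody against everybody" is then just a round-robin over the
different knob settings.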




>
>
>
>
>
>
>
>>>DBJr was computing 1.5 MILLION nodes (or more) each time it had to play a move
>>>(your own numbers).
>>>
>>>That means that DBJr had a computational advantage of 4 times, at least, in
>>>number of nodes.
>>>
>>
>>Yes.. but didn't you use more than 1 second?  It only used 3/4 second of
>>computation for each move it played.  I thought you were using 30 seconds
>>or some such?
>
>
>
>
>Not at all. See above.
>
>
>
>
>
>>>We have heard many times that DB's evaluation was made of thousands of terms
>>>(wasn't it 8,000 terms or so?), that this evaluation had been tuned by
>>>grandmasters, and that, as it was built into the chips, they could compute
>>>everything they wanted for free.
>>>
>>>So the microcomputer programs are supposed to have an inferior evaluation
>>>function, at least that is what the propaganda machinery told us.
>>>
>>>If that is really the case, with a computational advantage of 4 times in
>>>number of nodes and your superior evaluation function, then you are supposed
>>>to CRUSH your micro opponent, aren't you?
>>
>>Who knows.  It _did_ in 1997.  Hsu said WebDB was crippled in several ways,
>>which makes it hard for me to evaluate its play at all.
>
>
>
>
>I understand that you are not willing to consider the implications of the
>Chess Tiger vs DBJr and Rebel vs DBJr matches.
>
>


I don't see any implications, any more than I see implications about my
program when I play it on ICC and it has something broken and gets into
trouble a lot.  Or when it plays with a big hardware advantage.  Or
disadvantage...





>
>
>
>
>>>You crush it because:
>>>1) in every position you look at your evaluation is better than your opponent's
>>>2) you can look at many more positions than your opponent
>>>
>>>But it simply did not happen. There is the 1.5-1.5 result against Tiger (you can
>>>even count a 2-1 victory for DBJr if you want). And don't forget that, with a
>>>faster notebook, Rebel won 3-0.
>>>
>>>
>>>So where is the bug?
>>>
>>>
>>>Using the excuse that the evaluation function was weaker than the real DB's
>>>is just an insult to us chess programmers. Maybe the excuse works on people
>>>who have no chess programming experience, but it does not work on me.
>>>
>>
>>
>>Can't do anything about that.  I posted a direct quote from Hsu that he wrote
>>when I asked him about the thing playing via the web.  You saw the answer.  If
>>you choose to not believe him, there is nothing I can do.  Personally, having
>>known him for (now) 13 years or so, I trust him.
>
>
>
>
>I didn't say I don't trust him.
>
>I'm ready to believe that its evaluation was dumbed down a little bit.
>
>It's just that it would have had to be dumbed down dramatically to spoil its
>4-times speed advantage.
>
>If it had been dumbed down that much, you would see it in the games. You would
>see obvious positional mistakes. Just look at the games and you will see that
>this is not the case.
>
>So maybe it was not the full evaluation, but I just cannot believe that this
>explains why Tiger was not crushed.
>
>
>
>
>
>>>You can do the following experiment: take your chess program and weaken its
>>>evaluation so that it takes into account only the material balance, the pawn
>>>structure, and the centralization of the pieces (with very simple
>>>piece-square tables). Allow this weakened version to compute 4 times the
>>>number of nodes the normal version is allowed to use, and let them play
>>>against each other. The "weak" version will win by a large margin, because
>>>the 4-times handicap in number of nodes is simply overwhelming.
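
A sketch of the weakened evaluation described above; every helper name here
is a hypothetical stand-in, since any real engine's terms are more elaborate:

    /* Material + pawn structure + simple centralization piece-square
       tables, and nothing else.  All helpers are hypothetical. */

    extern int material_balance(void);   /* summed piece values           */
    extern int pawn_structure(void);     /* doubled/isolated/passed pawns */
    extern int centralization[6][64];    /* one tiny table per piece type */
    extern int piece_at(int sq);         /* piece type, or -1 if empty    */
    extern int color_sign(int sq);       /* +1 white piece, -1 black      */

    int weak_evaluate(void)
    {
        int sq, p;
        int score = material_balance() + pawn_structure();

        for (sq = 0; sq < 64; sq++) {
            p = piece_at(sq);
            if (p >= 0)
                score += color_sign(sq) * centralization[p][sq];
        }
        return score;    /* no king safety, no mobility, nothing else */
    }

The experiment is then to give weak_evaluate() four times the node budget of
the full evaluation and see which side of the argument the match supports.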
>>>
>>
>>
>>That isn't true at all.  We have had such programs in the ACM events for
>>many years.  They generally did "OK" but _never_ came close to beating the
>>best programs or winning the tournaments.  A factor of 10 might make it
>>more interesting.  And a factor of 100 might _really_ get to be serious...
>
>
>
>
>Fritz3 winning in 1995. Remember?
>
>
>
>
>
>
>>>What I'm saying is that weakening the evaluation function is not enough to spoil
>>>a 4 times advantage in number of nodes, unless of course you tell your program
>>>that a rook is worth 2 pawns and a queen is worth 0.5 pawn.
>>
>>
>>There we have to agree to disagree.  I can easily run the experiment by
>>putting a return; at the right place in my eval.  I have done this more than
>>once over the past 5 years, just for fun...  I never had the 'dumb' version
>>win a match.  Nor win many games.
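
The "return; at the right place" version of that experiment, again only as a
sketch with hypothetical helper names:

    /* One early return inserted after the material count turns the full
       evaluation into a material-only one; the rest is never executed. */

    extern int material_balance(void);   /* hypothetical helper */

    int evaluate(void)
    {
        int score = material_balance();

        return score;                    /* <-- the inserted return;    */

        /* unreachable from here on:
         *   score += pawn_structure();
         *   score += king_safety();
         *   score += mobility();
         *   return score;
         */
    }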
>
>
>
>I have done so already.
>
>Run the experiment and tell us about your results...
>
>That's a basic fact of computer chess programming. You have to add or remove
>a lot of "knowledge" in the evaluation to overcome a 4-times handicap or
>advantage in speed.
>
>Hsu & co know that perfectly well, and that's why they have put all their
>efforts into search speed.
>
>
>
>    Christophe


