Computer Chess Club Archives


Subject: Re: Tiger against Deep Blue Junior: what really happened.

Author: Christophe Theron

Date: 11:40:49 07/27/00



On July 27, 2000 at 09:00:30, Robert Hyatt wrote:

>On July 27, 2000 at 02:51:09, Christophe Theron wrote:
>
>>On July 26, 2000 at 22:27:05, Robert Hyatt wrote:
>>
>>>On July 25, 2000 at 15:24:04, Alvaro Polo wrote:
>>>
>>>>On July 25, 2000 at 06:23:51, Ralf Elvsén wrote:
>>>>
>>>>>On July 25, 2000 at 05:05:44, blass uri wrote:
>>>>>
>>>>>>>
>>>>>>>Alvaro
>>>>>>
>>>>>>Your calculation is wrong because of diminishing return from speed.
>>>>>>
>>>>>>Uri
>>>>>
>>>>>Right or wrong belongs to pure mathematics. Here we need an estimation
>>>>>of the uncertainty. If a result is in the right neighbourhood
>>>>>it's usable.
>>>>>
>>>>>Ralf
>>>>
>>>>I am going to modify my "I am wrong" assessment. DBJ was making 750,000 nodes
>>>>per search, and CT 375,000 nodes per search, but DBJ was using only 1 second and
>>>>CT 22 secs per search. This difference compensates for the weak CPU being used
>>>>by CT. I therefore believe that this is equivalent to DBJ against CT (on a
>>>>powerful P3) if both were using the same time per search (DBJ using equal time
>>>>compensates for the P3 vs Pentium 150 MHz difference). Then the full DB, at
>>>>200 Mnps rather than 750 Knps, would be about 560 Elo higher than CT on a modern
>>>>machine, assuming that diminishing returns don't affect comp-comp matches, an
>>>>assumption which, on the other hand, has never been proven wrong.
>>>>
>>>>Alvaro
>>>
>>>I don't want to go deeper into the argument, but I can offer better numbers.
>>>
>>>WebDB was supposedly using one chip according to Hsu.  That would probably
>>>be one of the later chips at 2.4M nodes per second.  At 1/4 of a second for
>>>downloading eval terms, that leaves .75 * 2.4M = 1.8M nodes per move.
>>>
>>>He said other things were intentionally broken (no repetition detection, as it
>>>had to be stateless to work on the web) and many extensions were 'effectively'
>>>turned off, as they don't enable them for very quick searches for whatever
>>>reason...
>>>
>>>But I'd definitely go with 1.8M nodes per move as an upper bound, 1.5M as a
>>>lower bound (assuming a 20 MHz chess chip).
>>
>>
>>
>>That makes the picture even worse for poor DBJr.
>>
>>Chess Tiger was computing on average 375,000 nodes each time it had to play a
>>move.
>>
>
>DB _only_ looked at 1.5M moves _total_ for each move it played.  I thought
>you were searching much longer.



I already took into account my NPS and average search time.

Chess Tiger 11.9 (Paderborn version) on the P150 was computing 25,000 nodes per
second. It was allowed to use 15 seconds per move on average (that's how my time
allocation works in this case).


  15 x 25,000 = 375,000






>Also, remember Hsu's words about "extensions were effectively disabled" due
>to the very short search times.  I know how _my_ program would play with
>all the extensions turned off.  I have done this in testing.



And I know how my program would play without extensions. Turning them off would
never spoil a 4-times advantage in number of nodes.
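
To make the test concrete: in most programs the extensions sit behind one or two
conditions inside the search, so they can be switched off with a single flag. A
minimal sketch of the idea in C (the Position type, the helper routines and all
names here are illustrative placeholders, not code from Crafty, Tiger or Deep
Blue):

  /* Minimal sketch of a search with an extension behind a flag.  The
     Position type and the helper routines are placeholders, not any
     real engine's code.  Mate/stalemate handling is omitted. */

  typedef struct Position Position;        /* engine-specific, opaque here */

  extern int  evaluate(const Position *pos);
  extern int  in_check(const Position *pos);
  extern int  generate_moves(const Position *pos, int moves[]);
  extern void make_move(Position *pos, int move);
  extern void unmake_move(Position *pos, int move);

  int extensions_enabled = 1;              /* set to 0 to reproduce the test */

  int search(Position *pos, int depth, int alpha, int beta)
  {
      int moves[256], n, i, score;

      if (depth <= 0)
          return evaluate(pos);            /* a real engine would drop into
                                              a quiescence search here */

      /* The check extension: give forcing lines one extra ply.  With the
         flag off, they get cut short like everything else. */
      if (extensions_enabled && in_check(pos))
          depth++;

      n = generate_moves(pos, moves);
      for (i = 0; i < n; i++) {
          make_move(pos, moves[i]);
          score = -search(pos, depth - 1, -beta, -alpha);
          unmake_move(pos, moves[i]);
          if (score >= beta)
              return beta;
          if (score > alpha)
              alpha = score;
      }
      return alpha;
  }

With extensions_enabled set to 0, forcing lines get cut off at the nominal
depth, which is exactly the handicap being discussed.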







>>DBJr was computing 1.5 MILLION nodes (or more) each time it had to play a move
>>(your own numbers).
>>
>>That means that DBJr had a computational advantage of 4 times, at least, in
>>number of nodes.
>>
>
>Yes.. but didn't you use more than 1 second?  It only used 3/4 second of
>computation for each move it played.  I thought you were using 30 seconds
>or some such?




Not at all. See above.





>>We have heard many times that DB's evaluation was made of thousands of terms
>>(wasn't it 8,000 terms or so?), that this evaluation had been tuned by
>>grandmasters, and that, as it was built into the chips, they could compute
>>everything they wanted for free.
>>
>>So the microcomputer programs are supposed to have an inferior evaluation
>>function, at least that is what the propaganda machinery told us.
>>
>>If that is really the case, with a computational advantage of 4 times in number
>>of nodes and your superior evaluation function, then you are supposed to CRUSH
>>your micro opponent, aren't you?
>
>Who knows.  It _did_ in 1997.  Hsu said WebDB was crippled in several ways.
>Which makes it hard for me to evaluate its play at all.




I understand that you are not willing to understand the implications of the
Chess Tiger vs DBJr and Rebel vs DBJr matches.






>>You crush it because:
>>1) in every position you look at your evaluation is better than your opponent's
>>2) you can look at many more positions than your opponent
>>
>>But it simply did not happen. There is the 1.5-1.5 result against Tiger (you can
>>even count a 2-1 victory for DBJr if you want). And don't forget that, with a
>>faster notebook, Rebel won 3-0.
>>
>>
>>So where is the bug?
>>
>>
>>Using the excuse that the evaluation function was weaker than the real DB is
>>just an insult to us chess programmers. Maybe the excuse works for people that
>>have no chess programming experience, but it does not work with me.
>>
>
>
>Can't do anything about that.  I posted a direct quote from Hsu that he wrote
>when I asked him about the thing playing via the web.  You saw the answer.  If
>you choose to not believe him, there is nothing I can do.  Personally, having
>known him for (now) 13 years or so, I trust him.




I didn't say I don't trust him.

I'm ready to believe that its evaluation was dumbed down a little bit.

It's just that it would have had to be dumbed down dramatically to spoil its
4-times speed advantage.

If it had been dumbed down that much, you would see it in the games. You would
see obvious positional mistakes. Just look at the games and you will see that
this is not the case.

So maybe it was not the full evaluation, but I just cannot believe that this
explains why Tiger was not crushed.





>>You can do the following experiment: take your chess program and weaken its
>>evaluation by just taking into account the material balance, the pawn structure,
>>and centralization of pieces (with very simple piece square tables). Allow this
>>weakened version to compute 4 times the number of nodes that the normal version
>>is allowed to use, and let them play against each other. The "weak" version will
>>win by a large margin, because the 4 times handicap in number of nodes is simply
>>overwhelming.
>>
>
>
>That isn't true at all.  We have had such programs in the ACM events for
>many years.  They generally did "OK" but _never_ came close to beating the
>best programs nor winning the tournaments.  A factor of 10 might make it
>more interesting.  And a factor of 100 might _really_ get to be serious..




Fritz 3 winning the World Computer Chess Championship in 1995, on a plain
Pentium. Remember?






>>What I'm saying is that weakening the evaluation function is not enough to spoil
>>a 4 times advantage in number of nodes, unless of course you tell your program
>>that a rook is worth 2 pawns and a queen is worth 0.5 pawn.
>
>
>There we have to agree to disagree.  I can easily run the experiment by putting
>a return; at the right place in my eval.  I have done this more than once over
>the past 5 years, just for fun..  I never had the 'dumb' version win a match.
>Nor many games.



I have done so already.

Run the experiment and tell us about your results...

That's a basic fact of computer chess programming: you have to add or remove a
lot of "knowledge" in the evaluation to overcome a 4-times handicap/advantage in
speed.

Hsu & co know that perfectly well, and that's why they have put all their efforts
into search speed.
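
For anyone who wants to try it, here is a minimal sketch of the kind of
stripped-down evaluation I have in mind: material plus a simple centralization
piece-square table (pawn-structure terms left out to keep it short). The board
layout, piece codes and names are illustrative assumptions only, not code from
Tiger, Crafty or Deep Blue.

  /* A deliberately "dumb" evaluation: material plus a tiny centralization
     piece-square table.  Everything here is an illustrative assumption,
     not any real engine's code. */

  enum { EMPTY = 0, PAWN = 1, KNIGHT, BISHOP, ROOK, QUEEN, KING };

  static const int piece_value[7] = { 0, 100, 300, 300, 500, 900, 0 };

  /* Small bonus (in centipawns) for central squares. */
  static const int center_bonus[64] = {
      0, 0, 0, 0, 0, 0, 0, 0,
      0, 2, 2, 2, 2, 2, 2, 0,
      0, 2, 4, 4, 4, 4, 2, 0,
      0, 2, 4, 8, 8, 4, 2, 0,
      0, 2, 4, 8, 8, 4, 2, 0,
      0, 2, 4, 4, 4, 4, 2, 0,
      0, 2, 2, 2, 2, 2, 2, 0,
      0, 0, 0, 0, 0, 0, 0, 0
  };

  /* board[sq] holds +piece for White, -piece for Black, 0 for empty.
     Returns the score from White's point of view, in centipawns. */
  int weak_evaluate(const int board[64])
  {
      int sq, score = 0;

      for (sq = 0; sq < 64; sq++) {
          int p = board[sq];
          if (p > 0)
              score += piece_value[p]  + center_bonus[sq];
          else if (p < 0)
              score -= piece_value[-p] + center_bonus[sq];
      }
      return score;
  }

Give the version using something like this 4 times the nodes of the full
version and see which side of the argument the match supports.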



    Christophe


