Author: Christophe Theron
Date: 11:02:01 01/25/00
On January 24, 2000 at 23:29:07, Robert Hyatt wrote:
>On January 24, 2000 at 16:10:20, Christophe Theron wrote:
>
>>On January 24, 2000 at 09:21:32, Robert Hyatt wrote:
>>
>>>On January 23, 2000 at 22:56:04, Christophe Theron wrote:
>>>
>>>>On January 23, 2000 at 03:35:35, Bruce Moreland wrote:
>>>>
>>>>>On January 23, 2000 at 02:51:55, Amir Ban wrote:
>>>>>
>>>>>>The results can be disregarded on these grounds of course, but it's also true
>>>>>>that the results, as reported, can be dismissed as being in contradiction to the
>>>>>>DB/DT public record, and to common sense in general.
>>>>>
>>>>>Here are some ideas about what might have happened in those games:
>>>>>
>>>>>1) DB Jr may have beaten those programs purely through eval function
>>>>>superiority.
>>>>>
>>>>>2) It may have won because of superior search.
>>>>>
>>>>>3) There may have been a poor comparison between node rates, resulting in DB Jr
>>>>>having a massive hardware advantage.
>>>>>
>>>>>4) The whole thing may be fictitious.
>>>>>
>>>>>5) Random chance.
>>>>>
>>>>>6) Something I haven't thought of yet.
>>>>>
>>>>>Bob may go nuts because I included #4. I don't believe that #4 is true, but
>>>>>someone can always claim that it is, and there is no obvious evidence that can
>>>>>be used to refute this claim, which disadvantages us who want to understand this
>>>>>rather than argue religion and conspiracies all day.
>>>>>
>>>>>#1 is what we are expected to believe; I thought that is what this test was
>>>>>supposed to measure. I have a very hard time with this one. I don't believe
>>>>>there are any terms that in and of themselves would result in such a lopsided
>>>>>match. I don't believe that I could set up my program to search exactly a
>>>>>hundred million nodes per search, and play it against the best eval function I
>>>>>could possibly write, also searching a hundred million nodes per search, and
>>>>>score 38-2.
>>>>
>>>>
>>>>I totally agree with you here.
>>>>
>>>>
>>>>
>>>>>Could I be convinced that #1 is true? You bet! Will I accept that #1 is true
>>>>>based upon faith in the reputations of Hsu and Campbell? With all due respect,
>>>>>not a chance. I don't think anyone should be expected to be so trusting in a
>>>>>field that's even remotely scientific.
>>>>>
>>>>>It would also be hard to accept #2, since DB is supposedly not optimized for
>>>>>short searches. And I believe that you've ruled out #5, which seems a sensible
>>>>>thing to do.
>>>>
>>>>
>>>>I haven't followed the discussion, but on short searches, how many plies deeper
>>>>do you need to compute to get a 38-2 result?
>>>>
>>>>My guess is that 3 plies and a bit of luck would do it easily. You need to be
>>>>100 to 200 times faster than your opponent to achieve this (less if you have a
>>>>decent branching factor, but DB does not).
>>>>
>>>>I think this is a very easy experiment to do.
>>>>
>>>>DB is definitely optimized for short searches, if you think about it.
>>>>
>>>>It has the best NPS of all time, and probably one of the worst branching factors
>>>>you can imagine, because of this crazy singular extension and the lack of null
>>>>move (or related) optimization.
>>>>
>>>>So I would say that compared to modern microcomputer programs it would perform
>>>>worse and worse as time control increases.
>>>>
>>>>
>>>>Maybe I missed something?
>>>>
>>>>
>>>> Christophe
>>>
>>>
>>>Their branching factor didn't look bad to me, knowing they don't do null-move.
>>>It seemed to stick between 5 and 6 most of the time, which is roughly normal
>>>for alpha/beta (it should average roughly sqrt(38) if there are 38 legal moves.)
>>
>>
>>Yes, I did not mean it was that bad, just worse than what you get with a
>>decent pruning system.
>>
>>Their branching factor might actually be worse in the very last plies (the ones
>>computed inside the chess chips) because they lack HT optimization there.
>>
>>Then the plies computed by software should have a better BF as they are using a
>>hash table there.
>>
>>So for a deep enough search the resulting BF might be as good as a good
>>alpha-beta ordered search can be, typically between 5 and 6 (and usually close to
>>5).
>>
>>However, I'm wondering about the singular extension stuff. As I understand it, the
>>cost of detecting singular moves is linear (would not increase the branching
>>factor, just add a percentage to the total search time), but the cost of the
>>extension itself definitely increases the branching factor (increases the search
>>time exponentially).
>>
>>Of course I have no idea if it would be worse, in terms of BF, than the set of
>>extensions microcomputers generally use.
>>
>>I think we can safely assume that their branching factor was above 5, and
>>probably significantly higher. And I did not even factor in the extra cost of
>>the parallel search.
>>
>
>
>we have enough data to calculate their branching factor, as we now have
>detailed output, i.e. from a few quick peeks at the round 1 DB log:
>
>iteration time
> 8 6
> 9 22
> 10 96
>
> 7 1
> 8 3
> 9 10
> 10 36
>
> 7 1
> 8 3
> 9 7
> 10 37
> 11 162
>
> 7 3
> 8 7
> 9 21
> 10 91
>
>So actually their branching factor seems fairly reasonable. Note that the
>depths are the 'software depths' reported in the log files, not the full depth
>which should be N+4 with the hardware stuff added on...
>
>They aren't great, but they aren't really bad.
If you consider that they were not using any pruning scheme and were (probably)
extending a lot, the numbers are incredibly good.
If this sample is representative of DB's branching factor, then I have a
problem. I worked *hard* on improving my move ordering back when my program was
not selective at all, and I never got close to this.
Maybe I have just not been very good at it, but as I understand it, nobody has
ever managed a BF as low as DB's with a full-width search and as many
extensions as they use.
Bob, don't you think that's strange?
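For what it's worth, the effective branching factor falls straight out of those
quoted times: the EBF at depth d is roughly time(d) / time(d-1). Here is a
throwaway C sketch of that calculation (my own code, nothing to do with the DB
software; the numbers are the ones from the round 1 log quoted above, and the
coarse timing makes the early ratios rough):

#include <stdio.h>

/* Iteration times quoted from the round 1 DB log above (units as in the log).
   A value of -1 separates the individual searches in Bob's excerpt. */
static const int times[] = { 6, 22, 96, -1,
                             1, 3, 10, 36, -1,
                             1, 3, 7, 37, 162, -1,
                             3, 7, 21, 91, -1 };

int main(void)
{
    int i, n = (int)(sizeof(times) / sizeof(times[0]));

    for (i = 1; i < n; i++) {
        if (times[i] < 0 || times[i - 1] < 0)
            continue;          /* don't compute a ratio across two searches */
        /* effective branching factor ~ time for depth d / time for depth d-1 */
        printf("EBF ~ %.1f\n", (double)times[i] / (double)times[i - 1]);
    }
    return 0;
}

On this small sample the ratios come out between roughly 2.3 and 5.3, mostly
well under 5, which is exactly why I say the numbers look incredibly good for a
full-width searcher.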
>>>I don't think it would do "worse and worse". Any more than any other program
>>>would. Although it might do worse as depth decreases depending on what they
>>>did in their eval.
>>
>>
>>With such a "high" branching factor, you can expect to end up doing worse in
>>terms of average ply depth than a low-BF program.
>>
>>Of course, with their NPS, they start with a huge advantage. But if you draw the
>>curve of ply depth versus time for both DB and a modern commercial program, you
>>can expect DB's curve to be eventually reached by the commercial's curve.
>
>
>yes.. although their search will necessarily have fewer errors, since they don't
>rely on things like Null-move or forward pruning. Whether that is a good thing
>or not is debatable. But they don't have much of the fail-high / fail-low
>nonsense I see regularly.
Actually, many programs prove every day that using an imperfect pruning scheme
is much better than no pruning at all. Yours included (I don't mean your
pruning is imperfect).
I see no reason to believe that this would change just because you are able to
search 4, 5 or even 10 plies deeper.
Is it really still debatable? Maybe it was 20 years ago, but now we all know
the answer...
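To be concrete about the kind of "imperfect pruning" I mean, here is a
bare-bones null-move sketch in C. It only illustrates the general recipe: the
Position type and every helper below are hypothetical placeholders, not Bob's
code, not mine, and certainly not anything from DB.

/* Minimal null-move alpha-beta (negamax, fail-hard) sketch. All types and
   helpers are hypothetical placeholders. */

typedef struct Position Position;

extern int  evaluate(const Position *pos);
extern int  in_check(const Position *pos);
extern int  generate_moves(Position *pos, int moves[]);
extern void make_move(Position *pos, int move);
extern void unmake_move(Position *pos, int move);
extern void make_null_move(Position *pos);
extern void unmake_null_move(Position *pos);

#define NULL_R 2                /* usual depth reduction for the null search */

int search(Position *pos, int alpha, int beta, int depth)
{
    int moves[256], n, i, score;

    if (depth <= 0)
        return evaluate(pos);   /* or drop into a quiescence search */

    /* Null-move pruning: let the opponent move twice in a row. If a reduced
       search still fails high, assume a real move would too and cut off.
       Unsound in zugzwang -- that is the "imperfect" part. */
    if (depth > NULL_R && !in_check(pos)) {
        make_null_move(pos);
        score = -search(pos, -beta, -beta + 1, depth - 1 - NULL_R);
        unmake_null_move(pos);
        if (score >= beta)
            return beta;
    }

    n = generate_moves(pos, moves);
    for (i = 0; i < n; i++) {
        make_move(pos, moves[i]);
        score = -search(pos, -beta, -alpha, depth - 1);
        unmake_move(pos, moves[i]);
        if (score >= beta)
            return beta;        /* fail-hard beta cutoff */
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}

The whole point is that cutoff after the reduced-depth null search: it is
theoretically unsound, but in practice it buys plies, which is what the
everyday evidence shows.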
>>That's what I meant by "doing worse and worse". I could have written "doing less
>>and less good".
>>
>>Maybe I'm wrong because the singular extension stuff would compensate for this,
>>and the pruning systems of commercial programs would lose important information
>>compared to a pure alpha-beta search. But I don't think so.
>>
>
>
>Impossible to say or predict. Hsu was a stickler for eliminating error,
>although he did mention that if he had done the PC version, he would have at
>least included the option as the code is trivial.
It is not that trivial, because you have to think about all the consequences. I
think most chess programmers spend a huge percentage of their time designing
and improving their selective algorithms. At least I do.
I did not really mean that micro programs could beat Deep Blue. My main point
is that not using any pruning scheme is somewhat... stup... Ahem... not really
optimal.
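And since singular extensions keep coming up, here is my rough reading of what
"detecting a singular move" amounts to, reusing the hypothetical placeholders
from the null-move sketch above. The margin, the depth reduction and the
function name are made-up assumptions; this is a sketch of the published Deep
Thought idea, certainly not how the DB hardware does it.

/* A move is treated as singular if, in a reduced-depth search, every
   alternative fails low against a window a margin below the best score. */
#define SING_MARGIN 50          /* margin in centipawns -- an arbitrary choice */

int is_singular(Position *pos, int moves[], int n,
                int best_index, int best_score, int depth)
{
    int i, score;

    for (i = 0; i < n; i++) {
        if (i == best_index)
            continue;
        make_move(pos, moves[i]);
        /* zero-width test: can this alternative reach best_score - margin? */
        score = -search(pos, -(best_score - SING_MARGIN),
                             -(best_score - SING_MARGIN) + 1, depth - 3);
        unmake_move(pos, moves[i]);
        if (score >= best_score - SING_MARGIN)
            return 0;           /* some other move comes close: not singular */
    }
    return 1;                   /* everything else is clearly worse: extend */
}

The detection pass is the roughly linear cost I was talking about earlier; the
exponential cost only appears once you actually extend the singular move and
search it deeper.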
>>My opinion is that Deep Blue is much stronger than micros just because it has a
>>huge NPS.
>>
>
>I think that is a big part of it. However, it is still capable of maintaining
>that huge speed advantage while doing anything they want in the eval. They
>were apparently quite interested in King Safety, and spent a lot of time
>working on that. At zero cost... where mine is significant...
>
>I think the micro matches were simply 'interesting' things to try. I did this
>several years ago with Cray Blitz vs Chess Genius on the fastest PC available
>in the 1993-4 timeframe. CB blew it out horribly even giving Genius a huge
>time handicap. And we had a long discussion about the games. Everyone agreed
>that the real problem was that CB had a pretty good king safety evaluation,
>and Genius had very little. And it kept getting attacked and shredded in every
>game. Chris Whittington even guessed the identity of the program, as I never
>told who. He guessed it after looking at the horrific attacks it was falling
>into...
>
>Hsu reported the same things about 1996-era commercial programs which definitely
>were light on king safety...
That can be part of the explanation. But in many openings there is no way to
build an overwhelming king attack, and you have to play positions where the
micros do not show such weaknesses.
38-2 cannot be explained only by king safety.
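Just to make the king safety point concrete: below is a toy pawn-shield penalty
in C, purely a hypothetical illustration of what even a cheap king-safety term
looks like. The array layout, the names and the 15-centipawn value are all made
up; it is not Crafty's, Genius's or DB's evaluation.

/* Toy king-safety term: penalize missing pawns in front of the castled king.
   own_pawn_on[file][rank] is a hypothetical occupancy table for the side being
   evaluated, with ranks counted away from that side's back rank. */
#define SHIELD_PENALTY 15       /* centipawns per missing shield pawn (made up) */

int king_shelter_penalty(int own_pawn_on[8][8], int king_file, int king_rank)
{
    int f, penalty = 0;

    for (f = king_file - 1; f <= king_file + 1; f++) {
        if (f < 0 || f > 7)
            continue;           /* king on the edge: no file beyond the board */
        /* no friendly pawn directly in front of the king on this file? */
        if (king_rank + 1 > 7 || !own_pawn_on[f][king_rank + 1])
            penalty += SHIELD_PENALTY;
    }
    return penalty;             /* subtracted from that side's evaluation */
}

A real king-safety evaluation is of course far more elaborate, which is exactly
why it is expensive in software and essentially free in DB's hardware.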
>>But if you present things like this, it's not very sexy.
>>
>>So Hsu decided to use a totally different approach than the micro's.
>>
>>By not using a good known pruning system and introducing a new extension scheme
>>of your own, you present yourself as being a pioneer. A genius that is so bright
>>that he has understood that what everybody else is considering as very good
>>(null move or similar recipes) is in fact rubbish. A guru that has invented a
>>bright human-like extension: the singular extension!
>>
>>In fact, Hsu is more a hardware and public relations genius than a technical
>>chess programmer genius.
>
>I wouldn't agree there at all. If you had ever met him and talked with him,
>that would not be a statement you would make...
At least he didn't show that he was a genius in chess programming. Was he? We
will never know.
I can't deny that designing the chips is a great achievement, and that Hsu is
certainly a wonderful hardware designer.
But I'm sorry, not using any pruning scheme, if it is really what they did,
sounds like a "political" decision. I cannot believe that Hsu is stupid enough
to really believe that DB plays better without pruning.
It is either a huge professional mistake or a deliberate public relations
choice. You guess.
>>He had so much power available, that he could afford spoiling it just to look
>>brighter than every other chess programmers.
>
>He didn't "just had so much". He spent years building something to give him
>that much power... that was part of the 'total package'...
OK, he had all this power as a result of his very hard work. So he deserved it.
But the total package could have been much better with a pruning scheme. The
thing that played against Kasparov was far from finished.
>>In my opinion, Deep Blue as it has been programmed is much weaker than what it
>>could have been. That is not to say it is not very strong.
>>
>
>I don't think even Hsu would disagree. They cut it _very_ close for both
>matches, not even having a month to play with the real chips each year. I
>am convinced it could be 100-200 points better with a couple of years of
>hard work at tuning/testing the full-blown hardware...
Certainly.
Christophe