Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Have You looked at Deep Blue logs?!

Author: Vincent Diepeveen

Date: 10:45:09 01/17/00

Go up one level in this thread


On January 17, 2000 at 12:42:51, Jeremiah Penery wrote:

>On January 17, 2000 at 11:13:44, Vincent Diepeveen wrote:
>
>>On January 17, 2000 at 09:32:40, Robert Hyatt wrote:
>>
>>>On January 17, 2000 at 03:11:34, Jouni Uski wrote:
>>>
>>>>Yes stunnigly they are available at:
>>>>
>>>>http://www.research.ibm.com/deepblue/watch/html/c.html
>>>>
>>>>First thing I notice is, that search depth is not very convincing at all -
>>>>mostly 11-12 ply (also in endgame part)! Have I read log wrong way? Most
>>>>Chessbase engines got same ply in standard PC...
>>>>
>>>>Jouni
>>>
>>>
>>>It is a "different" kind of ply.  DB didn't use null-move of any kind.  They
>>
>>You can also search a lot less deeply without alfabeta pruning.
>>Even with an incredible amount of nodes you hardly will come deeper than
>>a few plies then.
>
>This means nothing.  Null-move and other types of forward pruning or selective
>search can cause you to miss things.  Hsu didn't want DB missing anything this
>way. (And don't tell me there is _no_ position where Diep fails because of a
>nullmove, or some other pruning, problem.)

There is no position where DIEP fails because of nullmove. I am using
double nullmove, invented by myself personally. This doesn't miss *anything*.

There are however a few exceptional positions where DIEP needs a bigger
depth, and given the fact that zugzwang is very common in pawn endgames and
that you search deeply in pawn endgame anyway, i am not doing a nullmove
there either. Yet the double nullmove solves the zugzwang cases in a very
simple way. A single zugzwang which sometimes happens in endgames
is easily detected. A zugzwang in middlegame also.

Note that in order to do threat extensions in deep blue, the deep blue
team would have needed a nullmove to detect threats, which would have
burned them another few nodes. I guess they just got rid of that, hoping
that singular extensions with some expansions would detect nearly as much
as threat extensions. Can't blame them doing that.

However i do a nullmove anyway in DIEP, so i already have that threat score.
Therefore i can do cheaply threat extensions if i want to, where singular
extensions is a completely different case. Need expensive searches for that,
and the smaller your branching factor and efficiency of your search,
the more it hurts (note that this is of course also true for threat
extensions; detection of it may be cheap, but not execution of it).

Note that DIEPs endgame evaluation is extremely weak compared to its
middlegame evaluation. There i DO get the same effect when applying
nullmove at smaller depths it tends sometimes to do a few crap moves before
switching to the same move and the scores are a bit different. I don't
have this behaviour of DIEP in middlegame however.

First implementations of parallellism to diep also showed completely different
scores and moves it played compared to single search. That's also more stable
now.

I wonder anyway how to do nullmove in hardware. It would lead to huge
instability in search. When using the whole hardware setup as in deep blue,
then it's rather tough to make a nullmove such that
  a) within a given time period hardware processor returns result;
     a 6 ply search with nullmove and a good alfa and beta might be
     only a few couples of nodes, where a 4 ply search with bad luck for
     alfa and beta and bad luck in position anyway, the nullmove will only
     need a bit more nodes needing way more time than a 6 ply search.
  b) apply a qsearch only, you only can get a 4 ply search out of
     a chessprocessor!
  c) apply a n depth search in hardware processor.
  d) why apply nullmove if in 1991 the conclusion of the deep blue II team is
     that nullmove is not speeding them up enough?

This though that is a huge misinterpretation. Nullmove gives my program easily
3 plies of extra search at tournament conditions, and easily 5 plies of
extra search at analysis depths. In endgames it even gives more.
I remember searching in WCCC not seldom at a quad xeon to depths of 20 ply
in the endgame, even with passed pawn on the board (for example against
Eugen diep got to 20 ply in the endgame, when it was already too late btw
to turn the tie...).

>>>also used singular extensions which cause the typical chess ram to search
>>>1-2 plies less deep in the typical case, although it will probe much deeper
>>
>>Nah, singular extensions were introduced when chiptest searched up to 8 ply
>>at tournament level. From experiences at ICC server i can ASSURE everyone
>>that 8 ply is not enough to see even some simple tactical tricks.
>
>What are you talking about?  In the same program with SE vs. without SE, the
>version with SE will search a ply or two less in iteration depth, but it will
>search deeper in critical lines.  This is exactly what Bob just said.  I have no

>idea what you're talking about here...

First of all it's big BS that it searches in critical lines deeper.
It searches deeper in what the program THINKS are forced lines.
Chess is not a FORCED MOVE game though. Only when you can put me mate,
and when i am about to resign, you usually can see a bit faster in how many
moves i get mated. So game is basically already over then.

But the depth issue is hard to understand for most;
this is VERY hard for humans to imagine. What i am saying is:
If we all search 2 ply, and chiptest gets 3 ply
with extensions, chiptest will KICK BUTT,
because it sees how to capture the queen with a knight check and you will not.

If we all search 8 ply this will already be a lot less. Eval gets more
important, but still tactics RULE the chessgame. Things that are obvious
for some humans, are NOT obvious for chessprograms, like someone who tries
to put you mate...

...now if we all search more than 10 ply WITH extensions, and you with a bit
more extensions than i do, then it will all make no difference. Best program
will win in the eyes of the large crowd then.

Personally i've always guessed at a safe 12 to 14 ply, but 10 ply in
eyes of most programmers is already a very important depth here.

I think 12 ply is an important depth if you hardly use extensions.

Yet at ICC server i see clearly that 6 to 7 ply is not enough.
Chiptest got 8 ply at a time that most didn't even get near 7 ply.

>>SURE singular extensions gave a lot back then, as no one got to 8 ply then,
>>so everyone was suffering from major tactical problems.
>>
>>Those get away when slowly getting above 10 ply, which is near to 6 moves
>>with some extensions, as was already estimated by De Groot back in 1966,
>>as being the 'basic plan depth of a master'.
>>
>>Later when they had deep thought II, getting 10 ply with an incredible
>>number of nodes a second, then the deep thought team themselves CLAIMED
>>that singular extensions didn't give much anymore. Let me quote:
>>
>>"the main conclusion that must be drawn from the experiments onse lective search
>>reported in this article is that while selective search, in particular
>>domain-independant selective search, does indeed improve the performance of a
>>chessprogram, no single heuristic is able to contribute a large improvement.
>>Moreover, many heuristics or variations that might appear promising in
>>theory turn out to be of little or no value in practics".
>
>That's not saying anything about SE.  That paragraph is talking about null-move
>and other pruning methods (Hint: it uses the words "selective search". SE is not
>selective search.)

O yes. in this report they dedicated a long paragraph at singular extensions,
which is a form of selective search.

>>I can support to some extend this opinion. Nearly everything a human
>>does is extremely knowledge based. I'm never gonna look at a certain
>>forced line if my brain doesn't feel like wnating to look like it.
>>alfa nor beta have to do with this (as far as you can as human be aware
>>of any values, i'm not).
>>
>>Then conclusion goes on:
>>
>>"the total improvement of the best combination of heuristics discovered
>>so far over brute force is 72 points (86 USCF). Of this 49 points is
>>due to threat extensions, 7 points is due to singular extensions, 5
>>points due toe PV lookahead extensions, with the reaminder due to a number of
>>less important extensions".
>>
>>I could not assigne values to the worth of extensions,
>>and I'm missing recapture extensions btw. Note that i could never proof them
>>working in tournament games either, but in blitz they kick butt, just
>>like mating extensions.
>>
>>I agree however with the conclusion. Everything a human does is knowledge
>>based. If we use a bit of knowledge, either dynamic or static, then
>>this means that search is gonna miss a lot. obviously you need something
>>to see deep threats/checks a bit deeper, to not lose all games because
>>of a mate in 7.
>>
>>However, i feel only evaluation is important. My current version of
>>DIEP is tactical simply weak. However if i turn back on all extensions it
>>used to have, then it's gonna score like 300 points at least more at
>>a given testset than the current version (the current version does it really
>>weak on those testsets, only win at chess gets still solved with 299 out
>>of 300, which btw is bigtime better than chiptest with 500k nps did, though
>>evaluation based).
>
>What were Chiptest's results?

I'll quote:

  "Of the 299 positions out of 300 with solutions
  chiptest fails to find the correct move in 2 cases. In another one the
  correct the correct move is played but not for the right reason, the
  remaining 296 are solved within 3 minutes or less at 500k nodes a second.
  without singular extensions it solves 289 within 3 minutes."

I have no idea which position they referring to.
I don't know a single position out of 300 that isn't having a solution, though
some have subsolutions which are also just as good, or maybe one had
the wrong solution, and later was modified to the correct solution
before i got the file under my eyes.

>>I feel however that at slow level (where diep gets >= 10 ply)
>>score of diep is not gonna be different a lot. How much difference
>>is at blitz we'll see within some weeks/months at the icc server against
>>other programs and against humans.
>>
>>>in the traditional sense.  One other unknown is whether their 10 plies includes
>>>the 4 plies of hardware search.  They don't count total nodes that I noticed
>>
>>If they print '11' then they did get 11 ply brute force, which INCLUDES
>>the chessprocessors.
>
>No it doesn't.  Because when they print '3', it obviously cannot include the 4
>extra from the chess-processors.  Note that the PVs they give also don't include
>stuff from the chess-processors.
>
>>It's impossible to get fullwidth without hashtables at chessprocessors,
>>with the way parallellism was set up,
>>and without the chessprocessors communicating hash etcetera with each
>>other to get 15 ply fullwidth, WITH a load of extensions.
>
>How do you assume they could communicate with each other?  IIRC, there was no
>shared hash table among the chess-processors.  It was a message passing setup,
>which would make what you're saying WAY too slow.
>
>>Note that Hsu describes that a typical 12 ply search was first 4 ply
>>at processor 1, then 4 ply at 30 different processors, then 4 ply in hardware.
>
>But I don't think that's the way it was shown in the log.
>
>>>anywhere, so it isn't easy to decide.  For me, 1M nodes per second gives a
>>>depth of 13-14 in the middlegame.  Going 200M should drive that to roughly
>>>log3(200) which is about 5 plies deeper.  So call that 18-19 plies.  Removing
>>>null-move would subtract 2, so 16-17 plies.  Singular extensions 1-2 more plies,
>>>so that would be (14-15) to (15-16) plies.
>>>Somewhere on their web site they mentioned 14 plies as "normal".  So their
>>>iteration number might be slightly different from "ours".  IE when I search
>>>until depth is reduced to zero, I go to the quiescence search.  They might
>>>go to their hardware, which is not exactly a "quiescence search" since it looks
>>>at all moves for 4 plies, but doesn't do any of the singular stuff and so forth.
>>>I'll try to ask next time I hear something from Hsu.
>>
>>He implicitly answerred that in april this year, apart from that it's
>>fullwidth not possible given the way it was implemented with all the
>>conditions and extensions.
>>
>>Deep thoughtII searched many tens of millions nodes a second and got 10 ply.
>>deep blue was more NPS optimized, and got just a ply or at most 2 ply
>>deeper. very logical.
>
>DTII searched 750k NPS.  DB searched about 100M NPS, while DB2 does about twice
>that.  DTII also had a much smaller evaluation than DB/DB2, and I assume it
>extended less, as well.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.