Computer Chess Club Archives



Subject: Deep Blue--Part III

Author: Keith Ian Price

Date: 21:52:18 05/09/98


Sorry for the delay. Here is Part III of my Deep Blue Report:

8.There was a rather long thread going on about a month ago on rgcc,
concerning whether DB was an example of Artificial Intelligence. During
his presentation, Hsu gave his opinion on the subject. He stated that
chess is considered a game for intelligent people, and DB was able to
play the game against the best player in the world; therefore, it
could be argued that DB had passed a Turing test of sorts, albeit a
chess-specific Turing Test. However, Hsu continued, he did not think
that this constituted intelligence. He did not directly argue for this
conclusion, but he did show a cartoon that appeared just after the
match, in which Kasparov is playing Deep Blue and Kasparov's foot has
slipped under the plug for Deep Blue. Both have "thought balloons" in
the cartoon. DB's balloon shows a bishop with diagonal arrows; a rook
with horizontal and vertical arrows; and a King with short arrows going
in all directions, etc. Kasparov's balloon shows his foot lifted, and
the plug out of the socket. He said this represents the difference
between a chess-specific intelligence, and real intelligence. If DB were
losing, it would have no way to think of a different solution outside
the bounds to which it had been programmed.

9. One of the longest-running arguments on rgcc and CCC has been how
well micros might fare against Deep Blue. During the Deep Blue
excitement last year the news slipped out that there had been a match
between DB, Jr. on one side and Rebel 8 and Genius on the other. DB,
Jr. was supposed to have been slowed down somehow to match the PCs'
speed. I asked Hsu about this 10-game match, and he was quite familiar
with the results. He confirmed that there had been 5 games against each
opponent. He stated that only one chess processor was used, and that
its clock speed had been halved. He also said that several pruning
algorithms, along with some selective extensions, were turned off in
order to emulate the performance of the micro hardware as closely as
possible. The idea was to compare against the micros at the level of
the evaluation itself: with the speed advantage reduced, the difference
would come down to what the chess-specific processor's evaluation could
accomplish, rather than to how many nodes were searched. Since a single
chess processor searches about 2-2.5 million nodes per second, and Hsu
estimated that removing the algorithms cut the nodes searched by a
factor of 5-10, the probable nps level for DB, Jr., with the halved
clock speed factored in, was somewhere between 100,000 and 250,000.
This is similar to the fast searchers, but was probably 2-5 times
faster than Rebel 8 at the time.
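
To make the arithmetic behind that estimate explicit (a back-of-the-
envelope check of my own, in Python; the inputs are Hsu's figures, the
arithmetic is mine):

    # DB, Jr. speed estimate from the figures quoted above.
    full_nps = (2_000_000, 2_500_000)   # one chess processor, full clock
    clock_factor = 2                    # clock speed halved
    prune_factor = (10, 5)              # 5-10x fewer nodes, pruning off

    low = full_nps[0] / clock_factor / prune_factor[0]    # 100,000 nps
    high = full_nps[1] / clock_factor / prune_factor[1]   # 250,000 nps
    print(f"{low:,.0f} to {high:,.0f} nodes per second")
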
In any case, I asked how the games went, and Hsu pulled no punches. He
said that the performance of the micros was much poorer than he had
imagined it would be: all 10 games were basically blowouts. When I
asked for specifics, he mentioned two examples against Rebel that had
surprised him as to how little understanding the micros had of endgames
and King safety. In the first example, the ending was with bishops of
opposite color and normally would have been a draw. Rebel allowed an
exchange which gave DB two widely separated passed pawns, and there was
no way to stop both. Rebel did not realize until a few moves later that
it was in trouble. Hsu said this was the kind of thing that is in his
evaluation routines, and he was surprised that it was not in Rebel's.
In the second example, DB sacrificed a Rook for a pawn next to Rebel's
King. After the exchange, Hsu reported, Rebel showed a 2+ pawn
advantage; DB showed a .5 pawn advantage. A couple of moves later, DB's
evaluation went up to a much higher advantage, while Rebel still showed
+2. After a few more moves, Rebel suddenly realized it was busted and
dropped its eval way down. Hsu thought this was due to a minimal King
safety evaluation. He did state that, even with this, he thought Rebel
had a much better understanding of positional play than Genius did.

I asked him if it was possible to get the scores of these games. He
said he did not want to release them, as he did not want to give any
help to future competitors. I mentioned that he had said the chance of
Deep Blue ever playing another match was almost nil, so there should
not be any future competitors. He responded that if he got the rights
to the chess processors, Rebel and Genius would likely be the future
competitors, and he wanted to leave his options open. I stated that
even so, once released, there would be thousands of games available
rather quickly, and that these 10 would not make much difference. He
said that he wasn't even sure the game scores had been saved. I
realized that he was not going to let them out, so I suggested that if
he found them he not erase them, as a lot of people were interested in
them, and I moved on.

10. Since we had been talking about evaluations and positional
understanding, I took the time at this point to bring up my current
favorite among the chess programs I have, Chess System Tal. I stated
that I was impressed with how much it accomplished at only 3000 nodes
per second searched, and said that if its evaluation could be run at a
much higher search rate, I thought it would be much better than the
other micro programs. I was surprised by the enthusiasm Hsu showed for
the program. He mentioned that its handling of King safety was much
more similar to Deep Blue's than that of the other micro programs,
although perhaps a little more extreme, and that he, too, was impressed
with it. He said that many things in CSTal were implemented in Deep
Blue, which I found strange, since CSTal wasn't released until after
the match. I didn't think of that until later, so I wasn't able to ask
him about it. Perhaps he meant implemented similarly, or perhaps he
meant ideas from CSTal's style of play as exhibited in games Thorsten
posted. Or maybe he had access to a beta version, or was referring to
Complete Chess System; I don't really know. I only mention this because
it gives a little insight into the approach used in the evaluation.
During the presentation Hsu stated that, unlike other chess programs,
DB's evaluation is not just a matter of adding weights and bonuses
together to arrive at a score: some functions were calculated
non-linearly, through multiplication or other "second level" methods.
When asked about this, he said that one example of a non-linear
evaluation was the method of calculating a pawn's value based on its
advancement and its position relative to other pieces and pawns, while
King safety was an example of what he had referred to as "second level"
methods. This came up in the question and answer section at the end of
the presentation, and since it wasn't my question, I could not ask him
to expand on these generalities.
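
To illustrate the distinction (the formulas and weights below are
entirely invented; Hsu gave no details of Deep Blue's actual terms), a
linear evaluation adds independent bonuses, while a non-linear term
lets them interact, so a far-advanced pawn with the defending King far
away is worth more than the sum of its parts:

    # Invented illustration of linear vs. non-linear pawn evaluation;
    # none of these weights or formulas come from Deep Blue.
    def pawn_value_linear(rank, king_distance):
        # linear style: independent bonuses simply added together
        return 100 + 12 * (rank - 2) + 4 * king_distance

    def pawn_value_nonlinear(rank, king_distance):
        # non-linear style: the advancement bonus grows quadratically
        # and the whole term scales with the defending King's distance
        advance = (rank - 2) / 5.0      # 0.0 on rank 2, 1.0 on rank 7
        return 100 * (1.0 + advance * advance) * (1.0 + king_distance / 7.0)

    print(pawn_value_linear(7, 6))      # 184
    print(pawn_value_nonlinear(7, 6))   # about 371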

11. There has been some question as to the endgame databases used
during the match. Hsu stated that there were 20 gigabytes of endgame
databases from Ken Thompson and Lewis Stiller on the hard drive. He
said that they included all of the five-man-and-fewer endings, plus
selected six-man endgame databases. To his knowledge they were never
accessed during the match, but he was not sure of this. He said that
since the chess processors have some of the endgame databases built in
(I have read that these are the 3-man set), he figured that the search
never got to the point where the SP2s would need to access the
hard-disk-based databases. He said that the databases were probably a
good psychological weapon: Kasparov knew they were there, and if he
made one wrong move during the endgame he would quickly look foolish in
front of millions of people, which would have to have an effect.

Differing reports about how many processors DB used were also cleared
up. Deep Blue employed 30 SP2 scalable processors. There were two
frames, each capable of holding 16 processors, but in each frame two
processors were tied together to form a master processor, which meant a
total of 30 instead of 32. Each SP2 had 16 chess processors attached,
for a total of 480 chess processors. Up until this point I had only
heard 256 or 512.
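
Just to make the count explicit (this restates the figures above in
code form; nothing here is new information):

    # Processor counts as reported: two frames of 16 SP2 processors,
    # with one pair per frame fused into a master, and 16 chess chips
    # attached to each SP2.
    frames = 2
    slots_per_frame = 16                        # 32 physical processors
    sp2_nodes = frames * (slots_per_frame - 1)  # each fused pair counts
                                                # as one: 30, not 32
    chess_chips = sp2_nodes * 16                # 480 chess processors
    print(sp2_nodes, chess_chips)
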
Hsu said that Deep Blue used "two-level parallelism" to process
positions. He described this as the master processor searching the
first 4 moves of a line, then sending the 1000 or so resulting
positions to the other SP2s, which would carry the search out for
another 4 moves and then turn the positions over to the chess
processors, which would go on for 4-5 more moves. He said that on
average DB would reach 30 ply in considering a move, but that in
certain cases, through selective extensions and pruning, it had reached
up to 70 ply, though this was rare.
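
In outline, the three levels fit together something like the toy sketch
below. This is my own reconstruction from Hsu's description, not his
design; the branching factor is kept tiny so the toy actually runs, and
minimax alternation and pruning are deliberately omitted, since the
point is only how the work is divided:

    import random

    def expand(position, plies, branching=3):
        # Pretend move generation: expand a position 'plies' deep and
        # return the frontier (positions as tuples of move indices).
        frontier = [position]
        for _ in range(plies):
            frontier = [p + (m,) for p in frontier
                        for m in range(branching)]
        return frontier

    def chess_chip(position):
        # Bottom level: a hardware chess processor searches the final
        # 4-5 moves at full speed; a random score stands in for its
        # evaluation here.
        return random.random()

    def worker_sp2(position):
        # Middle level: a worker SP2 carries the search 4 moves
        # further, then hands the frontier to its 16 chess processors.
        return max(chess_chip(p) for p in expand(position, plies=4))

    def master_sp2(root):
        # Top level: the master SP2 searches the first 4 moves (around
        # 1000 positions in the real system) and farms them out to the
        # other worker SP2s.
        return max(worker_sp2(p) for p in expand(root, plies=4))

    print(master_sp2(root=()))
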
It would process 200 million chess positions per second on average,
though this reached as high as 400 million in certain cases. The chess
processors made for the rematch were capable of processing 2-2.5
million nodes per second, and with the evaluation improved with Joel
Benjamin's help, and better selective search, the speed was improved by
3-10 times over the 1996 version. I asked how many cycles it took to
evaluate a position, and was told that it varied: there was a short
evaluation, used approximately 80% of the time, which took only one
cycle, and a long evaluation, used 20% of the time, which took 8
cycles. Move generation took 4 cycles.
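
Those figures make for an easy expected-cost calculation (the averaging
is my own arithmetic, not a number Hsu quoted):

    # Expected evaluation cost per node, from the cycle counts above.
    short_cycles, long_cycles = 1, 8
    p_short, p_long = 0.80, 0.20

    avg_eval = p_short * short_cycles + p_long * long_cycles  # 2.4
    movegen_cycles = 4
    print(f"average evaluation: {avg_eval} cycles per node")
    print(f"move generation: {movegen_cycles} cycles per node")
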
There were 8000 adjustable evaluation features, and these included such
things as the value of a rook on an unopened file which could later be
forced open with a pawn exchange or sacrifice. He said this was one
that was added with the help of GM Joel Benjamin, and he knew of one
instance during the match when it had an effect. (I have not looked
over the games to see where this would be; perhaps some helpful reader
with more time could find it.) It would be very interesting to know how
these evaluations are performed in hardware, but I am not sure that
this will ever be covered, especially if Hsu is really thinking of a
commercial version of the program. Since he also mentioned that he
would be interested to see whether a single-chip chess machine could
someday be created to beat the world champion, he may not be as
forthcoming about his research as one would hope.

12. Hsu evidently had difficulty in convincing the rest of the team to
switch to a redesigned chess processor between the match and rematch.
Since the lead time for a chess processor was normally a year for
design, testing, and debugging, and since they only had a year and
three months until the rematch, they were more interested in tweaking
the
program in the SPs, and leaving the chips alone. Hsu said he worked for
6 months, 70-100 hours per week, redesigning the chess processors. When
he had them ready and began testing how well they performed relative to
the older chips, the difference was so great that the rest of the team
quickly agreed to switch to the new processors, and the work continued
from there.

Well, there was more, but this concludes the report for this forum. Most
of the rest is anecdotal, and not so informative, so I will stop here. I
hope it was interesting.

kp


