Author: Keith Ian Price
Date: 21:52:18 05/09/98
Sorry for the delay. Here is Part III of my Deep Blue Report:

8. There was a rather long thread going on about a month ago on rgcc, concerning whether DB was an example of Artificial Intelligence. During his presentation, Hsu gave his opinion on the subject. He stated that chess is considered a game of intelligent people, and DB was able to play the game against the best player in the world, so it could therefore be argued that DB had passed a Turing test of sorts, albeit a chess-specific one. However, Hsu continued, he did not think that this constituted intelligence. He did not directly support his position, but did show a cartoon that appeared just after the match, in which Kasparov is playing Deep Blue and Kasparov's foot has slipped under Deep Blue's power plug. Both have "thought balloons" in the cartoon. DB's balloon shows a bishop with diagonal arrows; a rook with horizontal and vertical arrows; a King with short arrows going in all directions; and so on. Kasparov's balloon shows his foot lifted, and the plug out of the socket. Hsu said this represents the difference between a chess-specific intelligence and real intelligence: if DB were losing, it would have no way to think of a solution outside the bounds to which it had been programmed.

9. One of the longest-running arguments on rgcc and CCC has been how well micros might fare against Deep Blue. During the Deep Blue excitement last year, the news slipped out that there had been a match between DB, Jr. and Rebel 8 and Genius. DB, Jr. was supposed to have been slowed down somehow to match the PCs' speed. I asked Hsu about this 10-game match. He was quite familiar with the results. He confirmed that there had been 5 games against each opponent. He stated that only one chess processor was used, and that its clock speed had been halved.
He also said that several pruning algorithms were turned off, along with some selective extensions, in order to emulate the performance of the micro hardware as much as possible. They did this to see how DB, Jr. fared against the micros at the level of the evaluation itself, keeping the speed advantage down to the difference between what the micros could evaluate at their nps levels and what could be accomplished in the chess-specific processor evaluation, rather than how many nodes were searched. Since a single chess processor searches about 2-2.5 million nodes per second, and Hsu estimated that removing the algorithms cut the nodes searched by a factor of 5-10, the probable nps level for DB, Jr. was somewhere between 100,000 and 250,000 with the clock speed reduction factored in. This is similar to the fast searchers, but probably 2-5 times faster than Rebel 8 at the time. In any case, I asked how the games went, and Hsu pulled no punches. He said that the performance of the micros was much poorer than he had imagined it would be. He said all 10 games were basically blowouts. When I asked for specifics, he mentioned two examples against Rebel that had surprised him as to how little understanding the micros had of endgames and King safety. In the first example, the ending had bishops of opposite color and normally would have been a draw. Rebel allowed an exchange which gave DB two widely separated passed pawns, and there was no way to stop both. Rebel did not realize until a few moves later that it was in trouble. Hsu said this was the kind of thing that is in his evaluation routines, and he was surprised that it was not in Rebel's. The second example was where DB sacrificed a Rook for a pawn next to Rebel's King. After the exchange, Hsu reported, Rebel showed a 2+ pawn advantage. DB showed a .5 pawn advantage. A couple of moves later, DB's score went up to a much higher advantage, and Rebel still showed +2.
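The nps estimate above is simple arithmetic, and it checks out; a minimal sketch (the base speed, halved clock, and 5-10x slowdown figures are from the report itself, the function name is my own):

```python
# Back-of-the-envelope check of the DB, Jr. speed estimate reported above.
# Assumptions (from the report): a full chess processor searches
# 2.0-2.5 million nodes/sec, the clock was halved, and turning off the
# pruning algorithms cost a further 5-10x in nodes searched.

def djr_nps_range(base_low=2_000_000, base_high=2_500_000,
                  clock_factor=2, slowdown_low=5, slowdown_high=10):
    """Return the (low, high) estimated nodes/sec for the slowed-down DB, Jr."""
    low = base_low // clock_factor // slowdown_high    # worst case
    high = base_high // clock_factor // slowdown_low   # best case
    return low, high

print(djr_nps_range())  # (100000, 250000)
```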
After a few more moves, Rebel suddenly realized it was busted, and dropped its eval way down. Hsu thought this was due to a minimal King safety evaluation. He did state that even so, he thought Rebel had a much better understanding of positional play than Genius did. I asked him if it was possible to get the scores of these games. He said he did not want to release them, as he did not want to give any help to future competitors. I mentioned that he had said the chance of Deep Blue ever playing another match was almost nil, and so there should not be any future competitors. He responded that if he got the rights to the chess processors, Rebel and Genius would likely be the future competitors, and he wanted to leave his options open. I stated that even so, once a product was released, there would be thousands of games available rather quickly, and that these 10 would not make much difference. He said that he wasn't even sure the game scores had been saved. I realized that he was not going to let them out, so I suggested that if he found them, he not erase them, as there were a lot of people interested in them, and I moved on.

10. Since we had been talking about evaluations and positional understanding, I took the time at this point to bring up my current favorite among the chess programs I have, Chess System Tal. I stated that I was impressed with how much it accomplished while searching only 3,000 nodes per second. I said that if its evaluation could be run at a much higher search rate, I thought it would be much better than the other micro programs. I was surprised by the enthusiasm Hsu showed for the program. He mentioned that the way it handled King safety was much more similar to Deep Blue than the other micro programs, although perhaps a little more extreme, and that he, too, was impressed with it. He said that many things in CSTal were implemented in Deep Blue, which I found strange, since it wasn't released until after the match.
I didn't think of that until later, so I wasn't able to ask him about this. Perhaps he meant implemented similarly, or perhaps he meant ideas from CSTal's style of play exhibited in games Thorsten posted. Or maybe he had access to a beta version, or he was referring to Complete Chess System; I don't really know. I only mention it because it gives a little insight into the approach used in the evaluation. During the presentation Hsu stated that unlike other chess programs, DB's evaluation is not just a matter of adding weights and bonuses together to arrive at a score: some functions were calculated non-linearly, through multiplication or other "second level" methods. When asked about this, he said that one example of the non-linear evaluations was the method of calculating a pawn's value based on its advance and its position relative to other pieces and pawns; King safety was an example of what he had referred to as "second level" methods. This came up in the question and answer session at the end of the presentation, and since it wasn't my question, I could not ask him to expand on these generalities.

11. There has been some question as to the endgame databases used during the match. Hsu stated that there were 20 gigabytes of endgame databases from Ken Thompson and Lewis Stiller on the hard drive. He said that they comprised all of the five-man-and-under endings, plus selected six-man endgame databases. To his knowledge they were never accessed during the match, but he was not sure of this. He said that since the chess processors have some of the endgame databases built in (I have read that these are the 3-man set), he figured that it never got to the point where the SP2s would need to access the hard-disk-based databases.
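To illustrate the additive-versus-non-linear distinction Hsu was drawing, here is a purely hypothetical sketch. The feature names, weights, and formulas are invented for illustration; only the structural idea, multiplicative "second level" terms instead of a plain weighted sum, comes from the report:

```python
# Hypothetical illustration of linear vs. non-linear evaluation terms.
# All names and numbers here are invented; this is NOT Deep Blue's
# actual evaluation, just the shape of the idea described above.

def linear_eval(material, pawn_rank_bonus, king_attackers):
    # Linear style: every feature enters as an independent additive term.
    return material + pawn_rank_bonus + 10 * king_attackers

def nonlinear_eval(material, pawn_rank, king_exposure, king_attackers):
    # Non-linear style: a pawn's value scales with how far it has
    # advanced, and the attack term is multiplied by king exposure,
    # so the same attackers count for more against an open king.
    pawn_value = 100 * (1 + 0.15 * max(0, pawn_rank - 4))
    king_term = 10 * king_attackers * king_exposure
    return material + pawn_value + king_term
```

With the invented numbers above, a pawn on the 6th rank with three attackers against a doubly exposed king scores `nonlinear_eval(0, 6, 2, 3) == 190.0`, whereas the linear version cannot express that interaction at all.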
He said that it was probably a good psychological weapon for Kasparov to know that they were there, since if he made one wrong move during the endgame, he would know that he would quickly look foolish in front of millions of people, and this would have to have an effect.

Differing reports about how many processors DB used were also answered. Deep Blue employed 30 SP2 Scalable Processors. There were two frames, each capable of holding 16 processors, but in each frame two processors were tied together to form a master processor, which meant a total of 30 instead of 32. Each SP2 had 16 chess processors attached, for a total of 480 chess processors. Up until this point I had only heard 256 or 512. Hsu said that Deep Blue used "two-level parallelism" to process positions. He described this as the master processor evaluating the first 4 moves, then sending the 1000 or so resulting positions to the other SP2s, which would carry the search 4 moves further, and then turn the positions over to the chess processors, which would go on for 4-5 more moves. He said that on average DB would reach 30 ply in considering a move, and in certain cases, through selective extensions and pruning, it had reached up to 70 ply, though this was rare. On average it processed 200 million chess positions per second, reaching as high as 400 million in certain cases. The chess processors made for the rematch were capable of processing 2-2.5 million nodes per second, and with the evaluation improved with Joel Benjamin's help, and better selective search, the speed was improved by 3-10 times over the 1996 version. I asked how many cycles it took to evaluate a position, and was told that it varied. There was a short evaluation, used approximately 80% of the time, which took only one cycle, and a long evaluation, used 20% of the time, that took 8 cycles. Move generation took 4 cycles.
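From those cycle counts one can work out the expected per-node cost of the evaluation alone. A rough sketch; the 80/20 split and the cycle counts are from the report, the averaging is mine, and it ignores anything else the chip does per node:

```python
# Expected evaluation cost per node, from the cycle counts reported above:
# short eval (1 cycle) about 80% of the time, long eval (8 cycles) about 20%.
SHORT_CYCLES, LONG_CYCLES = 1, 8
MOVE_GEN_CYCLES = 4

# Work per 100 nodes, then divide, to keep the arithmetic exact.
avg_eval_cycles = (SHORT_CYCLES * 80 + LONG_CYCLES * 20) / 100
total_cycles = (SHORT_CYCLES * 80 + LONG_CYCLES * 20 + MOVE_GEN_CYCLES * 100) / 100

print(avg_eval_cycles)  # 2.4 cycles of evaluation per node on average
print(total_cycles)     # 6.4 cycles including move generation
```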
There were 8,000 adjustable evaluation features, and these included such things as the value of a rook on an unopened file which could later be forced open with a pawn exchange or sacrifice. He said this was one that was added with the help of GM Joel Benjamin, and he knew of one instance during the match when it had an effect. (I have not looked over the games to see where this would be; perhaps some helpful reader with more time could find out.) It would be very interesting to know how these evaluations can be performed in hardware, but I am not sure that this will ever be covered, especially if Hsu is really thinking of a commercial version of the program. Since he also mentioned that he would be interested to see if a single-chip chess machine could someday be created that beats the world champion, he may not be as forthcoming about his research as one would hope.

12. Hsu evidently had difficulty convincing the rest of the team to switch to a redesigned chess processor between the match and the rematch. Since the lead time for a chess processor was normally a year for design, testing, and debugging, and since they only had a year and three months until the rematch, the others were more interested in tweaking the program in the SPs and leaving the chips alone. Hsu said he worked for 6 months, 70-100 hours per week, redesigning the chess processors. When he had them ready and began tests to see how well they performed relative to the older chips, the difference was so great that the rest of the team quickly agreed to switch to the new processors, and they continued on from there.

Well, there was more, but this concludes the report for this forum. Most of the rest is anecdotal and not so informative, so I will stop here. I hope it was interesting.

kp