Author: Don Dailey
Date: 14:09:15 12/09/98
On December 09, 1998 at 15:07:05, John Coffey wrote:

>On December 09, 1998 at 06:36:02, Mark Young wrote:
>
>>Many of us have played over the games of Deep Blue vs. GM Kasparov, and Rebel
>>10 vs. GM Anand.
>>
>>My question is what do you think would be the stronger chess program, and by how
>>much:
>>
>>Deep Blue, or Rebel 10 (K6 450 MHz) * 1000
>
>
>My understanding is that Deep Blue is designed to be as fast as possible, so we can
>assume that its evaluation function is relatively simple. But slower programs
>often have to have a more complex evaluation to make up for the lack of speed.
>
>If by some miracle Rebel 10 could examine as many nodes as Deep Blue then it
>would probably win. This is not ridiculous nor impossible nor would it take
>a hundred years to happen (maybe.) If you look at how computers have gotten
>a thousand times or more faster over the last 20 years, then is it possible
>that they could get a thousand times faster over the next 20 years? We don't
>know the answer yet because we don't know if we will hit theoretical limits. If
>we do hit such limits then will we be able to find ways around them?
>
>Hang onto that Rebel 10 program. I will be curious just how it plays 20 years
>from now. I wonder if we will still have DOS 20 years from now? (Or Windows?)
>Both will probably require some sort of emulator to run.
>
>John Coffey

Hi John,

Deep Blue does not have a "simple" evaluation function by any means. It probably has at least as many terms as any micro, maybe more than any or most micros. But term count doesn't mean a whole lot when you don't know exactly what those terms are and what they do.

My program, for instance, has 202 terms in it; not very many, but it covers a whole lot of ground. 6 of these terms are basic piece values, and a few more are terms that modify these values depending on the situation. But it is very simple to add terms to your program. The only question is how this affects the strength of your program.
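For readers who have not seen one, an evaluation built from "terms" is often just a weighted sum of features. The sketch below is only an illustration of that idea, not the evaluation of any real program: the piece values, the `bishop_pair` term, and all weights are assumed numbers picked for the example.

```python
# Hypothetical illustration only: not the evaluation of any real chess program.
# Six basic piece-value terms, in centipawns (assumed values).
PIECE_VALUES = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}


def material(counts):
    """Sum of piece values for one side; counts maps piece letter -> number on board."""
    return sum(PIECE_VALUES[piece] * n for piece, n in counts.items())


def bishop_pair(counts):
    """Example of a term that adjusts piece values depending on the situation."""
    return 1 if counts.get("B", 0) >= 2 else 0


# Each term is a (weight, feature function) pair. Adding a term is one more
# line here; whether it helps depends on what it measures and on its weight.
TERMS = [
    (1.0, material),
    (30.0, bishop_pair),  # assumed bonus of 30 centipawns
]


def evaluate(white, black):
    """Score from White's point of view: weighted sum of term differences."""
    return sum(weight * (term(white) - term(black)) for weight, term in TERMS)


white = {"P": 8, "N": 1, "B": 2, "R": 2, "Q": 1, "K": 1}
black = {"P": 8, "N": 2, "B": 1, "R": 2, "Q": 1, "K": 1}
print(evaluate(white, black))  # material is equal, so only the bishop-pair term counts
```

Extending `TERMS` is trivial, which is the point: the list length says little about strength compared with the choice of features and the tuning of their weights.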
If all I had to do was add terms to make the program stronger, I would gleefully add hundreds of thousands of terms. The reason I bring this up is that several months ago, someone used Deep Blue's term count as evidence for how strong it must be. I don't remember or care who said this, but at the time I realized this is a horribly poor way to measure chess strength.

As far as evaluation is concerned, the first order of business is how well the weights of the terms are adjusted, followed closely by how well the terms are chosen (or what they actually measure). The least important factor is the actual number of terms used. You can of course play tradeoff games: 1000 weak terms might very well be equivalent to 100 solid, well-chosen terms. But it is NEVER bad to have extra terms (assuming you don't care about the slowdown, which of course is not an issue with Deep Blue); the only issue is how much good they are actually doing. The fact that Deep Blue has a huge number of terms is some evidence that it has a high-quality evaluation, and I believe it probably does have a very good evaluation. But it is not proof by any means.

About the issue of how much speedup is necessary for a micro (Rebel on a 450 MHz machine, for instance) to equal the current Deep Blue, it's a pretty open question. I have given my educated guess of at least 5x, but no more than 100x. This range reflects that no one really knows how good Deep Blue is. But we can certainly make an educated guess, and this is so much fun to do that it goes around on this group every few months.

Part of my "guess" is based on the fact that the version of Deep Blue that last played against micros lost a game to a 90 MHz Pentium and drew an endgame against a slightly faster Pentium. You cannot accurately measure chess strength based on a sample of games this tiny, but what you can do is start to build a reasonable upper bound. Deep Blue has been modified since then of course and is much stronger.
But the micros that competed in this tournament are also a factor of 4x or so better; they would finish near the bottom today in a similar tournament without more up-to-date hardware. This is not to mention that the software is a lot better too.

It was certainly a fluke that Deep Blue didn't do a lot better at this tournament; clearly it was best by a significant margin and was unlucky. But what this tournament revealed, in my opinion, was that the program at that time was not likely to be more than a couple of hundred rating points stronger than the other good programs of that day. It would be unlikely to score this poorly if you were to assume it was at least 300 rating points stronger than the BEST entry at the tournament. 300 points better than the AVERAGE entry, certainly.

Having said this, it's still just a lot of guesswork. It's possible that Deep Blue was hundreds of points better than Rebel and was the victim of an incredibly rare statistical anomaly. But throughout its illustrious history, although it has clearly dominated everyone else, it has from time to time taken a loss or a draw from a micro. It's not unbeatable, and we don't have to talk these ridiculous numbers (3 orders of magnitude, 1000x?) to get equivalence to powerful modern-day programs.

I would also like to suggest that when you talk 100X hardware improvement you are talking about 100X more memory too. If you speed up Rebel 1000X, the extra memory will be critically important to its performance.

- Don
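To put rough numbers on the rating-gap argument above: under the standard Elo model, the expected score per game (win = 1, draw = 0.5) depends only on the rating difference. The snippet below is a minimal sketch of that formula; it says nothing about the actual tournament results, and the function name `expected_score` is just a label chosen for this example.

```python
def expected_score(rating_diff):
    """Expected score per game for the stronger side under the Elo logistic model.

    rating_diff is (own rating - opponent rating) in Elo points.
    """
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))


# Expected per-game score at a few rating gaps.
for diff in (0, 100, 200, 300):
    print(diff, round(expected_score(diff), 3))
```

At a 300-point gap the model expects the stronger side to take roughly 85% of the points, so a loss plus a draw in a handful of games against micros is possible but uncomfortable for that assumption, which is the sense in which a tiny sample can still suggest an upper bound.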