Author: Howard Exner
Date: 12:25:29 12/10/97
On December 09, 1997 at 16:17:35, Don Dailey wrote:

>These are fun to run but I don't think problem solution times tell you
>enough about the strength of a chess program. We discovered long ago
>that it is easy to make your program perform well on problem sets, but
>most of these "enhancements" weaken the program too much. A few simple,
>well known algorithms might make 3 or 4 of these problems solve an order
>of magnitude faster and would imply a huge improvement to the strength
>of the program, when in truth the program has been slightly weakened.
>Why is this so?
>
>I have a theory about this. Essentially, when you give some move
>extension you are dramatically improving the program's performance in
>a very limited subset of actual positions you will encounter in real
>chess play. BUT AT THE SAME TIME, you are weakening the program slightly
>in ALL positions where the extension does not provide additional insight,
>and this is MOST positions. So you are playing most of the game slightly
>weakened, hoping for a won position to show off your tactics (you must
>have a won position for any tactics to be successful.)

I think this is true. The game of chess offers diverse possibilities. Karpov often likes to quote "there are many roads that lead to Rome" when drawing an analogy to the many winning paths available in a chess game.

>I've talked to a lot of programmers about this and a lot of them don't
>seem to mind the 30 or 40 percent slowdown they get; they pass it off
>as being completely unimportant compared to finding checkmates or other
>flashy tactics. But I think it's critically important.
>Every move the program makes involves some battle for something, and
>that something might be just a minor positional thing, such as getting
>to castle a little earlier, or getting an extra square of mobility for
>the bishop, etc.
>It does not take much of a slowdown before a chess program starts losing
>a significant number of these "little skirmishes." Most observers,
>including the programmers themselves, will notice that their beloved
>program lost the war but never notice how many battles were lost. We
>simply rarely notice the little tactics of positional play, and if we
>do, we tend to believe that the program's evaluation function itself
>was in error.

Accumulation of small positional advantages is often a prelude to the tactical finish.

>Also keep in mind that most problem sets begin with unnatural positions,
>namely positions where one side already has a win and the game is
>essentially over if the program can find the "right" move.

Extrapolating chess strength from the quickness of solution times would probably say more about blitz strength than about overall strength. It is the same error as concluding that because a program is strong at blitz it will be the strongest at 40/2.

>There do seem to be some extensions that work pretty well that have been
>discovered. They are generally the type of moves that affect a really
>large percentage of positions (such as recaptures and checks.) There
>are some other techniques that can be applied in the quies search that
>can pick up problems several ply earlier in some cases, which I'm pretty
>sure Mchess leans heavily upon. But it's unclear to me if they really
>are a benefit. It seems clear that most programs differ quite
>significantly and they all choose different sets of tradeoffs. That's
>what makes it so much fun!
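To make the tradeoff concrete, here is a minimal sketch (not from either post) of the kind of check extension and horizon quiescence search being discussed: a plain negamax search that looks one ply deeper behind checking moves and resolves captures before evaluating. Every name in it (Position, Move, generate_moves, generate_captures, make_move, unmake_move, gives_check, evaluate) is a hypothetical placeholder for whatever a real engine provides.

    #define MAX_MOVES 256

    typedef struct Position Position;   /* engine-specific board state   */
    typedef int Move;                   /* engine-specific move encoding */

    /* hypothetical engine primitives, assumed to exist elsewhere */
    extern int  generate_moves(Position *pos, Move *list);
    extern int  generate_captures(Position *pos, Move *list);
    extern void make_move(Position *pos, Move m);
    extern void unmake_move(Position *pos, Move m);
    extern int  gives_check(const Position *pos, Move m);
    extern int  evaluate(const Position *pos);

    /* Capture-only search at the horizon, so the evaluation is never
     * taken in the middle of a capture sequence. */
    static int quiesce(Position *pos, int alpha, int beta)
    {
        int stand_pat = evaluate(pos);
        if (stand_pat >= beta)  return beta;
        if (stand_pat > alpha)  alpha = stand_pat;

        Move caps[MAX_MOVES];
        int n = generate_captures(pos, caps);
        for (int i = 0; i < n; i++) {
            make_move(pos, caps[i]);
            int score = -quiesce(pos, -beta, -alpha);
            unmake_move(pos, caps[i]);
            if (score >= beta)  return beta;
            if (score > alpha)  alpha = score;
        }
        return alpha;
    }

    static int search(Position *pos, int depth, int alpha, int beta)
    {
        if (depth <= 0)
            return quiesce(pos, alpha, beta);

        Move moves[MAX_MOVES];
        int n = generate_moves(pos, moves);

        for (int i = 0; i < n; i++) {
            /* The extension: checking moves are searched one ply deeper.
             * This is what finds deep mates in test positions, and also
             * what grows the tree in every other position. */
            int ext = gives_check(pos, moves[i]) ? 1 : 0;

            make_move(pos, moves[i]);
            int score = -search(pos, depth - 1 + ext, -beta, -alpha);
            unmake_move(pos, moves[i]);

            if (score >= beta)  return beta;   /* fail high: cutoff */
            if (score > alpha)  alpha = score;
        }
        return alpha;
    }

Every node pays for the gives_check() test and for the larger subtrees behind checking moves, which is where the across-the-board slowdown comes from; the payoff only shows up in the minority of positions where the extra ply behind a check actually matters.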
>I once noticed that the top rated programs on the rating list varied
>significantly in their tactical abilities. In some cases the best
>program was significantly worse (relative to the top few) in problem
>solving ability! Occasionally we've also seen exceptional problem
>solvers with mediocre ratings.
>
>There is a definite correlation between problem solving ability and
>chess strength, but it does not seem to be as strong as our intuition
>suggests it should be.

I'm beginning to believe that the only valid way of assessing computer chess strength is to rely on computer versus human play at 40/2. There are other indicators, but none as accurate as the traditional way we rate ourselves, namely tournament play. Too bad there are so few human versus computer events out there. Anyway, I felt I had to comment here since I agree with your summary above on test suites.