Author: Robert Hyatt
Date: 13:41:37 11/25/97
On November 25, 1997 at 13:04:41, Chris Whittington wrote:

>On November 25, 1997 at 12:40:20, Robert Hyatt wrote:
>
>>On November 25, 1997 at 05:10:27, Chris Whittington wrote:
>>
>>>On November 25, 1997 at 02:29:28, Howard Exner wrote:
>>>
>>>>On November 24, 1997 at 13:26:16, Robert Hyatt wrote:
>>>>
>>>>>In light of my testing, I'd simply call this a "broken" test position and
>>>>>throw it out. Anything but the knight sac loses outright, and most programs
>>>>>that can reach reasonable depth see this. I'd bet Fritz finds it quite
>>>>>quickly as well. But the solution is wrong, because the goal of the test
>>>>>was to test knowledge to see if a program could recognize that this is a
>>>>>draw. To do so requires an evaluation of 0.00, not -3.something, because
>>>>>there are plenty of -3 positions that are still dead lost.
>>>>>
>>>>>The point here, then, is only to search deeply enough to see that this move
>>>>>is the only way to avoid scores of -4 and worse. I ran it on Cray Blitz and
>>>>>it found this in 8 seconds, and liked the knight sac from then on. But the
>>>>>score never went above -3.8 or so, although I only let it search to depth=21.
>>>>>It averaged about 9.7 million nodes per second for comparison, but never had
>>>>>a clue that this was drawn, just that it was playing the only move that didn't
>>>>>lose within its horizon. (I don't have the output in front of me, but believe
>>>>>it found the knight sac at depth=16 or perhaps 17. I can rerun it if this is
>>>>>important...)
>>>>>
>>>>>I don't count such "solutions" since I know that for every such lucky correct
>>>>>find, there are hundreds where such a knight sac only makes things easier for
>>>>>the opponent...
>>>>
>>>>Yes I agree about the knight sac could make things worse but does
>>>>that apply to the dynamics of this type of position, namely the
>>>>wrong bishop theme? What puzzles me on this position is that your
>>>>program and I assume others would avoid capturing the pawns as you
>>>>have noted. So the programs somehow "know" half the truth of this
>>>>draw. The other half would be to "know" that the captures are essential
>>>>to win.
>>>
>>>I agree. This is clearly a very interesting position that throws much
>>>light on knowledge/search debate.
>>>
>>>To 'throw it out' as Bob suggests is a travesty. Presumably allowing the
>>>one-eyed man to carry on being king in the land of the blind.
>>>
>>>But to give a 1 or a 0 for 'solving' it, is also a travesty.
>>
>>I say throw it out because it can be "solved" without being *solved*. That
>>is, Na5 is the only move that doesn't lose, when the search is fast enough
>>and deep enough to see why. So the right move is forced to avoid losing,
>>which is *not* a knowledge test at all. Now if we change the nature of the
>>solution so that the evaluation must be 0.00 (or whatever your normal draw
>>score is) then that would change things a lot. But as it is, it can be
>>solved but not really *solved*...
>
>Absolutely agreed. What's needed is a comment on the position to say
>that only Na5 with a draw evaluation would be accepted as a solution.
>Then the test moves on from being a materialistic 1 or -1, to some sort
>of quality test.
>
>In fact we'd need to adjust the idea of the 0.00 score, because a
>program could do, say:
>
>Na1 -4.00 150 secs
>Na5 fail high >= -3.4 200 secs
>Na5 0.00 350 secs
>
>where it would be apparent that the fail high at -3.4 was actually a
>solution. Gets complex, no ? :)
>
>Also my program doesn't score draws at 0.00. It can make them +ve or -ve
>at will, depending on circumstances. Generally I (or Thorsten) know when
>a draw score is output, but this tends to be intuitive.
>
>Chris Whittington

I also have a dynamic draw score.
In normal games it varies from X to X-1.00, where X is normally zero but can be
changed by the operator if the opponent is much stronger or much weaker. (I.e.,
playing on ICC, Crafty will set the base draw score to 0 for "equal" opponents,
or to +something for better opponents or -something for worse opponents. It then
tweaks this value based on the time remaining for both sides, so it will try to
avoid draws if the opponent is low on time, or try to take them if it is low on
time.) But for test suites, I generally force draw to 0.00 in all cases, so I
can figure out what is being seen...

There is one other issue for "correct" answers...

Na5++
Na5 -3.6
Na5++
Na5 -1.7
Na5++
Na5 0.00

Most would count the first Na5 as the time to solution, which is wrong. I would
count the last fail high, because the next score resolved is a draw. I've always
scored WAC like this (right answer, right eval) because a couple can be solved
by lucky evaluations that make the right move look a little better, but then a
deeper search shows that the right move wins lots of material...

Very confusing to compare results like this, of course.
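
For anyone who wants to score test-suite logs this way, here is a minimal Python sketch of the "right answer, right eval" rule described above. It is not Crafty code. The SearchLine record, the time_to_solution function, the tolerance value, and all timestamps in the usage note are hypothetical names and numbers chosen for illustration. It assumes each line of search output has already been parsed into a time, a move, a score, and a fail-high flag.

```python
from typing import NamedTuple, Optional, Sequence


class SearchLine(NamedTuple):
    """One parsed line of search output (hypothetical record layout)."""
    seconds: float            # elapsed time when the line was printed
    move: str                 # best move at that point, e.g. "Na5"
    score: float              # resolved score, or the bound on a fail high
    fail_high: bool = False   # True for lines such as "Na5++"


def time_to_solution(
    lines: Sequence[SearchLine],
    best_move: str,
    target_score: float = 0.0,
    tolerance: float = 0.005,
) -> Optional[float]:
    """Return the time credited for solving the position, or None.

    The position only counts as solved once the correct move is shown
    with a resolved score equal to the target (0.00 for a draw here).
    If that resolved score was immediately preceded by a fail high on
    the same move, the time of that fail high is credited instead.
    """
    for i, line in enumerate(lines):
        if line.fail_high or line.move != best_move:
            continue  # a bound or a wrong move is never a solution by itself
        if abs(line.score - target_score) > tolerance:
            continue  # right move, wrong eval: keep searching
        if i > 0:
            prev = lines[i - 1]
            if prev.fail_high and prev.move == best_move:
                return prev.seconds
        return line.seconds
    return None
```

With made-up times of 8, 9, 20, 25, 60 and 90 seconds for the six Na5 lines listed above, this credits 60 seconds (the last fail high) and not 8. With the Na1/Na5 example quoted from Chris, it credits 200 seconds. The sketch does not check that the move and score stay stable at later depths, which a real scorer would probably want.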