Author: Robert Hyatt
Date: 18:46:40 01/29/06
On January 29, 2006 at 19:29:30, Vasik Rajlich wrote:

>On January 29, 2006 at 12:07:52, Albert Silver wrote:
>
>>On January 29, 2006 at 11:55:59, Uri Blass wrote:
>>
>>>On January 29, 2006 at 10:03:02, Albert Silver wrote:
>>>
>>>>On January 29, 2006 at 07:12:15, enrico carrisco wrote:
>>>>
>>>>>Reminds me of Deep Thought -- using the hardware for the last N plies. This
>>>>>type of tactical search works real efficiently to see danger from your opponent
>>>>>but less efficient in finding chances for itself (ex: Genius.) Tactically it
>>>>>makes it very strong but not so efficient in king attacks compared to Fritz or
>>>>>Hiarcs. Hence, on test positions it does slightly worse (just like Fruit.)
>>>>
>>>>Would that really be the reason? As you probably know, one can significantly
>>>>improve its ability with test suites, by simply increasing the 'Optimism' in the
>>>>outlook.
>>>>
>>>> Albert
>>>
>>>Only on test suites where you need to fail high to find the move, and not on test
>>>suites where you need to fail low.
>>>
>>>I think that a possible test of positional understanding is the following
>>>test:
>>>
>>>1) Use unequal time controls so that the result between the two programs is 50%.
>>>2) Take all the games where there is disagreement between the programs about
>>>which side is better (both programs evaluate the position as at least
>>>0.25 pawns advantage for itself for at least 3 consecutive moves).
>>>
>>>3) Calculate the result in the relevant games.
>>>
>>>The program that scores better in those games probably has a better positional
>>>understanding.
>>>
>>>Uri
>
>There is one issue.
>
>Let's say that I change Rybka's eval to return eval () + 200 centipawns. Rybka
>will then get butchered in this test, but the overall program level would be
>preserved and (I would argue) the positional level would be preserved as well.
>
>In other words, is an evaluation responsible for absolute accuracy, or accuracy
>relative to other likely positions within the same search?

This is an age-old argument. Personally, my goal has always been to simply evaluate better positions as bigger numbers than worse positions, with no effort spent trying to tie "centipawns" to some sort of positional edge that a human would agree with. But then again, no human I know has ever said (and this includes world-class GMs down to patzers) "white is .17 better in this position." :)

Yes, it would be nice if, when a program says +.50, we could look at the position and agree "white is about 1/2 pawn better". But in reality, all I care about is that my program picks the move that leads to the largest positional score, without assuming that the positional score has any direct correlation to some absolute value everyone understands...

Of course, the opposite case can be made... "I want the program to analyze my games and tell me what is happening." And when one program says +1.6 and another says +3.6, what is wrong with them??? I've seen "happy programs" and "pessimistic programs". Yet both played good chess...

>
>>
>>I think that's complicated. Suppose in a position Rybka thinks it is better by
>>0.40 pawns, and Fritz thinks IT is better by the same amount. In the next 3-4
>>moves, Rybka's evaluation goes up, so that it is 0.60 ahead, and Fritz goes down
>>to 0.25. The game is hard fought, with no clear blunders after this and ends in
>>a draw. Who was right?
>>
>
>Both sides should get the same credit for this game from the test. Sample size
>will eventually smooth out the "luck".
>
>Anyway, Uri's idea is in principle not bad. Everybody loves to talk about how
>their program is full of "knowledge" - let's find some way to measure it.
>
>Vas

Careful, you will open Pandora's box here... If you know what I mean...

>
>>You might argue Rybka was more correct because its evaluation went up, and
>>Fritz's went down, but what if the position had been simply equal, and Fritz had
>>simply realized its 'advantage' wasn't what it thought it was.
>>
>>Now what if instead, Rybka had actually been right, and it had been better, but
>>the best moves were not found to maintain or increase its advantage? You would
>>need to do a lot of searching to find this, and in the end, all you might really
>>find is that for that specific position, one engine was better than the other,
>>and not a general qualitative positional comparison.
>>
>> Albert
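
For anyone who wants to experiment with the test Uri describes, here is a minimal sketch of steps 2 and 3 in Python. Everything in it is an assumption made for illustration: the `Game` record, the function names, and the three made-up games are hypothetical, and the evaluations are taken to be in pawns from each engine's own point of view. The 0.25-pawn threshold and the 3-move run come from Uri's description; the +2.00 shift is Vas's "eval () + 200 centipawns" objection.

```python
from dataclasses import dataclass
from typing import List, Optional

THRESHOLD = 0.25  # pawns, from step 2 of the proposed test
RUN_LENGTH = 3    # consecutive moves, from step 2 of the proposed test


@dataclass
class Game:
    # Hypothetical record: per-move evaluations in pawns, each from that
    # engine's own point of view (positive means "I am better").
    evals_a: List[float]
    evals_b: List[float]
    score_a: float  # 1.0 = A won, 0.5 = draw, 0.0 = A lost


def has_disagreement(game: Game) -> bool:
    """True if both engines claim >= THRESHOLD for RUN_LENGTH moves in a row."""
    run = 0
    for ea, eb in zip(game.evals_a, game.evals_b):
        if ea >= THRESHOLD and eb >= THRESHOLD:
            run += 1
            if run >= RUN_LENGTH:
                return True
        else:
            run = 0
    return False


def disagreement_score(games: List[Game]) -> Optional[float]:
    """Engine A's average score over the games where the engines disagreed."""
    relevant = [g for g in games if has_disagreement(g)]
    if not relevant:
        return None  # no disagreement games, so the test says nothing
    return sum(g.score_a for g in relevant) / len(relevant)


def with_offset(game: Game, offset: float) -> Game:
    """Same game and result, but engine A reports eval + offset."""
    return Game([e + offset for e in game.evals_a], game.evals_b, game.score_a)


# Made-up data, not real engine output.
games = [
    # Both engines like their own position early on; A goes on to win.
    Game([0.30, 0.40, 0.50, 0.60], [0.40, 0.35, 0.30, 0.10], 1.0),
    # A agrees it is worse, so there is no disagreement; B wins.
    Game([-0.30, -0.50, -0.90, -1.20], [0.30, 0.50, 0.90, 1.20], 0.0),
    Game([-0.30, -0.60, -1.00, -1.50], [0.30, 0.60, 1.00, 1.50], 0.0),
]

base = disagreement_score(games)
shifted = disagreement_score([with_offset(g, 2.00) for g in games])
print("A's score in disagreement games:", base)
print("Same games, A's eval shifted by +2.00:", shifted)
```

With this toy data only the first game counts as a disagreement at first, so A scores 100% on the test. After the constant shift, A "disagrees" in all three games and its test score drops to one win out of three, even though the moves played and the results are identical. A real version would need the actual game logs, and some care about which side's point of view each engine reports its score from.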
Last modified: Thu, 15 Apr 21 08:11:13 -0700