Author: Vasik Rajlich
Date: 14:33:22 01/30/06
On January 29, 2006 at 21:46:40, Robert Hyatt wrote:

>On January 29, 2006 at 19:29:30, Vasik Rajlich wrote:
>
>>On January 29, 2006 at 12:07:52, Albert Silver wrote:
>>
>>>On January 29, 2006 at 11:55:59, Uri Blass wrote:
>>>
>>>>On January 29, 2006 at 10:03:02, Albert Silver wrote:
>>>>
>>>>>On January 29, 2006 at 07:12:15, enrico carrisco wrote:
>>>>>
>>>>>>Reminds me of Deep Thought -- using the hardware for the last N plies. This
>>>>>>type of tactical search works real efficiently to see danger from your opponent
>>>>>>but less efficient in finding chances for itself (ex: Genius.) Tactically it
>>>>>>makes it very strong but not so efficient in king attacks compared to Fritz or
>>>>>>Hiarcs. Hence, on test positions it does slightly worse (just like Fruit.)
>>>>>
>>>>>Would that really be the reason? As you probably know, one can significantly
>>>>>improve its ability with test suites, by simply increasing the 'Optimism' in the
>>>>>outlook.
>>>>>
>>>>> Albert
>>>>
>>>>Only on test suites where you need to fail high to find the move, and not on
>>>>test suites where you need to fail low.
>>>>
>>>>I think that a possible test of positional understanding is the following
>>>>test:
>>>>
>>>>1) Use unequal time control so the result of both programs is 50%
>>>>2) Take all the games where there is disagreement between the programs about the
>>>>question which side is better (both programs evaluate the position as at least
>>>>0.25 pawns advantage for itself for at least 3 consecutive moves).
>>>>
>>>>3) Calculate the result in the relevant games
>>>>
>>>>The program that scores better in these games probably has a better positional
>>>>understanding.
>>>>
>>>>Uri
>>
>>There is one issue.
>>
>>Let's say that I change Rybka's eval to return eval () + 200 centipawns. Rybka
>>will then get butchered in this test, but the overall program level would be
>>preserved and (I would argue) the positional level would be preserved as well.
>>
>>In other words, is an evaluation responsible for absolute accuracy, or accuracy
>>relative to other likely positions within the same search?
>
>This is an age-old argument. Personally, my goal has always been to simply
>evaluate better positions as bigger numbers than worse positions. With no
>effort spent to trying to tie "centipawns" to some sort of positional edge that
>a human would agree with. But then again, no human I know has ever said (and
>this includes world-class GMs down to patzers) white is .17 better in this
>position. :)
>
>Yes it would be nice that if a program says +.50, that we could look at the
>position and agree "white is about 1/2 pawn better". But in reality, all I care
>about is that my program picks the move that leads to the largest positional
>score, not assuming that the positional score has any direct correlation to some
>absolute value everyone understands...
>
>Of course, the opposite case can be made... "I want the program to analyze my
>games and tell me what is happening." And when one program says +1.6 and
>another says +3.6, what is wrong with them??? I've seen "happy programs" and
>"pessimistic programs". Yet both played good chess...
>

For this question, Uri's test would be perfect. I think though if we're honest, most of us care about the first question more :)

>
>>
>>>
>>>I think that's complicated. Suppose in a position Rybka thinks it is better by
>>>0.40 pawns, and Fritz thinks IT is better by the same amount. In the next 3-4
>>>moves, Rybka's evaluation goes up, so that it is 0.60 ahead, and Fritz goes down
>>>to 0.25. The game is hard fought, with no clear blunders after this and ends in
>>>a draw. Who was right?
>>>
>>
>>Both sides should get the same credit for this game from the test. Sample size
>>will eventually smooth out the "luck".
>>
>>Anyway, Uri's idea is in principle not bad. Everybody loves to talk about how
>>their program is full of "knowledge" - let's find some way to measure it.
>>
>>Vas
>
>Careful, you will open Pandora's box here... If you know what I mean...
>

That's what message boards are for :)

Vas

>
>
>>
>>>You might argue Rybka was more correct because its evaluation went up, and
>>>Fritz's went down, but what if the position had been simply equal, and Fritz had
>>>simply realized its 'advantage' wasn't what it thought it was.
>>>
>>>Now what if instead, Rybka had actually been right, and it had been better, but
>>>the best moves were not found to maintain or increase its advantage? You would
>>>need to do a lot of searching to find this, and in the end, all you might really
>>>find is that for that specific position, one engine was better than the other,
>>>and not a general qualitative positional comparison.
>>>
>>> Albert
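
For anyone who wants to try Uri's test, here is a minimal Python sketch of steps 2 and 3, plus the "+200 centipawns" objection. It assumes the games were already played at time controls that give a 50% overall result (step 1), that each engine's evaluation is logged in pawns from its own point of view, and that there is one pair of evaluations per move. The names Game, has_disagreement, disagreement_score and shift_a are made up for illustration and do not come from any engine or tool.

```python
from dataclasses import dataclass
from typing import List, Optional

THRESHOLD = 0.25  # pawns; the margin given in Uri's description
MIN_STREAK = 3    # consecutive moves; also from Uri's description


@dataclass
class Game:
    # Evaluations in pawns, each from that engine's own point of view
    # (positive means "I am better"). One entry per move. Assumed format.
    evals_a: List[float]
    evals_b: List[float]
    result_a: float  # 1.0 win, 0.5 draw, 0.0 loss, from engine A's side


def has_disagreement(game: Game) -> bool:
    """True if both engines claim an edge for MIN_STREAK consecutive moves."""
    streak = 0
    for ea, eb in zip(game.evals_a, game.evals_b):
        if ea >= THRESHOLD and eb >= THRESHOLD:
            streak += 1
            if streak >= MIN_STREAK:
                return True
        else:
            streak = 0
    return False


def disagreement_score(games: List[Game]) -> Optional[float]:
    """Engine A's average result over the games with a disagreement."""
    relevant = [g for g in games if has_disagreement(g)]
    if not relevant:
        return None
    return sum(g.result_a for g in relevant) / len(relevant)


def shift_a(game: Game, offset: float) -> Game:
    """Same game, but engine A reports eval + offset (the objection above)."""
    return Game([e + offset for e in game.evals_a], game.evals_b, game.result_a)


# Made-up sample data, only to show the mechanics.
games = [
    Game([0.30, 0.40, 0.35, 0.10], [0.30, 0.26, 0.50, 0.40], 1.0),
    Game([-0.50, -0.60, -0.40], [0.50, 0.60, 0.40], 0.0),
    Game([0.30, 0.10, 0.30, 0.30], [0.30, 0.30, 0.30, 0.30], 0.5),
]

plain = disagreement_score(games)
shifted = disagreement_score([shift_a(g, 2.0) for g in games])
print(plain, shifted)
```

In this made-up sample only the first game counts as a disagreement with the unmodified evaluations. Once engine A adds 2.00 pawns to everything, all three games count, including the one where A originally agreed it was worse, so A's score in the test drops even though the games themselves are identical.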
Last modified: Thu, 15 Apr 21 08:11:13 -0700