Computer Chess Club Archives


Subject: Re: Crafty Static Evals 2 questions

Author: martin fierz

Date: 02:24:50 02/27/04


On February 26, 2004 at 23:14:22, Robert Hyatt wrote:

>On February 26, 2004 at 17:54:09, martin fierz wrote:
>
>>On February 26, 2004 at 13:17:50, Robert Hyatt wrote:
>>
>>>On February 26, 2004 at 06:59:37, martin fierz wrote:
>>>
>>>>On February 25, 2004 at 12:30:38, Robert Hyatt wrote:
>>>>
>>>>>On February 25, 2004 at 12:09:16, Daniel Clausen wrote:
>>>>>
>>>>>>On February 25, 2004 at 10:52:27, Robert Hyatt wrote:
>>>>>>
>>>>>>>On February 25, 2004 at 05:56:16, martin fierz wrote:
>>>>>>
>>>>>>[snip]
>>>>>>
>>>>>>>>i don't know whether i should believe the eval discontinuity thing. i know
>>>>>>>>somebody recently quoted a paper on this, but it's just a fact: exchanging any
>>>>>>>>pieces necessarily changes the evaluation. sometimes not by very much. big
>>>>>>>>changes are usually the exchange of the queen, the exchange of the last rook,
>>>>>>>>the exchange of the last piece. these eval discontinuities are *real*. i don't
>>>>>>>>believe in smoothing them out. perhaps if you write an eval with
>>>>>>>>discontinuities it's harder to get it right that everything fits in with each
>>>>>>>>other, and that's why it's supposed to be bad?!
>>>>>>>
>>>>>>>No.  When you have a discontinuity, you give the search something to play with,
>>>>>>>and it can choose when to pass over the discontinuity, sometimes with
>>>>>>>devastating results.
>>>>>>
>>>>>>The arguments of you two could be combined to this:
>>>>>>
>>>>>>   Eval discontinuities are _real_ but it hurts the search too much and
>>>>>>   therefore it's better to be a tad less realistic in eval here in order
>>>>>>   to get maximum performance out of the search+eval.
>>>>>>
>>>>>>
>>>>>>Does that make any sense?
>>>>>>
>>>>>>Sargon
>>>>>
>>>>>
>>>>>That is not quite the issue.  Consider the following X-Y plot of your
>>>>>eval function (Y axis) against some positional component (X-axis):
>>>>>
>>>>>   |
>>>>>   |
>>>>>   |
>>>>>   |      *
>>>>> E |* * *   * *
>>>>> V |            * *
>>>>> A |                * *
>>>>> L |
>>>>>   |
>>>>>   |
>>>>>   |
>>>>>   |
>>>>>   |
>>>>>   |
>>>>>   |                    * * * * * * * * * * * * * * * * * * * * * * *
>>>>>   |_________________________________________________________________
>>>>>                   some feature you are evaluating
>>>>>
>>>>>Notice the sudden drop to zero.  If you start off in a position where the score
>>>>>is non-zero for this term, and you can search deep enough to drive over the
>>>>>"cliff" for this term and hit zero, strange things happen.  The search can use
>>>>>this as a horizon-effect solution to some problem.  And it will be able to use
>>>>>that sudden drop (when something goes too far) as opposed to the big bonus just
>>>>>before it goes too far, to manipulate the score, the path, the best move, and
>>>>>possibly the outcome of the game.
>>>>>
>>>>>This is what Berliner's paper was about.  I suspect that anybody that has worked
>>>>>on a chess engine for any length of time has run across this problem and had to
>>>>>solve it by smoothing that sudden drop so that there is no "edge condition" that
>>>>>the search can use to screw things up.
>>>>
>>>>another reason for not believing this stuff: your above graph shows *exactly*
>>>>what happens when you go from a non EGTB position to an EGTB position (or, for
>>>>that matter, what happens when you go into any position your program can
>>>>recognize as a draw whether it has tablebases or not): your eval thinks it's
>>>>doing great, but the exchange of something leads to a drawn position in your
>>>>tablebases. are you going to claim that crafty plays better without TBs?
>>>>:-)
>>>
>>>Nope, not the same thing.  EGTB info is _perfect_.  The eval is not.
>>
>>why did i know you would say that? :-)
>>
>>i just don't believe it. perhaps the eval is not perfect, so what? if your
>>argument is correct, then there must be some threshold for the "degree of
>>correctness" for the eval discontinuity to work. if it's "correct enough", it
>>will work - like EGTB info which has 100% correctness. what makes you think
>>other eval terms cannot be "correct enough"?
>>
>>cheers
>>  martin
>
>All I can say is that _everybody_ has seen the effect.  It is well-known, and
>causes problems.  One example: just say "endgame starts here" at a specific
>material count, and watch what happens.  When you are right around that material
>level, you will see odd things happen, from making poor positional moves that
>lose the game, to avoiding making good moves, because the program either wants
>or doesn't want to "cross the bridge".  If you make the transition smoother,
>then there is no bridge to cross, just a small step at a time and you end up
>where you want without the new type of horizon effect problems a discontinuity
>causes.
>
>Of course, if you don't believe it, that is perfectly fine.  But I'll bet
>dollars to doughnut holes that one day you will say "hmm...  perhaps Bob (and
>many others) was actually right here..."  :)


looks like you have run out of arguments if you can't counter my
"level-of-correctness" thing with anything better than "everybody has seen the
effect"!
you are getting very close to "i have seen this and therefore it must be this
way" that you-know-who always uses :-)

to state that _everybody_ has seen this is obviously wrong, since you haven't
spoken to everybody, and least of all to the commercials, who have written the
best programs out there and don't reveal their tricks. one striking
example of a big eval discontinuity is published on ed schröder's page about
rebel: rebel goes from a complex king safety eval to ZERO king safety eval once
the queen goes off the board. now if that isn't an eval discontinuity, then i
don't know what an eval discontinuity is supposed to be. and it works for him,
it seems, and i think he's quite a trustworthy source of information when it
comes to chess programming. so are you of course, but it's not like everybody
seems to be in agreement with you here...
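
To make the rebel example concrete, here is a minimal sketch of a king safety term that is switched off the moment the enemy queen leaves the board, next to a variant that fades with the remaining attacking material. Everything in it is an assumption for illustration: the weights, the function names and the list-of-letters piece representation are hypothetical and are not taken from rebel, crafty or any other engine.

```python
# Hypothetical attack weights per enemy piece; not from any real engine.
ATTACK_WEIGHT = {"Q": 4, "R": 2, "B": 1, "N": 1}
MAX_ATTACK = 4 + 2 * 2 + 2 * 1 + 2 * 1  # Q + 2R + 2B + 2N = 12


def king_safety_hard(raw_penalty, enemy_pieces):
    """Full king safety term while the enemy queen is on the board, zero after."""
    return raw_penalty if "Q" in enemy_pieces else 0


def king_safety_scaled(raw_penalty, enemy_pieces):
    """Same term, scaled by the enemy's remaining attacking material."""
    attack = sum(ATTACK_WEIGHT.get(p, 0) for p in enemy_pieces)
    return raw_penalty * min(attack, MAX_ATTACK) // MAX_ATTACK


before_trade = ["Q", "R", "R", "B", "N"]  # attack weight 10
after_trade = ["R", "R", "B", "N"]        # queen gone, attack weight 6

for name, fn in (("hard", king_safety_hard), ("scaled", king_safety_scaled)):
    print(name, fn(60, before_trade), fn(60, after_trade))
```

With a raw penalty of 60, the hard version drops from 60 to 0 when the queen comes off, while the scaled version goes from 50 to 30. The scaled term still steps at the queen trade, only by less; whether the smaller step actually plays better is exactly what is being argued here.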

i'll reiterate my point of view: eval discontinuities *are* dangerous. they must
be well-tuned so that they don't produce unexpected and/or bad results. but they
are a real fact, well known to any human expert, and reflecting such a fact in
your eval can never be bad, if it is sufficiently correct. whatever
"sufficiently" means...

perhaps it's practically impossible to write a "sufficiently correct" heuristic
eval, because it gets very difficult to get the tuning right. but from a
theoretical point of view, if you accept that tablebases are good for a program
you have absolutely no chance in this argument!

cheers
  martin
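
For reference, the "endgame starts here" case from the quoted text can be sketched as follows: one function switches from a middlegame score to an endgame score at a fixed material count, the other blends the two in proportion to the material left. This is only a sketch under stated assumptions. The threshold, the material scale (non-pawn material in pawn units, 31 at the start) and the function names are hypothetical and do not describe how crafty or any other program does it.

```python
# Hypothetical constants, chosen only for illustration.
ENDGAME_THRESHOLD = 13  # "endgame starts here"
MAX_MATERIAL = 31       # Q(9) + 2R(10) + 2B(6) + 2N(6) for one side


def eval_hard(mg_score, eg_score, material):
    """Use the endgame score at or below the threshold, else the middlegame score."""
    return eg_score if material <= ENDGAME_THRESHOLD else mg_score


def eval_blended(mg_score, eg_score, material):
    """Interpolate linearly between the two scores by remaining material."""
    phase = max(0, min(material, MAX_MATERIAL))
    return (mg_score * phase + eg_score * (MAX_MATERIAL - phase)) // MAX_MATERIAL


# Trading a knight takes the material count from 16 to 13.
for name, fn in (("hard", eval_hard), ("blended", eval_blended)):
    print(name, fn(100, 0, 16), fn(100, 0, 13))
```

For a term worth 100 in the middlegame and 0 in the endgame, the hard switch jumps from 100 to 0 across that one knight trade, which is the "bridge" the search can choose to cross or avoid. The blended version moves from 51 to 41, so no single exchange changes the score by much.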



>
>>
>>
>>
>>
>>
>>
>>>  Giving the
>>>search another "horizon effect" opportunity begs for trouble.
>>>
>>>Do you realize why we all do the check, recapture, etc extensions?  To drive off
>>>horizon effects, correct?  But with an evaluation discontinuity, all the program
>>>has to do is to either try to push the transition over the horizon to avoid
>>>seeing it, or do everything possible to make it occur within the horizon, so
>>>that it gets the benefit.  And search extensions don't solve the problem because
>>>these "tricks" are not tactical in nature.
>>>
>>>If you don't believe they happen, fine.  Several of us that have tried it _know_
>>>that they do...
>>>
>>>There are some issues with egtbs, but they are only probed after captures, and
>>>since captures result in extensions, the likelihood of their causing mischief is
>>>lower, but not zero.  But with eval problems, the chances are way higher.
>>>
>>>Caveat Emptor...
>>>
>>>
>>>>
>>>>cheers
>>>>  martin


