Computer Chess Club Archives



Subject: Re: Evaluation comparative test for Amateur Engines (PROPOSAL)

Author: Jaime Benito de Valle Ruiz

Date: 16:08:13 02/20/04


>It is interesting.
>
>But does comparing final eval-scores make sense, even with a normalized range?
>Programs, I mean search and eval, are designed too differently.
>There is no or only vague information (piece interaction) about all the zillions
>of aspects/patterns and weights of the evaluation, only a final eval result.
>
>Eval scores of complex positions with a lot of important aspects are difficult
>to interpret. E.g. king safety issues for both sides, passers, "active piece
>play", static/dynamic pawn structures, unbalanced material, weak/strong squares
>and their interactions ...
>
>I prefer to discuss some eval aspects, interactions and vague implementation
>hints with concrete positions from time to time...
>
>Programmers don't like to share all their eval tricks for obvious reasons ...
>
>Cheers,
>Gerd

You're right.

I'm not asking everyone to give away their "tricks", only to share some
figures and give everyone else a "stand-pat" baseline for their eval. If your
engine gave +2.00 in a position, but all the strongest commercial engines gave
something around -1.00 for the same position... wouldn't you at least be
curious why the difference is so great? My idea is not to give an
ultimate score for a list of positions, but to find an automated way to spot
"strange" disagreements in scores, and give you a chance to tweak your engine
more efficiently. I think that if we all contribute, we could easily come up
with a fairly big database of positions where lots of programmers can spot
serious flaws in their evaluation functions.
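
To make the idea concrete, here is a rough sketch of what such an automated check could look like. Everything in it is an assumption for illustration: the function name, the position labels, the 1.5-pawn threshold, and the choice of the median of the reference engines as the "consensus" score. Scores are in pawns from White's point of view.

```python
from statistics import median


def flag_disagreements(my_scores, reference_scores, threshold=1.5):
    """Return positions where my eval is far from the reference consensus.

    my_scores:        {position_id: score_in_pawns} for the engine under test
    reference_scores: {position_id: [score_in_pawns, ...]} from other engines
    threshold:        hypothetical cutoff in pawns; tune to taste
    """
    flagged = []
    for position, my_score in my_scores.items():
        refs = reference_scores.get(position)
        if not refs:
            continue  # no reference data for this position, nothing to compare
        consensus = median(refs)  # median is robust to one odd reference engine
        gap = my_score - consensus
        if abs(gap) > threshold:
            flagged.append((position, my_score, consensus, gap))
    # Largest disagreements first, so the most suspicious positions come on top.
    flagged.sort(key=lambda row: abs(row[3]), reverse=True)
    return flagged


# Made-up numbers mirroring the +2.00 versus -1.00 example above.
my_scores = {"pos_a": 2.00, "pos_b": 0.30}
reference_scores = {
    "pos_a": [-1.00, -0.90, -1.20],
    "pos_b": [0.25, 0.40, 0.10],
}
for position, mine, consensus, gap in flag_disagreements(my_scores, reference_scores):
    print(f"{position}: mine {mine:+.2f}, consensus {consensus:+.2f}, gap {gap:+.2f}")
```

With these made-up numbers only pos_a is reported. In practice the scores would come from running each engine's static eval on the shared list of positions.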

If there's a serious disagreement in any particular position, an interesting
thread can be started about it, no doubt.

Non-profit contributions and suggestions are more than welcome in this respect.
My engine is still far, far behind most of the ones here, and I'd really pay
to have information like this available to test my engine. I'm sure that
other people with much, much better engines than mine wouldn't mind giving it a
try, just in case.

Maybe we could refine the test, so that independent pieces of evaluation (such as
pawn structure, king safety, etc.) can be included separately.
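
As a rough illustration of that refinement, and assuming an engine can report its evaluation broken down by term (many cannot, and the term names below are invented), the same comparison could be run per component for a single position:

```python
def largest_component_gaps(my_terms, reference_terms):
    """Rank evaluation terms by how far my engine is from a reference.

    Both arguments map a term name (hypothetical, e.g. "king_safety") to a
    score in pawns for the same position. Terms missing from the reference
    are skipped, since there is nothing to compare them against.
    """
    gaps = {}
    for term, my_value in my_terms.items():
        if term in reference_terms:
            gaps[term] = my_value - reference_terms[term]
    # Biggest absolute difference first.
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up breakdown for one position.
my_terms = {"pawn_structure": 0.5, "king_safety": 1.5, "mobility": 0.25}
reference_terms = {"pawn_structure": 0.25, "king_safety": -0.5}
for term, gap in largest_component_gaps(my_terms, reference_terms):
    print(f"{term}: gap {gap:+.2f}")
```

This only makes sense if the engines being compared split their evaluation into roughly comparable terms, which is a strong assumption.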

I'm talking about improving our resources, not giving out secrets.

Please give any suggestions you think are relevant; I'm hoping to learn from
you... as well as to contribute in the future, if I can.

Regards,

  Jaime



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.