Computer Chess Club Archives


Subject: Re: How to judge?

Author: Vincent Diepeveen

Date: 12:48:18 12/27/99


On December 27, 1999 at 14:57:09, Ed Schröder wrote:

>How to judge?
>
>>Posted by walter irvin on December 27, 1999 at 12:26:07:
>
>>rebel century is extremely strong ,i dont play it as much as some of the other
>>ones i have for 2 reasons .1.rebel never lets me get anything going unlike
>>hiarcs,fritz,crafty ect . vs them i feel like i have a chance till the end .2.
>>rebels style is devoid of brilliat moves or moves that ever leave the position
>>unclear .instead its like the program has a roll of duct tape and he slowly
>>but surely wraps you up till you have no options left .for me rebel is the
>>hardest to beat ,in fact so far its the only one i have not beat .
>
>At least once a week I receive email in similar wordings. That's of course
>very nice stuff to read but what puzzles me is Rebel's progress through the
>years in this respect.
>
>I mean this: I am a 1800 player, very bad in tactics but with a positional
>understanding of 2000, maybe a bit more. How to judge progress in Rebel's
>positional understanding every time I add new chess knowledge?

Statistically speaking, you are probably not even near 1500, Ed.

People who do not play chess actively hugely overestimate their
chess strength and insight.

I can't find you on any Dutch rating list,
so my assumption is that you're one of those guys.

Working on a chess engine sure doesn't improve playing strength,
as you let the program solve stuff instead of solving it yourself.

If I have worked on DIEP in the afternoon, then in the evening I play
like shit, objectively measured...

>Has Rebel improved in playing humans since version 8,9,10 and now Rebel
>Century? To answer this question precise you have to realize that hardware
>has improved too during the years and people tend not to play old versions
>which makes it even more difficult to judge its progress.

Very accurately said. Apart from that, another aspect should not
be forgotten: I am now used to fighting against Crafty on dual-CPU
machines, MissSilicon on a K6-3, and Hossa on its latest hardware.

If I then play against an oldie, I will suddenly do a lot better than
I would have done in the past.

So where humans have adjusted to the stronger programs, advances in
theory, and some other things, the program is still showing the same
performance.

>Since times I use the following guide-line to decide which version is best:
>- test sets (about 1000 positions) 30% as a first impression.

ECM+BK nowadays?

>- auto232 results (30%)
>- my personal impression based on my own style and feelings (40%) this
>includes the GM challenge games as well.
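
The 30/30/40 guideline quoted above amounts to a weighted average. A minimal
sketch, assuming each criterion has first been normalised to a 0..1 score; the
names and all the numbers for the two versions are invented for illustration
and are not Rebel results:

```python
# Weights taken from the quoted guideline: 30% / 30% / 40%.
WEIGHTS = {"test_sets": 0.30, "auto232": 0.30, "impression": 0.40}


def combined_score(scores):
    """Weighted average of per-criterion scores, each assumed to be in 0..1."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for two candidate versions (made-up numbers).
version_a = {"test_sets": 0.80, "auto232": 0.55, "impression": 0.60}
version_b = {"test_sets": 0.70, "auto232": 0.60, "impression": 0.70}

best = max(("A", version_a), ("B", version_b),
           key=lambda item: combined_score(item[1]))
print("preferred version:", best[0])
```

With these made-up numbers the version that is worse on test sets still wins
overall, because the personal impression carries the largest weight.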

>How do other programmers decide which version is best? and maybe more
>important which criteria is involved?

I carefully check what the evaluation prints in verbose mode
in positions to which the pattern bugfixes apply.

These are positions it played wrongly in the past (5000 or so, growing
nearly every day, but I only pick the few that I think apply).

Then it's released to my testers, and depending upon their results and my
findings I fix bugs in it and decide where to expand again.

When talking about a non-linear change to the search, however, I feel it's
not so easy to decide.
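
Re-checking a hand-picked subset of previously misplayed positions could be
sketched roughly as below. This is only an illustration: the position labels,
the moves, and the `choose_move` callable are hypothetical stand-ins, not
DIEP's actual tooling, and the sketch compares moves rather than verbose
evaluation output.

```python
def still_failing(positions, choose_move):
    """Return the labels of positions where the engine repeats its old bad move.

    positions   -- iterable of (label, old_bad_move) pairs
    choose_move -- callable mapping a position label to the engine's move
    """
    return [label for label, old_bad_move in positions
            if choose_move(label) == old_bad_move]


# Hypothetical subset picked from a larger collection of misplayed positions.
picked = [("pos-001", "Qxb2"), ("pos-002", "g4"), ("pos-003", "Kh1")]

# Stub standing in for the new engine version (invented answers).
new_version_moves = {"pos-001": "Rd1", "pos-002": "g4", "pos-003": "f3"}

failures = still_failing(picked, new_version_moves.get)
print("still played wrong:", failures)
```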

Let's take for example last ply pruning. It's easy to make last ply pruning
such that it does a lot better at testsets.

But does it play better then?

I find that hard to judge. I have simply thrown all forward pruning
out of DIEP and feel a lot happier. It plays a lot better now,
but has a much lower blitz rating on single-CPU machines at ICC.
The advantage in playing strength is, in my opinion, basically
due to evaluation bugfixes.
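
To make the last ply pruning trade-off concrete, here is a toy sketch of one
common form of it (futility pruning of quiet moves at depth 1) on a hand-built
tree. Everything in it is an assumption for illustration: the `Node` class, the
margin of 100, and all the scores are invented, and this is not DIEP's or any
other engine's code.

```python
from dataclasses import dataclass, field

INF = 10**9


@dataclass
class Node:
    static_eval: int                            # from the side to move's view
    moves: list = field(default_factory=list)   # (is_quiet, child Node) pairs


def search(node, depth, alpha, beta, margin=None, stats=None):
    """Fail-hard negamax alpha-beta; margin=None disables last ply pruning."""
    if depth == 0 or not node.moves:
        if stats is not None:
            stats["evals"] += 1
        return node.static_eval
    for is_quiet, child in node.moves:
        # Last ply pruning: skip a quiet move when even static eval plus the
        # margin cannot reach alpha.
        if (margin is not None and depth == 1 and is_quiet
                and node.static_eval + margin <= alpha):
            continue
        score = -search(child, depth - 1, -beta, -alpha, margin, stats)
        if score >= beta:
            return beta
        if score > alpha:
            alpha = score
    return alpha


# Invented tree: the second reply under n2 is a quiet move that is in fact
# strong for the defender, which the static evaluation does not see.
n1 = Node(-50, [(True, Node(60)), (True, Node(40))])
n2 = Node(-200, [(False, Node(90)), (True, Node(-100))])
root = Node(0, [(True, n1), (True, n2)])

full_stats, pruned_stats = {"evals": 0}, {"evals": 0}
full = search(root, 2, -INF, INF, None, full_stats)
pruned = search(root, 2, -INF, INF, 100, pruned_stats)
print(full, full_stats["evals"], pruned, pruned_stats["evals"])
```

On this tree the pruned search evaluates fewer leaves but reports a more
optimistic root score than the full search, because the skipped quiet move was
the refutation. Fewer nodes does not automatically mean better play.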

>I also am curious on opinions if Rebel Century is clearly better than let's
>say Rebel8 when the subject is playing style which is something different
>than playing strength (my opinion and view).

I personally feel Century is the same engine with a few more tactical
extensions and a new book. So I see hardly any difference. Tactical
test sets like ECM, which Rebel8 solved very badly, do not count in my
judgement of engine strength, as I found Rebel8 already anything but
tactically weak.

>Ed

Vincent



Last modified: Thu, 15 Apr 21 08:11:13 -0700
