Computer Chess Club Archives


Subject: Re: How to judge?

Author: Ed Schröder

Date: 01:38:07 12/28/99


>Posted by Vincent Diepeveen on December 27, 1999 at 15:48:18:
>
>>At least once a week I receive email in similar wordings. That's of course
>>very nice stuff to read but what puzzles me is Rebel's progress through the
>>years in this respect.
>>
>>I mean this: I am a 1800 player, very bad in tactics but with a positional
>>understanding of 2000, maybe a bit more. How to judge progress in Rebel's
>>positional understanding every time I add new chess knowledge?
>
>Statistically speaking, you should not even be near 1500, Ed.
>
>People who do not play chess actively hugely overestimate their
>chess strength and insight.
>
>I can't find you on any Dutch rating list,
>so my assumption is that you're one of those guys.

You may be right, you may also be wrong. When I was 14 or 15 I joined
a chess club and left after a few weeks because of all the
smoke (mainly cigars), and I never returned to a chess club. I played
several youth tournaments in The Hague (in the AEGON building), and
my estimated 1800 Elo rating dates from that time.


>Working on a chess engine sure doesn't improve playing strength,
>as you let the program solve stuff instead of solving it yourself.

Maybe this is true for you, but it isn't for me. I can certainly say my
positional understanding has improved through programming Rebel. My tactical
skills have gone down a lot because of laziness and lack of interest, of
course because I have a program for that: why torture my brain
when a simple mouse click will do the job?


>If i have worked on diep in the afternoon, then in the evening i play
>like big shit to be objectively measuring what happens...
>
>>Has Rebel improved in playing humans since versions 8, 9, 10 and now Rebel
>>Century? To answer this question precisely you have to realize that hardware
>>has improved too during the years and people tend not to play old versions
>>which makes it even more difficult to judge its progress.
>
>Very accurately said. Apart from that, another aspect should not
>be forgotten: I am now used to fighting against crafty on duals,
>misssilicon on a K6-3, and Hossa on its latest hardware.
>
>If I then play against an oldie, I will suddenly do a lot better than
>I would have done in the past.
>
>So where humans have adjusted to the stronger programs, advances in
>theory, and some other things, the program is still showing the same
>performance.

If I understand you right you say that "comp-comp" is your only criterion
to judge Diep's progress? I did the very same in my early days but changed
that way of testing after 4-5 years.


>>For some time I have used the following guideline to decide which version is best:
>>- test sets (about 1000 positions) 30% as a first impression.
>
>ECM+BK nowadays?

My database is about 85% positional positions and 15% tactical ones.
It contains about 20-25 ECM positions. I don't know what BK means.


>>- auto232 results (30%)
>>- my personal impression based on my own style and feelings (40%) this
>>includes the GM challenge games as well.
>
>>How do other programmers decide which version is best? And, maybe more
>>important, which criteria are involved?
>
>I carefully check what the evaluation prints in verbose mode
>in a position where the bugfixes to the patterns apply.
>
>Positions it played wrong in the past (5000 or so and growing
>nearly every day, but I only pick a few that I think apply).

I do the same, starting with the worst ones.


>Then it's released to my testers and depending upon their results and my
>findings i fix bugs in it and decide where to expand again.

You are a reasonable chess player, 2200 I believe. What is your main
criterion to judge a version?
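
The 30/30/40 guideline quoted above amounts to a weighted sum. Here is a minimal sketch, assuming each criterion has already been reduced to a score between 0 and 1. The function name, the dictionary keys, and the sample numbers are hypothetical and are not taken from Rebel's actual test procedure.

```python
# Weights from the quoted guideline:
# test sets 30%, auto232 results 30%, personal impression 40%.
WEIGHTS = {"test_sets": 0.30, "auto232": 0.30, "impression": 0.40}


def version_score(scores):
    """Combine per-criterion scores (each in 0..1) into one weighted figure."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical numbers for two candidate versions.
version_a = {"test_sets": 0.80, "auto232": 0.50, "impression": 0.60}
version_b = {"test_sets": 0.70, "auto232": 0.55, "impression": 0.70}

print(version_score(version_a), version_score(version_b))
```

With these made-up inputs, version B comes out ahead overall even though version A does better on the test sets, which is the situation such a weighting is meant to catch.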


>When talking about a non-linear change of search, however, I feel it's not
>so easy to decide.

Right, search solves many positional errors.


>Let's take for example last ply pruning. It's easy to make last ply pruning
>such that it does a lot better at testsets.
>
>But does it play better then?

Not in my opinion. It just scores better.
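
For readers unfamiliar with the term, one common form of last-ply pruning is futility pruning: at depth 1, a quiet move is skipped when the static evaluation plus a safety margin still cannot reach alpha. The sketch below illustrates that general idea only. The thread does not say which rule Diep or Rebel used, and the function name, parameters, and margin values are assumptions.

```python
def prune_at_last_ply(static_eval, alpha, margin, is_capture, gives_check):
    """Return True if a move at depth 1 may be skipped (futility-style rule).

    All values are in centipawns from the side to move's point of view.
    This is an illustrative rule, not the one used by any specific engine.
    """
    if is_capture or gives_check:
        # Tactical moves can change the evaluation sharply, so search them.
        return False
    # A quiet move is assumed unable to gain more than `margin`.
    return static_eval + margin <= alpha


# Hypothetical position: 1.5 pawns below alpha, margin of 1 pawn.
print(prune_at_last_ply(-150, 0, 100, is_capture=False, gives_check=False))
```

A smaller margin prunes more moves and reaches greater depth in the same time, but it also skips more quiet moves without searching them.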


>I find that hard to judge. I have simply thrown all forward pruning
>out of DIEP and feel a lot happier. It plays a lot better now,
>but has a much lower blitz rating on single-CPU machines at ICC;
>the advantage in playing strength is, in my opinion, basically
>due to evaluation bugfixes.

You are absolutely right about pruning. The main change from Rebel7 to
Rebel8 was a very narrow selective search resulting in deep ply depths.
To my surprise the thing topped the SSDF list by +65 (or so), while
Rebel had never been a serious candidate on SSDF. But I have also seen
the other side of the coin: the holes caused by selective search pruning
essential moves, and games lost because of positional blunders. I have changed
selective search a lot since version 8, fixing the holes and losing some
ply depth, but I got back a much more stable Rebel as a result.


>>I am also curious about opinions on whether Rebel Century is clearly better
>>than, let's say, Rebel8 when the subject is playing style, which is something
>>different from playing strength (my opinion and view).
>
>I personally feel Century is the same engine with a few more tactical
>extensions and a new book. So I see hardly any difference, considering that
>tactical test sets like ECM, which Rebel8 solved very badly, are
>not taken into account in my judgement of engine strength, as I found
>Rebel8 already anything but tactically weak.

Well..... Rebel8 did very well on SSDF. If it were so bad at tactics, it could
not have entered the SSDF list as no. 1 with +65 over no. 2.
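
For scale, the standard Elo expected-score formula turns a rating gap into an expected score per game. A minimal sketch follows. The function name is illustrative, and the 65-point gap is the approximate figure mentioned above, not an exact SSDF number.

```python
def expected_score(rating_diff):
    """Expected score (0..1) for the higher-rated side, standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# A lead of about 65 Elo corresponds to an expected score of roughly 0.59.
print(round(expected_score(65), 3))
```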

Maybe you can explain to me why ECM is so important for you? Do you use
all 600/700/800 positions or just a selection?

Ed




Last modified: Thu, 15 Apr 21 08:11:13 -0700
