Computer Chess Club Archives


Subject: Re: How to judge?

Author: Ed Schröder

Date: 05:24:37 12/28/99


>Posted by Vincent Diepeveen on December 28, 1999 at 06:46:15:
>
>>You may be right, you may also be wrong. When I was 14-15 years old I became
>>a member of a chess club and left after a few weeks because of all the
>>smoke (mainly cigars) and never returned to a chess club. I played
>>several youth tournaments in The Hague (in the AEGON building) and
>>my estimated 1800 Elo rating dates from that time.
>
>Let's see. 30-40 years later nothing is left of such a rating, of course.
>Up to around 2200, missing tactics means you lose anyway.

This is not the point. The point is whether a programmer is still able to
understand the moves his creation plays. For that you don't have to
be a tactical magician, but some good positional skills will surely
help. I wasn't talking about winning / losing. And you are wrong about
my age too. My age is the elo difference between Rebel9 and Rebel8
estimated at ?? If less even better, you pick...

>>>Working on a chess engine sure doesn't improve playing strength,
>>>as you let the program solve stuff instead of solving it yourself.
>>
>>Maybe this is true for you but it isn't for me. I can certainly say my
>>positional understanding has improved through programming Rebel. My tactical
>>skills have gone down a lot because of laziness and lack of interest, of
>>course due to the fact that I have a program for that: why torture my brain
>>when a simple mouse click will do the job?
>
>I have yet to find the first exception to this. Especially someone who 'feels'
>he's still 1800 after 30 years.

You have a way with words Vincent.

>>>If i have worked on diep in the afternoon, then in the evening i play
>>>like big shit to be objectively measuring what happens...
>>>
>>>>Has Rebel improved in playing humans since version 8,9,10 and now Rebel
>>>>Century? To answer this question precisely you have to realize that hardware
>>>>has improved too over the years, and people tend not to play old versions,
>>>>which makes it even more difficult to judge its progress.
>>>
>>>Very accurately said. Apart from that, another aspect should not
>>>be forgotten: i am now used to fighting against crafty at duals
>>>and misssilicon at a K6-3, and Hossa at its latest hardware.
>>>
>>>If i then get against an oldie, i will suddenly do a lot better than
>>>i would have done in the past.
>>>
>>>So where humans have adjusted to the stronger programs, advances in
>>>theory, and some other things, the program is still showing the same
>>>performance.
>>
>>If I understand you right you say that "comp-comp" is your only criterion
>>to judge Diep's progress? I did the very same in my early days but changed
>>that way of testing after 4-5 years.
>
>No i didn't say that. The biggest eval bugs are usually found through human-comp games.
>When humans play for their rating they're deadly accurate in trying to find
>a way to beat it :)
>Now, in contrast to rebel, diep plays a couple of hundred games each
>week against humans.
>
>However, i must add that gross errors, though they are in all programs,
>aren't the only thing you want to solve.
>
>>
>>>>For a long time I have used the following guideline to decide which version is best:
>>>>- test sets (about 1000 positions) 30% as a first impression.
>>>
>>>ECM+BK nowadays?
>>
>>My database is about 85% positional based positions and 15% are about
>>tactics. There are about 20-25 ECM positions. Don't know what BK means.
>
>So you deny having tuned for BK (position 2... d4d5),
>and deny knowing what BK is?

I could guess, Bratko-Kopec? If so I don't use it.

>Like you never read JICCA, you never read 'computerschaak' and many
>other magazines?

Hardly.

>Hard for me to believe!

Then don't.

>>>>- auto232 results (30%)
>>>>- my personal impression based on my own style and feelings (40%) this
>>>>includes the GM challenge games as well.
>>>
>>>>How do other programmers decide which version is best? And, maybe more
>>>>important, which criteria are involved?
>>
>>>
>>>I carefully check what the evaluation prints in verbose mode
>>>in positions to which the bugfixes to the patterns apply.
>>>
>>>Positions it played wrong in the past (5000 or something and growing
>>>nearly each day, but i only pick a few which i think apply).
>>
>>I do the same. And the worst ones first.
>>
>>
>>>Then it's released to my testers and depending upon their results and my
>>>findings i fix bugs in it and decide where to expand again.
>>
>>You are a reasonable chess player, 2200 I believe. What is your main
>>criterion to judge a version?
>
>What you call 30% feeling is for me actually 100% feeling,
>as in the end i'm the judge of everything concerning DIEP.
>I must feel happy with how it plays, not the icc folks.

Cool.
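
The 30/30/40 guideline quoted above can be read as a weighted average. A minimal sketch follows, assuming each criterion is first reduced to a score between 0 and 1; that scale, the function name and the example numbers are illustrative assumptions, not something stated in the thread.

```python
# Weights taken from the guideline quoted in this thread:
# test sets 30%, auto232 results 30%, personal impression 40%.
WEIGHTS = {"test_sets": 0.30, "auto232": 0.30, "impression": 0.40}


def combined_score(scores, weights=WEIGHTS):
    """Weighted average of per-criterion scores.

    Each score is assumed to be on a 0..1 scale (an assumption made
    for this sketch; the thread does not say how scores are normalised).
    """
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted criteria")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical numbers for two candidate versions, purely illustrative.
version_a = {"test_sets": 0.70, "auto232": 0.50, "impression": 0.60}
version_b = {"test_sets": 0.80, "auto232": 0.50, "impression": 0.50}

best = max(("A", version_a), ("B", version_b),
           key=lambda item: combined_score(item[1]))[0]
```

With these made-up numbers version B does better on the test sets, yet version A comes out ahead because the personal impression carries the largest weight. Setting the impression weight to 1.0 and the others to 0 gives the "100% feeling" rule described in the reply above.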

>>>When talking about a non-linear change of search, however, i feel it's not
>>>so easy to decide.
>>
>>Right, search solves many positional errors.
>>>Let's take for example last ply pruning. It's easy to make last ply pruning
>>>such that it does a lot better at testsets.
>>>
>>>But does it play better then?
>>
>>Not in my opinion. It just scores better.
>
>We disagree here. it scores better in blitz. it doesn't play very
>well in slow games.

You misunderstood. It's what I said too. It scores better in tactical sets but
not in games.

>>>I find that hard to judge. I have simply thrown all forward pruning
>>>out of DIEP and feel a lot happier. It plays a lot better now,
>>>but has a way lower rating in blitz at single cpu machines at icc;
>>>the advantage in playing strength is, in my opinion, basically
>>>because of evaluation bugfixes.
>>
>>You are absolutely right about pruning. The main change from Rebel7 to
>>Rebel8 was a very narrow selective search resulting in deep ply depths.
>
>I had the impression you also added lazy evaluation to rebel8?

Lazy Eval (my own modified version of LE) has been in Rebel for ages; I
believe it goes back to Mephisto Polgar (or so).

>>To my surprise the thing topped the SSDF list with +65 (or so) while
>>Rebel had never been a serious candidate on SSDF. But I have also seen
>>the other side of the coin: the holes caused by selective search pruning
>>essential moves, and games lost because of positional blunders. I have changed
>>selective search a lot since version 8, fixing the holes and losing some
>>ply depth, but I got back a much more stable Rebel as a result.
>>
>>
>>>>I also am curious on opinions if Rebel Century is clearly better than let's
>>>>say Rebel8 when the subject is playing style which is something different
>>>>than playing strength (my opinion and view).
>>>
>>>I personally feel century is the same engine with a few more tactical
>>>extensions and a new book. So i see hardly any difference, considering that
>>>tactical testsets like ECM, which were solved very badly by rebel8, do
>>>not get taken into account in my judgement of engine strength, as i found
>>>rebel8 already anything but tactically weak.
>>
>>Well..... Rebel8 did very well on SSDF. If it were so bad at tactics it couldn't
>>have entered the SSDF list as no.1 with +65 over no.2.
>
>Right, when it came out it was tactically the best.

Disagree.
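
For a sense of scale, the +65 lead on the SSDF list mentioned above can be turned into an expected score per game. This is a rough sketch assuming the standard logistic Elo formula; how SSDF actually computes its ratings is not described in this thread, and the function name is made up for illustration.

```python
def expected_score(elo_diff):
    """Expected score per game for the higher-rated side.

    Uses the standard logistic Elo model, E = 1 / (1 + 10^(-d/400)).
    Assumption: the rating list behaves like plain Elo.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# A 65-point lead corresponds to roughly 59% of the points.
lead_65 = expected_score(65)
```

Under that assumption a 65-point gap means scoring about 59 points out of 100 games against the number two.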

>However your trick to abort double games with rebel8
>is hugely underestimated worldwide, especially
>if you combine that with a book which in some pathetic side lines
>which rebel plays well there are a lot of lines in the rebelbook
>itself.

Making new friends?

>>Maybe you can explain to me why ECM is so important for you? Do you use
>>all 600/700/800 positions or just a selection?
>
>It obviously is more important to you, as new rebel versions suddenly
>solve ECM positions real soon which rebel8 didn't solve within a
>minute or 10 at least. Diep's behaviour on ECM has hardly changed.

ECM is not important for me as it is only about tactics. I have enough of
these.

Ed

>
>Vincent




Last modified: Thu, 15 Apr 21 08:11:13 -0700
