Computer Chess Club Archives


Subject: Re: How to judge?

Author: Vincent Diepeveen

Date: 13:01:24 12/27/99


On December 27, 1999 at 15:38:11, Dann Corbit wrote:

>On December 27, 1999 at 14:57:09, Ed Schröder wrote:
>[snip]
>How to know which is best?
>
>I think Dr. Hyatt's approach is a good one -- play a bazillion games on the net
>against quality opponents.  I see that Chris W. and Vincent D. have also
>followed this strategy.  Since the new improvements to Rebel also allow this

Partly true and partly wrong.

It would be dead wrong to conclude that I improve my program in order to play
better blitz.
A few blitz games do show bugs in the evaluation. However, I feel blitz is
not relevant for measuring my engine's strength these days.
The general problem with blitz games is that they go too fast.
I can't examine 100 games every day!

I feel standard rated is a lot more important. However, Bob's success is so
massive that there are hundreds of Crafties playing out there. So the rating
depends so much on how DIEP scores against the current Crafty version
that it's sometimes hard to draw conclusions. Some people running
DIEP at ICC set !computer for exactly this reason.

Secret has !computer but plays every computer unrated. Moron has
!computer only at blitz; otherwise, during the interesting hours, it would
be busy doing nothing but playing a thousand 3 0 games
against a dual Crafty.

Despite its allowing all computers at all levels
unrated (and rated at standard), to my big surprise not many operators, apart
from a few programmers/bookmakers, match their program against Moron. The
vast majority of operators seemingly only care about showing off,
as they usually pick the quickest level at which they can match DIEP
running under judgeturpin (which allows rated games against everyone, with no
rating limits; 1100-rated players sometimes fanatically play it a couple of games).

>kind of competition directly, I expect that you can gather a massive amount of
>data with free testers at will.  You can see how a change in Rebel performs
>against top computers.  You can see how a change in Rebel performs against top

Don't expect a single 40 in 2 game though, Ed, in case you're interested...
...ICC doesn't allow m-moves-in-t-time levels.

>humans.  I suggest you may write a parameter driven version of Rebel (or an
>engine that can write personalities to disk based upon a set of criteria) and
>then run one hundred games with the parameter at one setting, change the setting
>and run another hundred.  Using this sort of technique, you can find out what
>settings work best against various types of competition.  I think that will work
>very much better than your contest, since the attempts at producing good
>settings by others will be redundant and unscientific, for the most part.

I completely disagree here. 100 blitz games are not going to show much.
Apart from that, you depend on who you play against.
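
To put a rough number on how little 100 games show, here is a minimal sketch. It assumes the standard logistic Elo formula and a normal approximation for the error on the match score; the function names and the example result (40 wins, 30 draws, 30 losses) are made up for illustration and are not data from any real match.

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1), logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) Elo difference for one match.

    Assumption: games are independent and the score fraction is roughly
    normally distributed; z=1.96 gives an approximate 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win=1, draw=0.5, loss=0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the score fraction
    # Clamp so the logarithm stays defined for lopsided results.
    low = min(max(score - z * se, 1e-6), 1.0 - 1e-6)
    high = min(max(score + z * se, 1e-6), 1.0 - 1e-6)
    return elo_from_score(score), elo_from_score(low), elo_from_score(high)

if __name__ == "__main__":
    est, low, high = elo_interval(40, 30, 30)  # hypothetical 100-game match
    print(f"estimate {est:+.0f} Elo, 95% interval {low:+.0f} .. {high:+.0f}")
```

With a hypothetical 55% score over 100 games, the interval under these assumptions still includes zero, so such a match cannot tell a real improvement from noise. And this only covers statistical error, not the dependence on the opponent pool.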


>By using the net as a resource, you double your compute power.  By selling
>copies of Rebel that can use the net as a resource you multiply your compute
>power by the number of sales (e.g. you can gather a huge number of games from
>the net and calculate strengths and weaknesses against rated opponents and you
>don't even have to run them).
>
>Suggestion:
>Have Rebel automatically annotate the network games with settings information so
>that you can glean the effectiveness of various settings.


