Computer Chess Club Archives



Subject: Re: Different Hydra personalities against Rybka

Author: Zappa

Date: 19:30:26 12/13/05


On December 13, 2005 at 19:26:04, Vasik Rajlich wrote:

>On December 13, 2005 at 07:13:16, Joachim Rang wrote:
>
>>On December 13, 2005 at 07:06:35, Tord Romstad wrote:
>>
>>>On December 13, 2005 at 06:33:05, Kolss wrote:
>>>
>>>>The same with Ikarus. I quite regularly produce versions which outscore the
>>>>standard version by 30 or even 60 Elo points in direct matches by just doing
>>>>some tweaking or adding or removing a term here or there. But in *most* cases,
>>>>these versions do worse (often by about 50 Elo points!) against a standard set
>>>>of different opponents.
>>>
>>>You see this in *most* cases?  Strange.  I have *never* seen this happen.
>>>Like everybody else, I have often experienced that a version which seems to
>>>be much stronger in self-play matches is only a tiny bit stronger against
>>>other engines, and also that there is no measurable difference at all.  I
>>>have never seen the version which is stronger in self-play perform measurably
>>>*worse* against other programs, however.
>>>
>>
>>same here
>>
>
>In theory it could happen. If it were impossible, then all testing would be
>self-play (unless I'm missing something).
>
>>>>Self-play can give you a hint that you screwed something up completely, but it
>>>>is no serious way to measure progress / improvement.
>>>
>>>For me, self-play is useful as a first, quick test.  Because my experience
>>>is that self-play tends to exaggerate the difference between two program
>>>versions, it enables me to detect tiny improvements more rapidly.  If the
>>>results of my self-play matches are promising, I proceed to test against
>>>other engines.
>>>
>>>Tord
>>
>
>Self-play also gives you "denser" results - you're dealing with one uncertainty
>figure rather than two. It's a good way for example to make sure that a change
>you have a lot of confidence in doesn't have some unexpected effect.
>
>Vas
>
>>same here
>>
>>:-)

Wow, you guys test?  I never really keep track of results; I just play a game
now and again and fix moves that I don't like.

anthony
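
For anyone who does keep track: one way to read Vas's remark about "one
uncertainty figure rather than two" is that a self-play match measures the
difference between two versions directly, with a single error margin, while a
gauntlet measures each version separately against other engines and the two
error margins then combine. The sketch below is only an illustration of that
reading, not anything posted in this thread. It assumes the usual logistic Elo
model and a normal approximation for the score, and the function names
(elo_diff, match_elo, combined_margin) are made up for the example.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic model."""
    # Clamp so that 0% or 100% scores do not hit log(0) or division by zero.
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) in Elo for one match.

    Assumption: games are independent and the mean score is roughly normal,
    so z=1.96 gives an approximate 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game is worth 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    return elo_diff(score), elo_diff(score - z * se), elo_diff(score + z * se)


def combined_margin(margin_a, margin_b):
    """Margin on a difference of two independently measured ratings."""
    return math.hypot(margin_a, margin_b)


if __name__ == "__main__":
    # Hypothetical numbers, purely for illustration.
    est, low, high = match_elo(wins=70, draws=80, losses=50)
    print(f"self-play: {est:+.1f} Elo, interval [{low:+.1f}, {high:+.1f}]")
    print(f"two gauntlets at +/-20 each: +/-{combined_margin(20, 20):.1f}")
```

With the same number of games, the margin on the gauntlet difference comes out
larger than either single margin, which is one sense in which self-play results
are "denser". It says nothing about whether the self-play number carries over
to other opponents, which is the other half of the discussion above.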




Last modified: Thu, 15 Apr 21 08:11:13 -0700
