Computer Chess Club Archives

Subject: Re: Different Hydra personalities against Rybka

Author: Vasik Rajlich
Date: 16:22:38 12/13/05
On December 13, 2005 at 06:33:05, Kolss wrote:

>On December 13, 2005 at 03:16:20, Vasik Rajlich wrote:
>
>>On December 13, 2005 at 02:16:50, Chrilly Donninger wrote:
>>
>>>I experimented recently with a Shredder-style search in Hydra. The
>>>single-processor Shredder/Hydra completely demolished Shredder. If two programs
>>>are similar, the strength difference is enlarged. It's therefore a bad idea to
>>>tune a program against itself.
>>>But the Shredder/Hydra made only 40% against Rybka. Changing back to the
>>>standard Hydra search, it's between 75-80%. Rybka is regularly "killed" in
>>>king attacks. As noted before, these numbers are for single-processor Hydra. The
>>>PC program is running on a 3.2 GHz Pentium 4. Time control is 30 secs/move. A
>>>standard opening set similar to the Nunn openings is used.
>>>
>>>Changing the search is not only a tactical matter. The playing style is to a
>>>large extent also influenced by the search. If two moves are similar from the
>>>evaluation point of view, the program usually plays the one with the larger
>>>search tree, or in other words, the lines which are extended. The
>>>Shredder/Hydra played over-aggressively, whereas the classical Hydra has the
>>>right dose.
>>>
>>>One conclusion of my experiment is: Rybka seems to be fairly tuned against
>>>Shredder. This is always the fate of the leader of the gang. In the future other
>>>programs will be tuned against Rybka and it will be much more difficult to stay
>>>on the top.
>>>
>>>The experiment also shows that it is fairly easy to tune against one program.
>>>The problem is to find a solution which works against all.
>>>
>>>Chrilly
>>
>>Hi Chrilly,
>>
>>what you write is very interesting. I can add three things:
>>
>
>Hi Vas,
>
>>1) 99% of Rybka testing was against Shredder 9. It was more of a practical
>>thing: each previous version had a "rating" against Shredder. In the future we
>>will be more complete, of course.
>
>That is very interesting (:-) and makes it all the more astonishing that you
>have managed to produce such a strong program!
>

I think the negative effect of this form of testing is pretty small. It's not
perfect, but it's not that bad either.

Vas

>>2) I have also seen the phenomenon where differences are exaggerated against an
>>older version. In particular, I have found that adding just one smaller but
>>sound eval term often gives fairly nice scores against the old version while
>>games against a completely different engine will sink into statistical noise.
>
>The same with Ikarus. I quite regularly produce versions which outscore the
>standard version by 30 or even 60 Elo points in direct matches by just doing
>some tweaking or adding or removing a term here or there. But in *most* cases,
>these versions do worse (often by about 50 Elo points!) against a standard set
>of different opponents.
>
>Self-play can give you a hint that you screwed something up completely, but it
>is no serious way to measure progress / improvement.
>
>Best regards - Munjong.
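
For readers who want to relate the match percentages and Elo figures quoted in this thread, here is a minimal sketch using the standard logistic Elo formula. The function names `elo_diff` and `score_margin` are hypothetical and do not come from any engine or tool mentioned above. The error margin assumes independent games and treats the per-game variance as score * (1 - score), which ignores draws and is therefore a conservative upper bound.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a match score in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_margin(score: float, games: int, z: float = 1.96) -> float:
    """Approximate half-width of a ~95% interval on the match score.

    Assumption: games are independent and the per-game variance is
    score * (1 - score); draws reduce the true variance, so this is an
    upper bound rather than an exact figure.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    return z * math.sqrt(score * (1.0 - score) / games)


if __name__ == "__main__":
    # Scores quoted in the thread: 40% and 75-80% against Rybka.
    for s in (0.40, 0.75, 0.80):
        print(f"score {s:.0%} -> {elo_diff(s):+.0f} Elo")

    # How wide is the noise band for an even 100-game match?
    m = score_margin(0.5, 100)
    print(f"100 games at 50%: +/- {m:.3f} score, "
          f"about +/- {elo_diff(0.5 + m):.0f} Elo")
```

Under these assumptions, an even 100-game match carries an uncertainty on the order of 70 Elo, which is one way to see why a gain of 30 to 60 Elo measured in a short match can disappear into statistical noise.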
Last modified: Thu, 15 Apr 21 08:11:13 -0700