Author: Vasik Rajlich
Date: 16:22:38 12/13/05
On December 13, 2005 at 06:33:05, Kolss wrote:

>On December 13, 2005 at 03:16:20, Vasik Rajlich wrote:
>
>>On December 13, 2005 at 02:16:50, Chrilly Donninger wrote:
>>
>>>I experimented recently with a Shredder-style search in Hydra. The
>>>single-processor Shredder/Hydra completely demolished Shredder. If two programs
>>>are similar, the strength difference is enlarged. It's therefore a bad idea to
>>>tune a program against itself.
>>>But the Shredder/Hydra made only 40% against Rybka. Changing back to the
>>>standard Hydra search, it's between 75-80%. Rybka is regularly "killed" in
>>>king attacks. As noted before, these numbers are for single-processor Hydra. The
>>>PC program is running on a 3.2 GHz Pentium 4. Time control is 30 secs/move. A
>>>standard opening set similar to the Nunn openings is used.
>>>
>>>Changing the search is not only a tactical matter. The playing style is to a
>>>large extent also influenced by the search. If two moves are similar from the
>>>evaluation point of view, the program usually plays the one with the larger
>>>search tree. Or in other words: the lines which are extended. The Shredder/Hydra
>>>played over-aggressively, whereas the classical Hydra has the right dose.
>>>
>>>One conclusion of my experiment is: Rybka seems to be fairly tuned against
>>>Shredder. This is always the fate of the leader of the gang. In the future other
>>>programs will be tuned against Rybka and it will be much more difficult to stay
>>>on top.
>>>
>>>The experiment also shows that it is fairly easy to tune against one program.
>>>The problem is to find a solution which works against all.
>>>
>>>Chrilly
>>
>>Hi Chrilly,
>>
>>what you write is very interesting. I can add three things:
>>
>
>Hi Vas,
>
>>1) 99% of Rybka testing was against Shredder 9. It was more of a practical
>>thing, each previous version had a "rating" against Shredder. In the future we
>>will be more complete of course.
>
>That is very interesting (:-) and makes it all the more astonishing that you
>have managed to produce such a strong program!
>

I think the negative effect of this form of testing is pretty small. It's not
perfect, but it's not that bad either.

Vas

>>2) I have also seen the phenomenon where differences are exaggerated against an
>>older version. In particular, I have found that adding just one smaller but
>>sound eval term often gives fairly nice scores against the old version while
>>games against a completely different engine will sink into statistical noise.
>
>The same with Ikarus. I quite regularly produce versions which outscore the
>standard version by 30 or even 60 Elo points in direct matches by just doing
>some tweaking or adding or removing a term here or there. But in *most* cases,
>these versions do worse (often by about 50 Elo points!) against a standard set
>of different opponents.
>
>Self-play can give you a hint that you screwed something up completely, but it
>is no serious way to measure progress / improvement.
>
>Best regards - Munjong.
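
For anyone who wants to put the percentages and Elo figures quoted above on one scale, here is a rough sketch. It assumes the usual logistic Elo model, and the function names (elo_diff, score_margin) are just made up for this illustration. The error margin treats every game as a plain win or loss, which overstates the noise when there are draws, so read it as an upper bound rather than a proper statistic.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a match score (0 < score < 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_margin(score: float, games: int, z: float = 1.96) -> float:
    """Approximate half-width of a ~95% interval on the match score.

    Assumption: each game is an independent win/loss trial. Draws reduce
    the real variance, so this is an upper bound on the noise.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    return z * math.sqrt(score * (1.0 - score) / games)


if __name__ == "__main__":
    # Scores mentioned in the quoted post: 40%, and 75-80%.
    for s in (0.40, 0.75, 0.80):
        print(f"score {s:.0%} -> {elo_diff(s):+.0f} Elo")
    # Noise on a 100-game match scored around 50%.
    print(f"100 games: +/- {score_margin(0.5, 100):.1%} on the score")
```

Under this model a 30 Elo edge is only about a 54% score, while the margin on a 100-game match is close to 10 percentage points, which fits the remark about results against a different engine sinking into statistical noise.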