Author: Kolss
Date: 03:33:05 12/13/05
On December 13, 2005 at 03:16:20, Vasik Rajlich wrote:

>On December 13, 2005 at 02:16:50, Chrilly Donninger wrote:
>
>>I experimented recently with a Shredder-style search in Hydra. The
>>single-processor Shredder/Hydra completly demolished Shredder. If two programs
>>are similar, the strength difference is enlarged. Its therefore a bad idea to
>>tune a programme against itself.
>>But the Shredder/Hydra made only 40% against Rybka. Changing back to the
>>standard Hydra-search its between 75-80%. Rybka is regularily "killed" in
>>king-attacks. As noted before, this numbers are for Hydra-single-processor. The
>>PC-programm is running on a 3.2 MHz Pentium 4. Time control is 30secs/move. A
>>standard-opening set similar to the Nunn-openings is used.
>>
>>Changing the search is not only a tactical matter. The playing style is to a
>>large extend also influenced by the search. If two moves are from the evaluation
>>point similar, the programm usually plays the one with the larger search tree.
>>Or in other words: The lines which are extended. The Shredder/Hydra played
>>over-aggressive, whereas the classical Hydra with the right dose.
>>
>>One conclusion of my experiment is: Rybka seems to be fairly tuned against
>>Shredder. This is always the fate of the leader of the gang. In the future other
>>programs will be tuned against Rybka and it will be much more difficult to stay
>>on the top.
>>
>>The experiment shows also, that it is fairly easy to tune against one programm.
>>The problem is to find a solution which works against all.
>>
>>Chrilly
>
>Hi Chrilly,
>
>what you write is very interesting. I can add three things:
>

Hi Vas,

>1) 99% of Rybka testing was against Shredder 9. It was more of a practical
>thing, each previous version had a "rating" against Shredder. In the future we
>will be more complete of course.

That is very interesting (:-) and makes it all the more astonishing that you
have managed to produce such a strong program!
>2) I have also seen the phenomenon where differences are exaggerated against an
>older version. In particular, I have found that adding just one smaller but
>sound eval term often gives fairly nice scores against the old version while
>games against a completely different engine will sink into statistical noise.

The same with Ikarus. I quite regularly produce versions which outscore the
standard version by 30 or even 60 Elo points in direct matches by just doing
some tweaking or adding or removing a term here or there. But in *most* cases,
these versions do worse (often by about 50 Elo points!) against a standard set
of different opponents. Self-play can give you a hint that you screwed something
up completely, but it is no serious way to measure progress / improvement.

Best regards - Munjong.
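
For anyone who wants to put rough numbers on "statistical noise": below is a minimal sketch, assuming the standard logistic Elo model and a normal approximation for the match score. The function names and the sample match result are made up for illustration; they are not taken from Ikarus, Rybka, or Hydra testing.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1),
    using the usual logistic model: score = 1 / (1 + 10^(-D/400))."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo estimates for a match result.
    Uses a normal approximation on the score; z=1.96 gives roughly 95%."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    # Clamp so elo_diff never sees exactly 0 or 1.
    lo = min(max(score - z * se, 1e-6), 1.0 - 1e-6)
    hi = min(max(score + z * se, 1e-6), 1.0 - 1e-6)
    return elo_diff(lo), elo_diff(score), elo_diff(hi)

if __name__ == "__main__":
    # Hypothetical 100-game match: +40 =30 -30 (55%).
    low, mid, high = elo_interval(40, 30, 30)
    print(f"Elo estimate: {mid:+.0f} (about {low:+.0f} to {high:+.0f})")
    # Scores of the size mentioned in the thread, converted to Elo.
    for s in (0.40, 0.75, 0.80):
        print(f"{s:.0%} -> {elo_diff(s):+.0f} Elo")
```

With a sample like that, the interval still includes zero, so an apparent gain of 30 to 60 points from a short match cannot be told apart from noise without many more games. Note that this only covers sampling error; it says nothing about the separate effect described above, where self-play exaggerates differences between similar programs.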
Last modified: Thu, 15 Apr 21 08:11:13 -0700